Trainer API provides HTTP endpoints to run AI Toolkit LoRA training. You bring your dataset and an AI Toolkit YAML config; RunComfy runs the job on GPUs and returns the outputs you need (e.g. LoRA checkpoints, resolved config, and sample images), so you can automate end-to-end fine-tuning in code. After training, you can take the resulting LoRA and run inference via Model APIs (on-demand) or deploy it with Serverless API (LoRA). Not sure which one to use for inference? Start with Choose a LoRA Inference API.

Key concepts

Trainer API revolves around two objects:

Dataset

A Dataset is the training data you upload for a LoRA run. You create it with the Dataset API, then upload its files until it’s READY. It typically includes:
  • images/videos
  • caption .txt files (same base filename as the media)
Datasets have a lifecycle (DRAFT → UPLOADING → READY).
Only datasets in READY can be mounted by a training job.
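The flow below is a minimal Python sketch of that lifecycle: create the dataset record, upload media and caption files, then poll until it is READY. The base URL, endpoint paths (/datasets, /datasets/{name}/files), header scheme, and response field names are illustrative placeholders, not the documented Trainer API surface; substitute the real routes and payloads from the API reference.

```python
import os
import time
from pathlib import Path

import requests

# Placeholder base URL and auth header -- replace with the real Trainer API values.
BASE_URL = "https://api.example.com/trainer/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['RUNCOMFY_API_KEY']}"}


def create_dataset(name: str) -> str:
    """Create the dataset record (metadata only); it starts in DRAFT."""
    resp = requests.post(f"{BASE_URL}/datasets", json={"name": name}, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()["dataset_name"]  # assumed response field


def upload_files(dataset_name: str, folder: Path) -> None:
    """Upload media plus the matching caption .txt files (same base filename)."""
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as f:
            resp = requests.post(
                f"{BASE_URL}/datasets/{dataset_name}/files",
                files={"file": (path.name, f)},
                headers=HEADERS,
            )
            resp.raise_for_status()


def wait_until_ready(dataset_name: str, poll_seconds: int = 5) -> None:
    """Poll the dataset until it leaves DRAFT/UPLOADING and reaches READY."""
    while True:
        resp = requests.get(f"{BASE_URL}/datasets/{dataset_name}", headers=HEADERS)
        resp.raise_for_status()
        status = resp.json()["status"]
        if status == "READY":
            return
        if status not in ("DRAFT", "UPLOADING"):
            raise RuntimeError(f"unexpected dataset status: {status}")
        time.sleep(poll_seconds)


dataset_name = create_dataset("my-lora-dataset")
upload_files(dataset_name, Path("./training_data"))
wait_until_ready(dataset_name)
```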

Training Job

A Training Job is an async run that executes your AI Toolkit YAML on a GPU. When a job starts, RunComfy:
  • mounts your READY dataset into the training container (under /app/ai-toolkit/datasets/{dataset_name})
  • runs the YAML config you provide
  • produces artifacts such as LoRA checkpoints (.safetensors), resolved config, and sample outputs
A training job references its input dataset by dataset_name (from the Dataset API).
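As a rough illustration, the snippet below builds an AI Toolkit-style LoRA config that points its dataset folder at the mount path above, then submits it as a training job. The YAML keys are modeled loosely on AI Toolkit example configs (verify them against the AI Toolkit docs), and the /training-jobs endpoint, payload fields, and gpu_type value are placeholders rather than the documented Trainer API contract.

```python
import os

import requests

BASE_URL = "https://api.example.com/trainer/v1"   # placeholder base URL
HEADERS = {"Authorization": f"Bearer {os.environ['RUNCOMFY_API_KEY']}"}

dataset_name = "my-lora-dataset"  # name returned by the Dataset API

# Loosely modeled on AI Toolkit example configs; check key names before use.
# The important part: folder_path points at the path where RunComfy mounts
# the READY dataset inside the training container.
config_yaml = f"""
job: extension
config:
  name: my_first_lora
  process:
    - type: sd_trainer
      training_folder: /app/ai-toolkit/output
      network:
        type: lora
        linear: 16
        linear_alpha: 16
      datasets:
        - folder_path: /app/ai-toolkit/datasets/{dataset_name}
          caption_ext: txt
          resolution: [512, 768, 1024]
      train:
        steps: 2000
        lr: 0.0001
        batch_size: 1
"""

# Placeholder endpoint and field names; the job references the dataset by name.
resp = requests.post(
    f"{BASE_URL}/training-jobs",
    json={
        "dataset_name": dataset_name,
        "gpu_type": "A100",        # placeholder gpu_type value
        "config": config_yaml,
    },
    headers=HEADERS,
)
resp.raise_for_status()
job_id = resp.json()["job_id"]     # assumed response field
print("submitted training job:", job_id)
```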

Typical workflow

  1. Create a dataset (metadata)
  2. Upload dataset files, then wait until it becomes READY
  3. Submit a training job with your AI Toolkit YAML config (and gpu_type)
  4. Poll status and download results (checkpoints, samples, config); see the sketch after this list
  5. Run inference with the trained LoRA via Model APIs (on-demand) or Serverless API (LoRA)
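The sketch below covers step 4: poll a submitted job until it reaches a terminal state, then download its artifacts. As before, the status values, /training-jobs/{id} routes, and artifact-listing shape are assumptions for illustration, not the documented Trainer API contract; consult the API reference for the real ones.

```python
import os
import time
from pathlib import Path

import requests

BASE_URL = "https://api.example.com/trainer/v1"   # placeholder base URL
HEADERS = {"Authorization": f"Bearer {os.environ['RUNCOMFY_API_KEY']}"}


def wait_for_job(job_id: str, poll_seconds: int = 30) -> dict:
    """Poll the training job until it reaches a terminal state."""
    while True:
        resp = requests.get(f"{BASE_URL}/training-jobs/{job_id}", headers=HEADERS)
        resp.raise_for_status()
        job = resp.json()
        # Assumed status names; check the API reference for the real ones.
        if job["status"] in ("COMPLETED", "FAILED"):
            return job
        time.sleep(poll_seconds)


def download_artifacts(job_id: str, out_dir: Path) -> None:
    """Fetch LoRA checkpoints (.safetensors), sample images, and the resolved config."""
    out_dir.mkdir(parents=True, exist_ok=True)
    resp = requests.get(f"{BASE_URL}/training-jobs/{job_id}/artifacts", headers=HEADERS)
    resp.raise_for_status()
    for artifact in resp.json()["artifacts"]:        # assumed response shape
        data = requests.get(artifact["url"], headers=HEADERS)
        data.raise_for_status()
        (out_dir / artifact["filename"]).write_bytes(data.content)


job_id = "<your-job-id>"  # returned when the job was submitted
job = wait_for_job(job_id)
if job["status"] == "COMPLETED":
    download_artifacts(job_id, Path("./lora_output"))
```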
Next step: Quickstart