A dataset is the collection of training data you use when training a LoRA. Before you start a training job, create a dataset and upload your files to it. Only datasets in READY status can be mounted and used by a training job.

Quickstart (minimum working flow)

  1. POST /prod/v1/trainers/datasets → create a dataset (get dataset_id + dataset_name)
  2. Upload files
    • ≤150MB per file: POST /prod/v1/trainers/datasets/{dataset_id}/upload
    • >150MB per file: POST /prod/v1/trainers/datasets/{dataset_id}/get-upload-endpoint → PUT each file to the returned upload_url
  3. GET /prod/v1/trainers/datasets/{dataset_id}/status → poll until READY
  4. Use dataset_name in training job requests
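
As a small illustration of step 2, the sketch below picks the upload path from a file's size. The helper name upload_route is hypothetical; the 150,000,000-byte threshold is the limit stated in the direct-upload rules on this page.

```python
# Direct upload accepts files up to 150MB (150,000,000 bytes) per this page.
DIRECT_UPLOAD_LIMIT_BYTES = 150_000_000


def upload_route(dataset_id: str, size_bytes: int) -> str:
    """Return the API path to call for a file of the given size (hypothetical helper)."""
    base = f"/prod/v1/trainers/datasets/{dataset_id}"
    if size_bytes <= DIRECT_UPLOAD_LIMIT_BYTES:
        # Small file: send it straight to the upload endpoint.
        return f"{base}/upload"
    # Large file: request a signed URL first, then PUT the bytes to it.
    return f"{base}/get-upload-endpoint"
```

The signed-URL route also works for batches of smaller files, so a client may prefer it whenever it uploads many files at once.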

Dataset status lifecycle

Datasets move through these statuses:
  • DRAFT: dataset resource created, but it contains no uploaded files yet
  • UPLOADING: dataset is currently receiving files (either direct upload or signed URL uploads)
  • READY: all uploads are complete and validation has passed, so the dataset can be mounted by a training job. How long a dataset takes to reach READY depends on file count, file size, and whether all uploads complete successfully.
  • FAILED: upload/validation failed; error field is present
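
A client might model these statuses as an enum. This is a minimal sketch: the class and helper names are hypothetical, and treating READY and FAILED as the points where polling should stop is an assumption based on the descriptions above, not a documented guarantee.

```python
from enum import Enum


class DatasetStatus(str, Enum):
    DRAFT = "DRAFT"
    UPLOADING = "UPLOADING"
    READY = "READY"
    FAILED = "FAILED"


def can_mount(status: str) -> bool:
    """Only READY datasets can be mounted by a training job."""
    return DatasetStatus(status) is DatasetStatus.READY


def should_stop_polling(status: str) -> bool:
    """Assumption: READY and FAILED are the statuses at which a poller gives up waiting."""
    return DatasetStatus(status) in (DatasetStatus.READY, DatasetStatus.FAILED)
```

Passing a status string that is not one of the four values raises ValueError, which surfaces unexpected API responses early.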

Endpoints

Base URL: https://trainer-api.runcomfy.net
Endpoint | Method | Description
/prod/v1/trainers/datasets | POST | Create a dataset resource (metadata only)
/prod/v1/trainers/datasets/{dataset_id}/upload | POST | Upload a dataset file (≤150MB)
/prod/v1/trainers/datasets/{dataset_id}/get-upload-endpoint | POST | Get signed upload URLs (for larger/multi-file uploads)
/prod/v1/trainers/datasets/{dataset_id}/status | GET | Get a dataset status
/prod/v1/trainers/datasets | GET | List datasets
/prod/v1/trainers/datasets/{dataset_id} | DELETE | Delete a dataset

Common Parameters

Field | Type | Description
id | string | Stable identifier for this dataset (used as dataset_id in API paths for upload/status/delete)
name | string | Human-readable dataset name (used as dataset_name in training job requests; must be unique within your account)
status | string | One of: DRAFT, UPLOADING, READY, FAILED
created_at | string | ISO 8601 timestamp (microsecond precision, e.g. 2025-07-22T13:05:16.143086)
updated_at | string | ISO 8601 timestamp (microsecond precision, e.g. 2025-07-22T13:05:16.143086)
error | object | Present when status = FAILED
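
The fields above map onto a small record type. In this sketch the Dataset dataclass and parse_dataset are hypothetical names. The example timestamps carry no UTC offset, so datetime.fromisoformat returns naive datetimes. The shape of the error object is not specified on this page, so it is kept as a plain dict.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Dataset:
    id: str
    name: str
    status: str
    created_at: datetime
    updated_at: datetime
    error: Optional[dict] = None  # expected only when status is FAILED


def parse_dataset(payload: dict) -> Dataset:
    """Convert a dataset JSON object into a typed record (hypothetical helper)."""
    return Dataset(
        id=payload["id"],
        name=payload["name"],
        status=payload["status"],
        # The documented format, e.g. 2025-07-22T13:05:16.143086, parses directly.
        created_at=datetime.fromisoformat(payload["created_at"]),
        updated_at=datetime.fromisoformat(payload["updated_at"]),
        error=payload.get("error"),
    )
```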

Create a dataset

Create a new dataset resource (metadata only) that you will upload training files into. Right after creation, the dataset is empty (no files uploaded yet) and its status is DRAFT.
POST /prod/v1/trainers/datasets

Request body

Field | Type | Required | Description
name | string | no | Human-readable dataset name. Must be unique within your account. This value is used as dataset_name in training job requests. If omitted, RunComfy generates one (e.g. ds_...).

Request example

curl --request POST \
  --url "https://trainer-api.runcomfy.net/prod/v1/trainers/datasets" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <token>" \
  --data '{
    "name": "<YOUR_DATASET_NAME>"
  }'

Response example

{
  "id": "{dataset_id}",
  "name": "{dataset_name}",
  "status": "DRAFT",
  "created_at": "2026-01-31T10:20:30.143086",
  "updated_at": "2026-01-31T10:20:30.143086"
}
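
The same request can be built with Python's standard library. This sketch only constructs the request object. The function name build_create_request is hypothetical, and sending an empty JSON object when name is omitted is an assumption, since this page does not show the no-name request.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://trainer-api.runcomfy.net"


def build_create_request(token: str, name: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a dataset."""
    # Assumption: with no name, an empty JSON object is sent and RunComfy generates a name.
    body = {"name": name} if name is not None else {}
    return urllib.request.Request(
        url=f"{BASE_URL}/prod/v1/trainers/datasets",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

To send it, pass the request to urllib.request.urlopen and decode the response body with json.load. The id and name in that response are the dataset_id and dataset_name used in the later steps.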

Upload a dataset file (≤150MB)

Use this endpoint for small files. For larger uploads or multi-file batches, use Get signed upload URLs.
Rules (important):
  • Size limit: ≤150MB per file (150,000,000 bytes). For larger files, use signed URLs.
  • Supported file types: images, videos, and caption .txt files.
  • Caption naming rule (critical for LoRA / AI Toolkit): each image/video must have a caption file with the same base filename.
    • Example: img_0001.jpg → img_0001.txt
    • Example: clip_0001.mp4 → clip_0001.txt
  • Track upload success per file: check the response for each upload request. If an upload fails, the response returns an error and the file is not added to the dataset.
  • If the same filename is uploaded multiple times within the same dataset_id, the latest upload overwrites the previous one.
  • In curl --form "file=@./path/to/file", the @./path/to/file is a local path on the machine running curl (relative to your current directory or an absolute path).
POST /prod/v1/trainers/datasets/{dataset_id}/upload

Request

  • file (required): the file to upload

Request example

curl --request POST \
  --url "https://trainer-api.runcomfy.net/prod/v1/trainers/datasets/{dataset_id}/upload" \
  --header "Authorization: Bearer <token>" \
  --form "file=@./dog_01.jpg"

Response example

{
  "id": "{dataset_id}",
  "name": "{dataset_name}",
  "object": "file",
  "bytes": 2134567,
  "created_at": "2026-01-31T10:21:05.143086",
  "filename": "dog_01.jpg"
}
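
The caption naming rule above can be checked locally before uploading. This sketch uses a hypothetical helper, missing_captions, and assumes that every file that is not a .txt file is an image or video that needs a caption.

```python
from pathlib import PurePath
from typing import List


def missing_captions(filenames: List[str]) -> List[str]:
    """Return media filenames that have no .txt caption with the same base filename."""
    caption_stems = {
        PurePath(name).stem
        for name in filenames
        if PurePath(name).suffix.lower() == ".txt"
    }
    return sorted(
        name
        for name in filenames
        # Assumption: anything that is not a .txt file is media requiring a caption.
        if PurePath(name).suffix.lower() != ".txt"
        and PurePath(name).stem not in caption_stems
    )
```

For example, given img_0001.jpg, img_0001.txt, and clip_0001.mp4, the helper reports clip_0001.mp4 because clip_0001.txt is absent.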

Get signed upload URLs (file size > 150MB)

RunComfy returns short-lived signed URLs you can upload to (typically object storage). Use this when a file is >150MB.
POST /prod/v1/trainers/datasets/{dataset_id}/get-upload-endpoint

Request body

For multi-file uploads, provide a map of filename -> size_in_bytes.
Note: size_in_bytes must exactly match the actual file size in bytes. RunComfy generates signed upload URLs based on the byte size you provide. If the size is incorrect (larger or smaller than the real file), the storage service may reject the upload.
{
  "filenameToByteSize": {
    "img_0001.jpg": 2000000,
    "img_0001.txt": 12000,
    "img_0002.jpg": 3100000,
    "img_0002.txt": 14000
  }
}
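
Because the sizes must match exactly, it is safer to read them from disk than to type them by hand. This sketch builds the request body from local paths. The helper name build_size_map is hypothetical, and using the bare filename (without directories) as the key is an assumption based on the example above.

```python
import os
from typing import Dict, Iterable


def build_size_map(paths: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Build the get-upload-endpoint request body from local file paths."""
    # os.path.getsize returns the exact on-disk size in bytes.
    sizes = {os.path.basename(path): os.path.getsize(path) for path in paths}
    return {"filenameToByteSize": sizes}
```

Two paths with the same basename would collide in the map, so keep filenames unique within a batch.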

Request example

curl --request POST \
  --url "https://trainer-api.runcomfy.net/prod/v1/trainers/datasets/{dataset_id}/get-upload-endpoint" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <token>" \
  --data '{
    "filenameToByteSize": {
      "img_0001.jpg": 2000000,
      "img_0001.txt": 12000,
      "img_0002.jpg": 3100000,
      "img_0002.txt": 14000
    }
  }'

Response example

{
  "uploads": {
    "img_0001.jpg": {
      "upload_url": "https://storage.example.com/presigned/datasets/ds_123/img_0001.jpg?X-Amz-Signature=...",
      "method": "PUT",
      "headers": {
        "Content-Type": "image/jpeg"
      },
      "expires_at": "2026-01-31T10:40:30Z"
    },
    "img_0001.txt": {
      "upload_url": "https://storage.example.com/presigned/datasets/ds_123/img_0001.txt?X-Amz-Signature=...",
      "method": "PUT",
      "headers": {
        "Content-Type": "text/plain"
      },
      "expires_at": "2026-01-31T10:40:30Z"
    },
    "img_0002.jpg": {
      "upload_url": "https://storage.example.com/presigned/datasets/ds_123/img_0002.jpg?X-Amz-Signature=...",
      "method": "PUT",
      "headers": {
        "Content-Type": "image/jpeg"
      },
      "expires_at": "2026-01-31T10:40:30Z"
    },
    "img_0002.txt": {
      "upload_url": "https://storage.example.com/presigned/datasets/ds_123/img_0002.txt?X-Amz-Signature=...",
      "method": "PUT",
      "headers": {
        "Content-Type": "text/plain"
      },
      "expires_at": "2026-01-31T10:40:30Z"
    }
  }
}

Upload bytes to the signed URL

Use the method and headers returned in the response.
curl -X PUT \
  --upload-file "./img_0001.jpg" \
  -H "Content-Type: image/jpeg" \
  "<upload_url>"

Note:

  • If a signed URL expires, call get-upload-endpoint again to get a fresh URL.
  • Track upload success per file: your client should record whether each PUT succeeded. A successful PUT typically returns HTTP 200 or 204. If a PUT fails, the response returns an error and the file is not added to the dataset.
  • After all files have uploaded successfully, poll GET /prod/v1/trainers/datasets/{dataset_id}/status until READY.
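
The notes above can be sketched as two helpers that work on one entry of the uploads map. The names is_expired and build_put_request are hypothetical. The code assumes expires_at is a UTC timestamp ending in Z, as in the response example. It also holds the file in memory as bytes; for files over 150MB, a streaming HTTP client is likely a better fit.

```python
import urllib.request
from datetime import datetime, timezone


def is_expired(entry: dict, now: datetime) -> bool:
    """Return True if the signed URL in this uploads entry has expired."""
    # Replace the trailing "Z" so fromisoformat also accepts it on older Python versions.
    expires_at = datetime.fromisoformat(entry["expires_at"].replace("Z", "+00:00"))
    return now >= expires_at


def build_put_request(entry: dict, data: bytes) -> urllib.request.Request:
    """Build (but do not send) the upload request using the returned method and headers."""
    return urllib.request.Request(
        url=entry["upload_url"],
        data=data,
        headers=dict(entry.get("headers", {})),
        method=entry.get("method", "PUT"),
    )
```

A client would call is_expired(entry, datetime.now(timezone.utc)) before each PUT and request fresh URLs from get-upload-endpoint when it returns True.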

Get a dataset status

After you finish uploading your dataset (direct upload or signed URLs), poll this endpoint until the dataset becomes READY. If it becomes FAILED, check the error field, fix the issue, and re-upload (or create a new dataset).
The response includes a files array listing the files currently available in the dataset. Only successfully uploaded files appear in files; files that are still uploading or that failed to upload are not listed.
GET /prod/v1/trainers/datasets/{dataset_id}/status

Request example

curl --request GET \
  --url "https://trainer-api.runcomfy.net/prod/v1/trainers/datasets/{dataset_id}/status" \
  --header "Authorization: Bearer <token>"

Response example

{
  "id": "{dataset_id}",
  "name": "{dataset_name}",
  "status": "READY",
  "files": [
    {
      "filename": "img_0001.png",
      "size_bytes": 215290
    },
    {
      "filename": "img_0001.txt",
      "size_bytes": 24
    }
  ],
  "created_at": "2026-01-31T10:20:30.143086",
  "updated_at": "2026-01-31T10:41:02.143086"
}
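
A polling loop for this endpoint might look like the sketch below. The function wait_until_ready is hypothetical and takes a fetch_status callable (for example, one that performs the GET above and returns the decoded JSON), so the loop stays independent of any HTTP client. The 5-second interval and 10-minute timeout are arbitrary defaults, not documented values.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the dataset is READY; raise if it fails or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch_status()
        if payload["status"] == "READY":
            return payload
        if payload["status"] == "FAILED":
            # The error field is present when status is FAILED.
            raise RuntimeError(f"dataset failed: {payload.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("dataset did not become READY in time")
        sleep(interval_s)
```

The sleep parameter exists so that tests can replace the real delay with a no-op.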

List datasets

List all datasets in your account, including their current status. Use this to look up a dataset's name (used as dataset_name in training job requests) and id (used as dataset_id in API paths).
GET /prod/v1/trainers/datasets

Request example

curl --request GET \
  --url "https://trainer-api.runcomfy.net/prod/v1/trainers/datasets" \
  --header "Authorization: Bearer <token>"

Response example

{
  "datasets": [
    {
      "id": "{dataset_id}",
      "name": "{dataset_name}",
      "status": "DRAFT",
      "created_at": "2026-01-31T10:20:30.143086",
      "updated_at": "2026-01-31T10:20:30.143086"
    },
    {
      "id": "{dataset_id}",
      "name": "{dataset_name}",
      "status": "READY",
      "created_at": "2026-01-31T10:20:30.143086",
      "updated_at": "2026-01-31T10:20:30.143086"
    }
  ]
}
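
Since names are unique within an account, a dataset can be looked up by name in the list response. This is a minimal sketch with a hypothetical helper, find_dataset, that operates on the decoded JSON shown above.

```python
from typing import Optional


def find_dataset(list_response: dict, name: str) -> Optional[dict]:
    """Return the dataset object with the given name, or None if it is not listed."""
    for dataset in list_response.get("datasets", []):
        if dataset.get("name") == name:
            return dataset
    return None
```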

Delete a dataset

Permanently delete a dataset by dataset_id. This is irreversible—only delete datasets you no longer need for training.
DELETE /prod/v1/trainers/datasets/{dataset_id}

Request example

curl --request DELETE \
  --url "https://trainer-api.runcomfy.net/prod/v1/trainers/datasets/{dataset_id}" \
  --header "Authorization: Bearer <token>"

Response example

{
  "id": "{dataset_id}",
  "name": "{dataset_name}",
  "deleted": true
}