request_id > fetch status/results), but they solve different problems.
- Serverless API (ComfyUI): Deploy ComfyUI workflows as serverless endpoints.
- Serverless API (LoRA): Deploy LoRAs as serverless endpoints.
- Model API: Run hosted models on-demand with no deployment; pay per request.
Which API should I use?
Use this as a quick decision guide:

| What you are trying to do | Recommended API | What you call it with | Deployment required? |
|---|---|---|---|
| Run a model from the RunComfy Models catalog (or a hosted pipeline) | Model API | model_id | No |
| Run inference with a LoRA without deploying anything | Model API | model_id + LoRA inputs | No |
| Turn a ComfyUI workflow into a production endpoint (versions, autoscaling, webhooks, instance proxy) | Serverless API (ComfyUI) | deployment_id | Yes |
| Serve a LoRA behind a dedicated, scalable endpoint | Serverless API (LoRA) | deployment_id | Yes |
Both Serverless API (LoRA) and Serverless API (ComfyUI) are built on the same serverless deployment system. The difference is what you deploy and therefore what the request schema looks like.
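To make that difference concrete, here is a minimal sketch in Python of the two request shapes. The base URL, endpoint paths, header, and payload field names below are placeholders for illustration, not the documented schema; see each API's Quickstart for the real request format.

```python
import requests

BASE_URL = "https://api.runcomfy.example"   # placeholder host, not the real one
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # assumed auth scheme

# Model API: no deployment step; the request targets a hosted model by model_id.
model_api = requests.post(
    f"{BASE_URL}/.../inference",            # "..." = Model API path from its Quickstart
    headers=HEADERS,
    json={
        "model_id": "model-from-the-catalog",      # placeholder ID
        "inputs": {"prompt": "a watercolor fox"},  # assumed input shape
    },
)

# Serverless APIs (ComfyUI or LoRA): the request targets your own deployment
# by deployment_id, so the input schema depends on what you deployed.
serverless = requests.post(
    f"{BASE_URL}/.../inference",            # "..." = serverless path from its Quickstart
    headers=HEADERS,
    json={
        "deployment_id": "your-deployment-id",     # placeholder ID
        "inputs": {"prompt": "a watercolor fox"},  # assumed input shape
    },
)

# Both calls are asynchronous and return a request_id
# (see "Common request pattern" below).
```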
Getting started
- Model API: start with Quickstart, then see Async Queue Endpoints.
- Serverless API (ComfyUI): start with Quickstart, then learn about Overrides and workflow files.
- Serverless API (LoRA): start with Choose a LoRA inference API, then follow the Quickstart.
Common request pattern
Most RunComfy endpoints are asynchronous (see the sketch after this list):
- Submit a job (POST …/inference) and get back a request_id
- Poll status (GET …/status) until the job completes
- Fetch outputs (GET …/result), or use webhooks for push-based updates
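A minimal polling sketch of that pattern in Python, assuming a placeholder base URL and API key, elided endpoint paths (the "…" stands for the API-specific prefix shown in each Quickstart), and hypothetical status values; the real paths and response fields are defined in each API's reference.

```python
import time
import requests

BASE_URL = "https://api.runcomfy.example"   # placeholder, not the real host
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # assumed auth scheme

# 1. Submit a job; the response is assumed to contain a request_id.
submit = requests.post(
    f"{BASE_URL}/…/inference",              # "…" = path prefix from the relevant Quickstart
    headers=HEADERS,
    json={"inputs": {"prompt": "a watercolor fox"}},  # assumed input shape
)
request_id = submit.json()["request_id"]

# 2. Poll status until the job reaches a terminal state.
while True:
    status = requests.get(
        f"{BASE_URL}/…/{request_id}/status",    # assumed path shape
        headers=HEADERS,
    ).json()
    if status.get("status") in ("completed", "failed"):  # assumed status values
        break
    time.sleep(2)

# 3. Fetch the outputs once the job has completed
#    (or register a webhook instead of polling).
result = requests.get(
    f"{BASE_URL}/…/{request_id}/result",        # assumed path shape
    headers=HEADERS,
).json()
print(result)
```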
