Serverless API (ComfyUI) lets you turn a cloud-saved ComfyUI workflow into a callable, scalable endpoint (a Deployment). You deploy a workflow once, then your application calls that deployment by deployment_id using an async queue API (submit → get request_id → poll status/result, or receive updates via webhooks).

What you get

With Serverless API (ComfyUI) you can:
  • Deploy a workflow as an API with no infrastructure to manage (RunComfy handles containerization and GPU orchestration)
  • Choose hardware per deployment (GPU/VRAM tier) and change it later if requirements evolve
  • Autoscale with explicit knobs (min/max instances, queue threshold, keep-warm duration)
  • Version workflows safely (deployments are pinned to a workflow version; upgrades are explicit and reversible)
  • Integrate in production with webhooks and the instance proxy for advanced operations
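The hardware and autoscaling knobs above might look like the following deployment configuration. This is a minimal sketch: every field name here is illustrative, not RunComfy's actual API schema, so check the Deployment docs for the real fields.

```python
# Illustrative deployment configuration; all field names and values are
# assumptions for the sake of the sketch, not RunComfy's exact schema.
deployment_config = {
    "workflow_version_id": "wfv_abc123",  # hypothetical ID: the pinned Cloud Save version
    "hardware": "a100-40gb",              # hypothetical GPU/VRAM tier name
    "autoscaling": {
        "min_instances": 0,               # scale to zero when idle
        "max_instances": 4,               # upper bound on concurrent containers
        "queue_threshold": 2,             # queued requests per instance before scaling up
        "keep_warm_seconds": 300,         # keep an idle instance alive to avoid cold starts
    },
    "webhook_url": "https://example.com/runcomfy/webhook",  # optional status callbacks
}
```

The point of the knobs is the trade-off they encode: `min_instances: 0` minimizes cost but incurs cold starts, while a nonzero keep-warm duration trades a little idle spend for faster responses.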

Key objects

  • Workflow (cloud-saved): a ComfyUI workflow packaged together with its runtime (nodes, models, dependencies).
  • Workflow version: each Cloud Save creates an immutable version (like a container image snapshot).
  • Deployment: the serverless endpoint you call (identified by deployment_id), pinned to a workflow version.
  • Request: a single async inference job against a deployment (identified by request_id).
  • Instance: a running container for a deployment that actually executes requests; instances scale up/down based on your autoscaling settings.
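The objects above form a simple hierarchy: a workflow has immutable versions, a deployment is pinned to one version, and each request targets one deployment. A minimal sketch of those relationships as client-side types (these are illustrative, not classes exposed by any RunComfy SDK):

```python
from dataclasses import dataclass

# Illustrative client-side types; not an official SDK.

@dataclass
class WorkflowVersion:
    version_id: str        # immutable snapshot created by each Cloud Save
    workflow_id: str       # the cloud-saved workflow this version belongs to

@dataclass
class Deployment:
    deployment_id: str             # what your application calls
    pinned_version: WorkflowVersion  # upgrades are explicit: repin to a newer version

@dataclass
class Request:
    request_id: str        # one async inference job
    deployment_id: str
    status: str = "queued"  # e.g. queued -> running -> completed/failed (states are illustrative)
```

Because a deployment pins a version rather than "the latest save", saving a new version of the workflow never silently changes what a live deployment runs.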

Typical workflow

  1. Build or customize a workflow in RunComfy’s ComfyUI Cloud.
  2. Cloud Save the workflow (creates a version).
  3. Create a Deployment (choose hardware + autoscaling).
  4. Submit inference: POST /prod/v1/deployments/{deployment_id}/inference
  5. Poll status/result (or use webhooks).
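The submit-then-poll loop in steps 4–5 can be sketched in Python with the standard library. Only the inference path comes from step 4 above; the base URL, bearer-token auth scheme, status endpoint path, and response field names (`request_id`, `status`) are assumptions to verify against the Quickstart.

```python
import json
import time
import urllib.request

API_BASE = "https://api.runcomfy.net"  # hypothetical host; use your account's base URL


def inference_path(deployment_id: str) -> str:
    """Submission path from step 4 of the workflow above."""
    return f"/prod/v1/deployments/{deployment_id}/inference"


def submit(deployment_id: str, inputs: dict, token: str) -> str:
    """POST an async inference job; the 'request_id' response field is assumed."""
    req = urllib.request.Request(
        API_BASE + inference_path(deployment_id),
        data=json.dumps({"inputs": inputs}).encode(),
        headers={
            "Authorization": f"Bearer {token}",  # auth scheme is an assumption
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["request_id"]


def poll(deployment_id: str, request_id: str, token: str,
         interval: float = 2.0, timeout: float = 600.0) -> dict:
    """Poll until the job reaches a terminal state; the status path is hypothetical."""
    status_url = (API_BASE + f"/prod/v1/deployments/{deployment_id}"
                  f"/requests/{request_id}/status")  # path is an assumption
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(
            status_url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        if body.get("status") in ("completed", "failed"):  # states are illustrative
            return body
        time.sleep(interval)
    raise TimeoutError(f"request {request_id} did not finish in {timeout}s")
```

In production, webhooks (step 5's alternative) remove the polling loop entirely: the deployment calls your endpoint when the request changes state, so `poll` is only needed as a fallback.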
Next step: Quickstart

How this relates to the other RunComfy APIs

  • Model API: on-demand inference for hosted models/pipelines, no deployment, per-request billing, call by model_id.
  • Serverless API (LoRA): built on the same serverless deployment system, but what you deploy is a Trainer LoRA (instead of a workflow).

Alternative option: Server API (ComfyUI)

If you need full control of a dedicated ComfyUI backend instance (for example, to integrate ComfyUI directly into tools like Krita, Photoshop, Blender, or iClone), RunComfy also provides a Server API paired with the ComfyUI Backend API. See the Server API documentation: RunComfy ComfyUI Backend API