Serverless API (ComfyUI) offers flexible, pay-per-use pricing with no upfront costs. Unlike the Model API (per-request pricing), Serverless API pricing is based on GPU instance uptime for your deployments (billed per second).

Pricing overview

Serverless API (ComfyUI) supports two billing plans:
  • Pay as You Go: standard hourly rates by machine tier
  • Pro (subscription): 20%–30% discount on Pay as You Go rates
Prices and machine availability may change. Refer to RunComfy Pricing for the latest machine rates, plan benefits, and extras.
Machine Type   | GPU Options | VRAM   | RAM    | vCPUs | Pay as You Go Price | Pro Price
-------------- | ----------- | ------ | ------ | ----- | ------------------- | ----------
Medium         | T4, A4000   | 16GB   | 16GB   | 8     | $0.99/hour          | $0.79/hour
Large          | A10G, A5000 | 24GB   | 32GB   | 8     | $1.75/hour          | $1.39/hour
X-Large        | A6000       | 48GB   | 48GB   | 28    | $2.50/hour          | $1.99/hour
X-Large Plus   | L40S, L40   | 48GB   | 64GB   | 28    | $2.99/hour          | $2.15/hour
2X-Large       | A100        | 80GB   | 96GB   | 28    | $4.99/hour          | $3.99/hour
2X-Large Plus  | H100        | 80GB   | 180GB  | 28    | $7.49/hour          | $5.99/hour
3X-Large       | H200        | 141GB  | 240GB  | 24    | $8.75/hour          | $6.99/hour

How billing works

Billing is usage-based and calculated per second:
  • Billing starts when an instance is signaled to wake up (cold start + initialization).
  • Billing stops when the instance is fully shut down.
Your deployment can run a mix of persistent and on-demand instances, controlled by:
  • min_instances / max_instances (autoscaling bounds)
  • keep_warm_duration_in_seconds (how long to keep idle instances warm)
See also: Creating a Deployment
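
To make these settings concrete, here is a minimal sketch of how the three autoscaling fields fit together. The surrounding structure and the specific values are illustrative assumptions, not the exact request format; see Creating a Deployment for the real payload.

```python
# Hypothetical autoscaling settings for a deployment. The field names
# (min_instances, max_instances, keep_warm_duration_in_seconds) are the ones
# described above; the machine_type value and overall structure are
# illustrative only.
deployment_config = {
    "machine_type": "Large",               # billed at the Large hourly rate while running
    "min_instances": 1,                    # 1 persistent instance, billed even when idle
    "max_instances": 5,                    # up to 4 extra on-demand instances under load
    "keep_warm_duration_in_seconds": 300,  # idle on-demand instances stay warm (and billed) for 5 min
}

print(deployment_config)
```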

Persistent instances

  • Set min_instances > 0 to keep that many instances running.
  • You are billed for the full uptime (including idle time) until you scale down.
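
As a rough illustration, assuming the Large Pay as You Go rate from the table above, one persistent instance kept running around the clock costs roughly:

```python
# Rough daily cost of one persistent instance (min_instances = 1) on the
# Large tier, using the Pay as You Go rate from the table above.
# Illustrative arithmetic only; actual rates may change.
hourly_rate = 1.75       # $/hour, Large (Pay as You Go)
hours_per_day = 24

daily_cost = hourly_rate * hours_per_day
print(f"~${daily_cost:.2f} per day")   # ~$42.00 per day, including idle time
```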

On-demand instances

  • Additional instances spin up to handle demand above min_instances (or all demand when min_instances = 0).
  • Instances scale down after sitting idle for the keep-warm period.
  • You are billed for cold start, execution, and keep-warm time.

Instance cost breakdown

  • Cold start: instance boots and loads models/assets. Duration depends on machine tier, workflow complexity, and model size.
  • Execution time: the time spent actually running your workflows. This is typically the bulk of billed compute.
  • Keep-warm time: idle time before scale-down. This time is billed.
Note: You may also see Queue Time (time a request waits for an available instance or concurrency slot). Queue time is not billed.
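
For a sense of scale, the sketch below prices a single cold request on the Large tier under per-second billing. The phase durations are illustrative assumptions; only cold start, execution, and keep-warm time are billed.

```python
# Per-second billing for one on-demand request on the Large tier
# ($1.75/hour, Pay as You Go). The phase durations below are assumptions
# chosen for illustration.
hourly_rate = 1.75                    # $/hour
per_second_rate = hourly_rate / 3600  # billed per second

cold_start_s = 60    # boot + loading models/assets (varies with tier, workflow, model size)
execution_s  = 120   # workflow runtime
keep_warm_s  = 300   # idle time before scale-down (keep_warm_duration_in_seconds)
queue_s      = 45    # waiting for capacity -- NOT billed

billed_seconds = cold_start_s + execution_s + keep_warm_s
cost = billed_seconds * per_second_rate
print(f"Billed {billed_seconds}s, ~${cost:.2f}")   # Billed 480s, ~$0.23
```

In practice, a request that arrives while an instance is still in its keep-warm window skips the cold start, so the keep-warm setting trades some idle cost for lower latency on follow-up requests.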

Support

If you believe you’ve been incorrectly billed, contact us at [email protected] with your deployment_id, the request_id (if applicable), and the approximate time of the issue.