A minimum of 1 keeps one instance always warm, avoiding cold-start delays (3–5 minutes to boot), but incurs ongoing costs. For infrequent traffic, use 0 to save costs, though the first request after an idle period will wait for an instance to boot.
A maximum of 3 means no more than three instances will run at once; excess requests queue instead. This caps costs while handling moderate bursts. For heavier loads (more than 10 concurrent jobs), contact hi@runcomfy.com to discuss custom limits.
With a queue size of 1, the arrival of a second request triggers an additional instance to reduce wait times. This is ideal for latency-sensitive apps, though it may increase costs during short spikes.
A keep-warm value of 60 means an idle instance lingers for one minute, ready for reuse without a cold start. A slight cost may be incurred if no new requests arrive during that window.
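How the four settings interact can be sketched as a toy model. This is only an illustration of the behaviour described above, not the platform's actual scheduler: the class, field, and function names are hypothetical, and the sketch assumes a queue size of at least 1 and a keep-warm value expressed in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingConfig:
    # Illustrative field names, not the platform's real setting keys.
    min_instances: int = 0
    max_instances: int = 1
    queue_size: int = 1          # pending requests that trigger a scale-up
    keep_warm_seconds: int = 60  # idle time before an instance is released


def desired_instances(running: int, queued: int, cfg: ScalingConfig) -> int:
    """Return the instance count this toy model would aim for."""
    target = running
    if queued > 0 and running == 0:
        # Nothing is running yet, so any waiting request needs an instance.
        target = 1
    elif queued >= cfg.queue_size:
        # The backlog reached the threshold: add one more instance.
        target = running + 1
    # Never drop below the warm floor or exceed the cost cap.
    return max(cfg.min_instances, min(target, cfg.max_instances))


def should_shut_down(idle_seconds: float, running: int, cfg: ScalingConfig) -> bool:
    """Release an idle instance only after the keep-warm window, never below the floor."""
    return idle_seconds >= cfg.keep_warm_seconds and running > cfg.min_instances
```

For example, with a minimum of 1, a maximum of 3, and a queue size of 1, `desired_instances(1, 1, cfg)` asks for a second instance, while `desired_instances(3, 5, cfg)` stays at three and leaves the extra requests queued.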
Tip: Start with the defaults (minimum 0, maximum 1, queue size 1, keep warm 60) for most setups, then monitor traffic patterns via request and billing data to fine-tune.
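One way to turn request data into a tuning decision is to measure peak concurrency, which suggests how high the maximum might need to be. The sketch below assumes you can export each request's start and end time; the function name and the input shape are hypothetical, not a format the platform is known to provide.

```python
from datetime import datetime


def peak_concurrency(intervals: list[tuple[datetime, datetime]]) -> int:
    """Largest number of requests in flight at the same moment (sweep line)."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a request begins
        events.append((end, -1))    # a request finishes
    # Sort by time; at equal timestamps, finishes (-1) come before starts (+1),
    # so back-to-back requests are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

If the peak regularly exceeds the configured maximum, requests are queuing during bursts; if it rarely exceeds 1, the defaults are probably sufficient.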
The deployment_id is displayed; use it in all API calls to the provided endpoints.
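A small helper can keep the deployment_id in one place when building request URLs. The endpoint template below is a placeholder, not a real address, and the function name is hypothetical; substitute the endpoint shown alongside your deployment.

```python
from urllib.parse import quote

# Placeholder only: replace with the endpoint shown for your deployment.
ENDPOINT_TEMPLATE = "https://api.example.invalid/deployments/{deployment_id}/requests"


def endpoint_for(deployment_id: str, template: str = ENDPOINT_TEMPLATE) -> str:
    """Fill a deployment_id into an endpoint template, rejecting empty IDs."""
    deployment_id = deployment_id.strip()
    if not deployment_id:
        raise ValueError("deployment_id must not be empty")
    # Percent-encode the ID so it is safe to embed as a single path segment.
    return template.format(deployment_id=quote(deployment_id, safe=""))
```

Failing fast on an empty ID avoids sending requests to a malformed URL.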