RunComfy gives you a fully managed ComfyUI Cloud environment that stays in sync with the official comfyanonymous/ComfyUI repository. This means everything you’re used to locally, from custom nodes to downloaded models, works exactly the same in the RunComfy cloud. You can install new nodes, bring in your own models, and run workflows without compatibility issues.
In ComfyUI, a workflow is a visual program built from interconnected nodes. Each node performs a specific function, and together they form a pipeline for generative AI tasks, such as creating images, videos, or other media.
The API JSON exported from a workflow details every node, input, default value, and connection. It serves as the blueprint for identifying which parameters can be adjusted dynamically in API calls.
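To make the blueprint idea concrete, here is a minimal sketch of inspecting an exported API JSON in Python. The node IDs, class names, and values below are illustrative, not from a real export; in the actual format, each top-level key is a node ID, and an input is either a literal value or a `[node_id, output_index]` link to another node. The literals are the candidates for per-request customization.

```python
import json

# Illustrative excerpt of an exported ComfyUI API JSON; your
# workflow's export will have different node IDs and inputs.
api_json = """
{
  "3": {"class_type": "KSampler",
        "inputs": {"seed": 42, "steps": 20,
                   "model": ["4", 0], "latent_image": ["5", 0]}},
  "6": {"class_type": "CLIPTextEncode",
        "inputs": {"text": "a photo of a cat", "clip": ["4", 1]}}
}
"""

workflow = json.loads(api_json)

# Literal values (anything that is not a [node_id, output_index]
# link) are the parameters you can override per request.
for node_id, node in workflow.items():
    for name, value in node["inputs"].items():
        if not isinstance(value, list):
            print(f"node {node_id} ({node['class_type']}): {name} = {value!r}")
```

Running this lists `seed` and `steps` on the sampler node and `text` on the prompt node, exactly the values you would later target with overrides.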
In RunComfy, Cloud Saving packages your entire ComfyUI workflow, including its runtime environment, drivers, libraries, custom nodes, models, and dependencies, into a fully reproducible container image. This ensures your workflow runs consistently in the cloud, regardless of the underlying hardware or environment.

Cloud Saving keeps workflows deployment-ready, supports versioning for iterative updates, and enables private sharing within your team, so you can collaborate smoothly without worrying about dependency conflicts.
Note: Community workflows in RunComfy are already pre-saved with Cloud Saving, so you can use them immediately or modify and save them as your own.
A deployment turns a cloud-saved ComfyUI workflow into a serverless API endpoint. You choose the hardware (e.g., GPU type) and autoscaling settings, and RunComfy handles containerization and GPU orchestration. Your deployment becomes the production-ready interface for inference requests, identified by a unique deployment_id that you’ll use in all API calls.
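As a rough sketch of what an inference request against a deployment looks like, the snippet below assembles a request keyed by `deployment_id`. The URL path, header names, and payload fields here are assumptions for illustration only; consult RunComfy's API reference for the exact schema and endpoint.

```python
import json

DEPLOYMENT_ID = "dep_abc123"  # example placeholder from your dashboard
API_TOKEN = "your-api-token"  # example placeholder

def build_inference_request(deployment_id: str, overrides: dict):
    """Assemble URL, headers, and JSON body for one inference call.

    The path and field names are illustrative assumptions, not the
    documented RunComfy schema.
    """
    url = f"https://api.runcomfy.net/v1/deployments/{deployment_id}/inference"
    headers = {"Authorization": f"Bearer {API_TOKEN}",
               "Content-Type": "application/json"}
    body = json.dumps({"overrides": overrides}).encode()
    return url, headers, body

url, headers, body = build_inference_request(
    DEPLOYMENT_ID, {"6": {"inputs": {"text": "a sunset over the sea"}}})
# To actually send it you could use, e.g.:
#   urllib.request.urlopen(urllib.request.Request(url, body, headers))
```

Building the request as data before sending it keeps the deployment ID and overrides easy to test and log independently of the HTTP client you choose.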
An instance is a running containerized environment of your deployed workflow on a dedicated GPU. It’s the execution engine that processes inference requests using the full workflow. Instances are isolated for performance and security, configured at the deployment level, and ephemeral: they start and stop automatically based on demand, keeping costs efficient.
Scaling in RunComfy automatically adjusts the number of active instances based on workload and your deployment settings. You can control parameters like minimum/maximum instances, queue size limits, and keep-warm durations to balance cost efficiency with low-latency performance. This ensures smooth handling of bursty or unpredictable workloads.
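The interplay of these knobs can be sketched with a toy scaling function. The configuration field names below are illustrative assumptions (the real settings live in your RunComfy deployment configuration), and the sizing rule is a deliberately simplified model of queue-based autoscaling, not RunComfy's actual algorithm.

```python
import math

# Illustrative autoscaling settings; field names are assumptions.
scaling = {
    "min_instances": 0,        # scale to zero when idle to save cost
    "max_instances": 4,        # hard cap on concurrent GPU instances
    "queue_per_instance": 5,   # queued requests each instance absorbs
    "keep_warm_seconds": 300,  # idle time before an instance shuts down
}

def target_instances(queued_requests: int, cfg: dict) -> int:
    """Simplified sketch: size the fleet from queue depth, clamped
    between the configured minimum and maximum."""
    wanted = math.ceil(queued_requests / cfg["queue_per_instance"])
    return max(cfg["min_instances"], min(cfg["max_instances"], wanted))

print(target_instances(0, scaling))   # 0: idle deployments cost nothing
print(target_instances(12, scaling))  # 3: scale up with the backlog
print(target_instances(40, scaling))  # 4: capped at max_instances
```

Raising `min_instances` or `keep_warm_seconds` trades higher idle cost for lower cold-start latency on bursty traffic.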
Overrides let you customize specific workflow inputs directly in your API calls without resending the full workflow JSON each time. Using node IDs from the workflow’s API JSON, you can change values like prompts, seeds, or media inputs while leaving everything else unchanged. This makes requests lighter, faster, and easier to maintain.
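The semantics of overrides can be illustrated by merging them into a workflow locally. RunComfy applies overrides server-side, so this sketch only demonstrates the effect; the node IDs and input names are examples, not from a real workflow.

```python
import copy

# Example workflow excerpt in API JSON shape (node IDs are illustrative).
workflow = {
    "3": {"class_type": "KSampler",
          "inputs": {"seed": 42, "steps": 20}},
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a photo of a cat"}},
}

# Per-request overrides, keyed by node ID from the API JSON.
overrides = {"3": {"seed": 7}, "6": {"text": "a watercolor fox"}}

def apply_overrides(wf: dict, ov: dict) -> dict:
    """Return a copy of wf with only the overridden inputs changed;
    every other input keeps its exported value."""
    patched = copy.deepcopy(wf)
    for node_id, inputs in ov.items():
        patched[node_id]["inputs"].update(inputs)
    return patched

patched = apply_overrides(workflow, overrides)
print(patched["3"]["inputs"])  # {'seed': 7, 'steps': 20}
```

Note that `steps` survives untouched: an override request carries only the deltas, which is what keeps the payloads small and maintainable.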