The Instance Proxy API lets you call ComfyUI’s native backend endpoints on a live instance. After you submit a job via the async queue, the status response will include an instance_id once the instance is active. With that ID, you can send authenticated requests through a proxy path to perform operational tasks such as unloading models or freeing GPU memory.

When to use the proxy

Start making proxy calls once the job status is in_progress (after cold start) or completed; at that point the Status endpoint response includes the instance_id you need.

Instance proxy endpoint

Base URL: https://api.runcomfy.net
POST /prod/v2/deployments/{deployment_id}/instances/{instance_id}/proxy/{comfy_backend_path}

Path parameters

  • deployment_id: string (required)
  • instance_id: string (required)
  • comfy_backend_path: string (required) — the target ComfyUI backend route, e.g. api/free
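Putting the path parameters together, the full proxy URL can be assembled as follows. A minimal Python sketch (the deployment and instance IDs below are placeholders, not real values):

```python
BASE_URL = "https://api.runcomfy.net"

def proxy_url(deployment_id: str, instance_id: str, comfy_backend_path: str) -> str:
    """Build the full proxy URL for a target ComfyUI backend route."""
    return (
        f"{BASE_URL}/prod/v2/deployments/{deployment_id}"
        f"/instances/{instance_id}/proxy/{comfy_backend_path}"
    )

# Example with placeholder IDs:
url = proxy_url("dep_123", "inst_456", "api/free")
# → "https://api.runcomfy.net/prod/v2/deployments/dep_123/instances/inst_456/proxy/api/free"
```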

What you can call

The proxy forwards your request to the live instance. Common targets include:
  • ComfyUI backend endpoints (e.g. GET /object_info, POST /api/prompt)
  • ComfyUI Manager endpoints (e.g. POST /api/free)

Free memory / unload models

You can release GPU memory or unload models via ComfyUI Manager’s native POST /api/free endpoint. This can be useful in long-running sessions to ensure the next request starts from a clean state.

Request example: unload models only

curl --request POST \
  --url "https://api.runcomfy.net/prod/v2/deployments/{deployment_id}/instances/{instance_id}/proxy/api/free" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <token>" \
  --data '{
    "unload_models": true
  }'
Meaning in ComfyUI: Unloads currently loaded model weights (checkpoints/LoRAs/CLIP/VAE) from memory; does not clear the execution cache.
Equivalent in the ComfyUI web UI: Manager → Unload models.
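For scripted use, the same call can be made from Python. This is a sketch using only the standard library; the token and IDs are placeholders, and the helper names (build_free_payload, post_free) are illustrative, not part of the API:

```python
import json
import urllib.request

API_BASE = "https://api.runcomfy.net"  # base URL from this page

def build_free_payload(unload_models: bool = True, free_memory: bool = False) -> dict:
    """Request body for ComfyUI Manager's POST /api/free."""
    payload = {"unload_models": unload_models}
    if free_memory:
        payload["free_memory"] = True
    return payload

def post_free(deployment_id: str, instance_id: str, token: str, **opts) -> int:
    """Send the free request through the instance proxy; returns the HTTP status (200, no body)."""
    url = (f"{API_BASE}/prod/v2/deployments/{deployment_id}"
           f"/instances/{instance_id}/proxy/api/free")
    req = urllib.request.Request(
        url,
        data=json.dumps(build_free_payload(**opts)).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Calling `post_free(deployment_id, instance_id, token)` unloads models only; pass `free_memory=True` for the combined variant shown next.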
[Screenshot: Unload models in ComfyUI]

Request example: unload models and free memory

curl --request POST \
  --url "https://api.runcomfy.net/prod/v2/deployments/{deployment_id}/instances/{instance_id}/proxy/api/free" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <token>" \
  --data '{
    "unload_models": true,
    "free_memory": true
  }'
Meaning in ComfyUI: Unloads models and clears the execution cache to return cached GPU VRAM.
Equivalent in the ComfyUI web UI: Manager → Unload models and Clear execution cache (free memory).
[Screenshot: Unload models + clear execution cache in ComfyUI]

Response example

200 OK (no response body)

Lifecycle notes and errors

  • An instance_id is valid only while its instance is running. If the instance shuts down due to keep-warm/idle timeout, subsequent proxy calls will fail. Submit a new job to start a fresh instance and obtain a new instance_id.
  • Immediately after submitting a request, proxy calls may fail until the job status shows in_progress (after cold start) or completed in the status endpoint. Poll status and retry once it transitions.
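The lifecycle notes above suggest a simple pattern: poll the Status endpoint and only start proxying once the job is ready. A sketch of that readiness check and polling loop (the status strings come from this page; the helper names and the get_status callback are illustrative):

```python
import time

READY_STATUSES = {"in_progress", "completed"}  # statuses reported by the Status endpoint

def proxy_ready(status: str) -> bool:
    """Proxy calls are safe once the job has passed cold start."""
    return status in READY_STATUSES

def wait_for_instance(get_status, interval: float = 2.0, timeout: float = 120.0) -> str:
    """Poll get_status() until the proxy is usable.

    get_status is a caller-supplied callable returning (status, instance_id)
    from the Status endpoint; returns instance_id once ready.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status, instance_id = get_status()
        if proxy_ready(status):
            return instance_id
        time.sleep(interval)
    raise TimeoutError("instance did not become ready in time")
```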