Switchbox for AI apps

A model kill switch that can't go down

Ship prompts, model choice, and parameters as edge config — not a redeploy. Swap models, roll one out to 10% of traffic, or kill a misbehaving one in seconds. Your AI config keeps serving from Cloudflare's edge even if our entire backend is down.

The worst dependency an AI app can have is one that sits in its inference hot path. Switchbox, by design, does not: take down our API, the database, or the whole backend, and your prompts and model config keep serving from Cloudflare's edge. The SDK reads a static JSON file and evaluates flags locally, so we are never in your request path.

Claimable today

What you can ship right now

Model kill switch & fallback

A model erroring or burning money? Flip a flag and your backend falls back to a safe model in ~30 seconds — no redeploy, no incident call.

Gradual model migration

Move gpt-4 → claude at 10% → 50% → 100% with a rollout percentage. Deterministic per user, reversible instantly if quality drops.

Per-tenant prompt & model routing

Serve a different model or system prompt per cohort with targeting rules — enterprise tenants, beta users, or a single region.

Prompts & params as config

Keep the system prompt, temperature, and max_tokens in a json flag. Tune them from the dashboard without shipping code.
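
The kill switch and fallback pattern from the first card could be wired into a backend roughly like this. It is a minimal sketch, not part of the Switchbox SDK: the `enabled` and `fallback_model` fields, the `SAFE_DEFAULTS` values, and the `resolve_config` helper are hypothetical names chosen for illustration. The input is assumed to be the plain dict a json flag returns.

```python
from collections.abc import Mapping
from typing import Any, Optional

# Hypothetical safe defaults, used when the flag is missing or malformed.
SAFE_DEFAULTS: dict = {
    "model": "fallback-model",  # placeholder name, not a real model ID
    "temperature": 0.2,
    "max_tokens": 512,
    "system_prompt": "You are a concise support agent.",
}


def resolve_config(flag_value: Optional[Mapping[str, Any]]) -> dict:
    """Merge a flag value over safe defaults and honor a kill switch."""
    if not isinstance(flag_value, Mapping):
        # Flag missing or not a JSON object: serve the defaults as-is.
        return dict(SAFE_DEFAULTS)

    cfg = {**SAFE_DEFAULTS, **flag_value}

    # Assumed convention: "enabled": false means the primary model is killed.
    if not cfg.get("enabled", True):
        cfg["model"] = cfg.get("fallback_model", SAFE_DEFAULTS["model"])
    return cfg
```

With a helper like this, flipping `enabled` to false in the dashboard changes which model the backend calls on the next config refresh, and a missing flag still leaves the app with a working configuration.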

How it looks

Your model config in a flag, read server-side

A json flag holds the prompt and parameters. Your backend reads it with a zero-dependency SDK and calls the model — change the config from the dashboard and every server picks it up within ~30 seconds.

assistant_config · json flag
{
  "model": "claude-sonnet-4-6",
  "temperature": 0.7,
  "max_tokens": 1024,
  "system_prompt": "You are a concise support agent."
}
server.py · your backend
from switchbox import Switchbox

client = Switchbox(sdk_key="your-server-sdk-key")

# Model + params come from a flag — change them with no deploy.
cfg = client.get_value("assistant_config", user={"user_id": user_id})

# `llm` stands in for your own model client.
response = llm.chat(
    model=cfg["model"],
    temperature=cfg["temperature"],
    max_tokens=cfg["max_tokens"],
    system=cfg["system_prompt"],
    messages=messages,
)

To migrate models gradually, put the new model behind a rollout — a deterministic slice of users gets it until you trust it, then dial to 100%.
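
To make "a deterministic slice of users" concrete, here is one common way percentage rollouts are implemented: hash the flag key and user ID into a fixed bucket and compare it with the rollout percentage. This illustrates the general technique only; it is an assumption that Switchbox works along these lines, and `in_rollout` and `pick_model` are hypothetical helpers, not SDK functions.

```python
import hashlib


def in_rollout(flag_key: str, user_id: str, percentage: float) -> bool:
    """Return True if this user falls inside the rollout slice."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a stable bucket in [0, 10000).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    # 10% covers buckets 0..999, 100% covers every bucket.
    return bucket < percentage * 100


def pick_model(user_id: str, percentage: float, old: str, new: str) -> str:
    """Choose the new model for users inside the slice, the old one otherwise."""
    return new if in_rollout("assistant_config", user_id, percentage) else old
```

Because the bucket depends only on the flag key and user ID, a user keeps the same model across requests, and anyone included at 10% is still included at 50%. Dialing the percentage back down reverses the rollout for the same users.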

Read this first

What Switchbox is not

The honest boundaries — so you know exactly where Switchbox fits in an AI stack and where it doesn't.

Config, not secrets

Never put a model API key (OpenAI, Anthropic, …) in a flag. Switchbox is delivery config, not a secrets manager.

Server-side, by design

Flag JSON can be read by anyone who holds its key, so read it from your backend, not from a browser bundle, where the key (and your prompts) would be public.

We deliver, you measure

Switchbox is the control plane that ships the config. It has no LLM observability — pair it with your eval/analytics tool (Langfuse, PostHog, Helicone) to see which variant wins.

A few prompts, not a library

The config is static edge JSON, best for a handful of prompts. A large versioned prompt library per environment is outside the architecture's sweet spot.
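
The "Config, not secrets" boundary can be kept in code by loading the provider API key from the server's environment and taking only model settings from the flag. A minimal sketch, assuming a hypothetical `LLM_API_KEY` environment variable and a hypothetical `load_api_key` helper:

```python
import os


def load_api_key(env_var: str = "LLM_API_KEY") -> str:
    """Read the provider API key from the environment, never from a flag."""
    key = os.environ.get(env_var, "")
    if not key:
        # Fail loudly at startup rather than sending unauthenticated requests.
        raise RuntimeError(f"{env_var} is not set")
    return key
```

The flag then carries only values that are safe to expose, such as the model name and temperature, while the key stays in whatever secrets store your backend already uses.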

Put your prompts on the edge

Create a workspace, add a json flag, and read it from your backend in minutes.