Omnisphere is an OpenAI-compatible multi-LLM router. Use it as a drop-in replacement for the OpenAI API.
Base URL: http://YOUR_SERVER:3005
Auth: Authorization: Bearer omni_your_key
All endpoints except /api/status, /v1/models, and /api/pricing require an API key.
Authorization: Bearer omni_your_api_key_here
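As a minimal sketch of the rule above, the following Python helper attaches the Bearer header only for endpoints that need it. `BASE_URL` is the placeholder from this page; the helper names and the `/v1/chat/completions` path in the usage note are assumptions for illustration, not part of Omnisphere.

```python
import urllib.request
from typing import Optional

BASE_URL = "http://YOUR_SERVER:3005"  # placeholder from this page
# The three endpoints documented as not requiring an API key.
PUBLIC_PATHS = {"/api/status", "/v1/models", "/api/pricing"}


def requires_auth(path: str) -> bool:
    """Return True when the endpoint needs an API key."""
    return path not in PUBLIC_PATHS


def build_request(path: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request, adding the Bearer header if needed."""
    headers = {}
    if requires_auth(path):
        if not api_key:
            raise ValueError(f"{path} requires an API key")
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(BASE_URL + path, headers=headers)
```

For example, `build_request("/v1/models")` carries no credentials, while `build_request("/v1/chat/completions", "omni_your_key")` carries the header shown above (assuming the chat endpoint lives at the conventional OpenAI path).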
| Tier | Rate Limit | Monthly Budget | Best For |
|---|---|---|---|
| Free | 60 req/min | $1.00 | Testing, hobby projects |
| Pro | 120 req/min | $25.00 | Production apps |
| Enterprise | 300 req/min | $50.00+ | High-volume, custom |
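To stay under the per-minute limits in the table, a client can throttle itself before sending. The sketch below is a client-side sliding-window limiter; how the server actually counts requests (fixed or sliding window) is not documented here, so treat the 60-second window as an assumption. The class and method names are hypothetical.

```python
from collections import deque

# Requests per minute, copied from the tier table above.
TIER_LIMITS = {"free": 60, "pro": 120, "enterprise": 300}


class SlidingWindowLimiter:
    """Client-side guard: allow at most `limit` sends in any 60-second span."""

    def __init__(self, tier: str):
        self.limit = TIER_LIMITS[tier]
        self.sent = deque()  # timestamps (seconds) of recent sends

    def try_acquire(self, now: float) -> bool:
        """Record a send at time `now` if the window has room; else return False."""
        # Drop timestamps that have aged out of the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) >= self.limit:
            return False
        self.sent.append(now)
        return True
```

In real use, pass `time.monotonic()` as `now` and sleep briefly whenever `try_acquire` returns `False`.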
OpenAI-compatible chat endpoint. Drop-in replacement.
{
"model": "claude-sonnet-4", // or any model from /v1/models
"messages": [
{"role": "user", "content": "Hello, world!"}
],
"max_tokens": 1024 // optional, default 1024
}
Response: Standard OpenAI format + _meta with cost/latency info.
{
"id": "chatcmpl-1234",
"model": "claude-sonnet-4",
"choices": [{"message": {"role": "assistant", "content": "..."}}],
"usage": {"prompt_tokens": 10, "completion_tokens": 50},
"_meta": {"source": "live", "provider": "anthropic", "cost_usd": 0.001, "tier": "balanced"}
}
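A short Python sketch of working with this request and response shape: one helper builds the request body, the other pulls the reply text and the `_meta` cost fields out of a response. The helper names are hypothetical, and the code only handles the fields shown in the samples above.

```python
import json


def build_chat_payload(model: str, user_text: str, max_tokens: int = 1024) -> dict:
    """Build a chat request body; 1024 mirrors the documented default."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "max_tokens": max_tokens,
    }


def summarize_response(body: str) -> dict:
    """Extract the reply text, token total, and _meta cost info from a response."""
    data = json.loads(body)
    usage = data.get("usage", {})
    meta = data.get("_meta", {})  # Omnisphere-specific extension
    return {
        "text": data["choices"][0]["message"]["content"],
        "total_tokens": usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0),
        "provider": meta.get("provider"),
        "cost_usd": meta.get("cost_usd"),
    }
```

Because the body follows the OpenAI format, existing OpenAI client code should only need its base URL and key changed; `_meta` is extra and can be ignored by clients that do not need it.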
Auto-routed single query. Omnisphere picks the optimal model based on complexity.
{
"prompt": "What is the meaning of life?",
"tier": "balanced", // ultraCheap | cheap | balanced | premium
"max_tokens": 512 // optional
}
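The tier value is easy to mistype (note the camel case in `ultraCheap`), so it can help to validate it before sending. This is a minimal sketch; `build_query_payload` is a hypothetical helper, and whether the server rejects unknown tiers is not stated here.

```python
from typing import Optional

# Tier names exactly as listed in the request sample above.
VALID_TIERS = ("ultraCheap", "cheap", "balanced", "premium")


def build_query_payload(prompt: str, tier: str, max_tokens: Optional[int] = None) -> dict:
    """Build an auto-routed query body, rejecting unknown tier names early."""
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown tier {tier!r}; expected one of {VALID_TIERS}")
    payload = {"prompt": prompt, "tier": tier}
    if max_tokens is not None:  # max_tokens is optional, so omit it when unset
        payload["max_tokens"] = max_tokens
    return payload
```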
Query multiple models simultaneously and get a weighted synthesis.
{
"prompt": "Compare React vs Vue for enterprise apps",
"tier": "premium", // auto-selects 3 models from this tier
"models": ["claude-sonnet-4", "gpt-4o-mini", "gemini-2.0-flash"] // or specify
}
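When listing models explicitly, a client-side check against the known model IDs catches typos before the request is sent. The sketch below hard-codes the IDs from the model table on this page; in practice the list from /v1/models is the better source. The helper name is hypothetical.

```python
from typing import List, Optional

# Model IDs copied from the model table on this page.
KNOWN_MODELS = {
    "claude-sonnet-4", "claude-haiku-3.5", "gpt-4o-mini", "gpt-4o",
    "gemini-2.0-flash", "gemini-2.5-pro", "minimax-m2.5", "deepseek-v3.2",
}


def build_multi_payload(prompt: str, tier: Optional[str] = None,
                        models: Optional[List[str]] = None) -> dict:
    """Build a multi-model query body from a tier, an explicit model list, or both."""
    if tier is None and not models:
        raise ValueError("provide a tier, a list of models, or both")
    payload = {"prompt": prompt}
    if tier is not None:
        payload["tier"] = tier
    if models:
        unknown = sorted(set(models) - KNOWN_MODELS)
        if unknown:
            raise ValueError(f"unknown model IDs: {unknown}")
        payload["models"] = list(models)
    return payload
```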
List all available models with pricing. No auth required.
Server health check. No auth required.
Detailed pricing for all models and tiers. No auth required.
Provider health status from SmartRouter. Shows which models are healthy/failing.
Aggregate metrics per model: calls, latency, cost, success rate.
Tests every model and reports live/demo status. Use to verify which providers are working.
| Model ID | Provider | Input $/M | Output $/M | Quality | Speed |
|---|---|---|---|---|---|
| claude-sonnet-4 | Anthropic | $3.00 | $15.00 | ⭐9 | Fast |
| claude-haiku-3.5 | Anthropic | $0.80 | $4.00 | ⭐7 | Fast |
| gpt-4o-mini | OpenAI | $0.15 | $0.60 | ⭐7 | Fast |
| gpt-4o | OpenAI | $2.50 | $10.00 | ⭐9 | Medium |
| gemini-2.0-flash | Google | $0.075 | $0.30 | ⭐7 | Fast |
| gemini-2.5-pro | Google | $1.25 | $5.00 | ⭐9 | Slow |
| minimax-m2.5 | MiniMax | $0.11 | $0.11 | ⭐8 | Fast |
| deepseek-v3.2 | DeepSeek | $0.14 | $0.28 | ⭐8 | Fast |
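The per-million-token prices above translate into a per-request cost as prompt tokens times the input price plus completion tokens times the output price, divided by one million. The following sketch applies that arithmetic to a few rows of the table; whether Omnisphere adds any markup on top of these prices is not stated here, so the result is an estimate only.

```python
# (input $/M tokens, output $/M tokens), copied from the table above.
PRICES = {
    "claude-sonnet-4": (3.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "deepseek-v3.2": (0.14, 0.28),
}


def estimate_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request from its token usage."""
    input_price, output_price = PRICES[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000
```

For instance, 10 prompt tokens and 50 completion tokens on claude-sonnet-4 come to (10 × 3.00 + 50 × 15.00) / 1,000,000 = $0.00078.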
| Tier | Avg Cost | Models Used | Best For |
|---|---|---|---|
| ultraCheap | $0.0001 | MiniMax, DeepSeek, Gemini Flash | Bulk, simple tasks |
| cheap | $0.0005 | Gemini Flash, GPT-4o-mini, MiniMax | Standard Q&A |
| balanced | $0.003 | Claude Sonnet, GPT-4o-mini, Gemini Flash | Coding, analysis |
| premium | $0.01 | Claude Sonnet, GPT-4o, Gemini Pro | Research, strategy |
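Combining these averages with a monthly budget gives a rough idea of how many queries a plan covers. The sketch below assumes "Avg Cost" means average cost per query, which this page does not state outright; it uses `Decimal` so the division is exact.

```python
from decimal import Decimal

# Average cost per tier, copied from the table above (assumed to be per query).
TIER_AVG_COST = {
    "ultraCheap": Decimal("0.0001"),
    "cheap": Decimal("0.0005"),
    "balanced": Decimal("0.003"),
    "premium": Decimal("0.01"),
}


def queries_per_budget(budget_usd: str, tier: str) -> int:
    """Return how many average-cost queries fit in the given budget."""
    return int(Decimal(budget_usd) // TIER_AVG_COST[tier])
```

Under that assumption, a $1.00 budget covers about 333 balanced queries or 10,000 ultraCheap ones, and a $25.00 budget covers about 2,500 premium queries.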
If you don't specify a model, Omnisphere analyzes the complexity of your prompt and routes the request automatically.
The SmartRouter tracks provider health and automatically routes around failures. If Claude is down, queries go to GPT; if both are down, they go to Gemini. If all direct APIs fail, OpenRouter is used as a universal fallback.
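The fallback order described above can be sketched as a walk down an ordered list that skips unhealthy providers. This is an illustration of the idea only, not SmartRouter's actual code: the provider keys, the function name, and the choice to treat unlisted providers as healthy are all assumptions, and only the providers named in the paragraph are included.

```python
from typing import Dict, Optional

# Order taken from the description: Claude, then GPT, then Gemini,
# with OpenRouter as the last resort.
FALLBACK_ORDER = ["anthropic", "openai", "google", "openrouter"]


def pick_provider(healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first healthy provider in fallback order, or None if all are down."""
    for provider in FALLBACK_ORDER:
        # Assumption: a provider with no recorded status counts as healthy.
        if healthy.get(provider, True):
            return provider
    return None
```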
// 401 — Bad or missing API key
{"error": {"message": "API key required", "type": "auth_error"}}
// 429 — Rate limit exceeded
{"error": {"message": "Rate limit exceeded (60/min)", "type": "rate_limit"}}
// 402 — Monthly budget exhausted
{"error": {"message": "Monthly budget exhausted ($1.00/$1.00)", "type": "budget_exceeded"}}
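A client can map these three status codes to distinct exceptions so callers can react differently, for example waiting on 429 but stopping on 402. The sketch below parses the error body shown above; the exception class names are hypothetical.

```python
import json


class OmnisphereError(Exception):
    """Base error carrying the HTTP status and the server's error type."""

    def __init__(self, message: str, status: int, error_type: str = ""):
        super().__init__(message)
        self.status = status
        self.error_type = error_type


class AuthError(OmnisphereError): pass
class RateLimitError(OmnisphereError): pass
class BudgetExceededError(OmnisphereError): pass

_BY_STATUS = {401: AuthError, 429: RateLimitError, 402: BudgetExceededError}


def error_from_response(status: int, body: str) -> OmnisphereError:
    """Turn an error response into the matching exception (not raised here)."""
    try:
        err = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        err = {}  # body was not a JSON object
    if not isinstance(err, dict):
        err = {}
    cls = _BY_STATUS.get(status, OmnisphereError)
    return cls(err.get("message", f"HTTP {status}"), status, err.get("type", ""))
```

A caller would typically `raise error_from_response(status, body)` for any non-2xx response and catch `RateLimitError` separately to retry after a pause.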