MoltBot Docs
Everything you need to deploy, manage, and scale autonomous AI agent fleets on dedicated GPU infrastructure.
- Quickstart · Deploy your first agent in 5 minutes
- Swarm Guide · Orchestrate multi-agent pipelines
- API Reference · Full REST API documentation
- Model Routing · 7 AI models, auto-optimized
Quickstart
Get a MoltBot agent running in under 5 minutes. You'll need a MoltBot Cloud account and an API key.
1. Install the MoltBot SDK
```bash
npm install @moltbot/sdk
# or
pip install moltbot
```
2. Initialize your client
```js
import MoltBot from '@moltbot/sdk';

const client = new MoltBot({
  apiKey: process.env.MOLTBOT_API_KEY,
  region: 'us-east-1',
});
```
3. Deploy your first agent
```js
const agent = await client.agents.deploy({
  name: 'my-first-agent',
  model: 'auto',       // Smart model routing
  tier: 'base',        // RTX 4090, $49/mo
  task: 'code-review',
});

console.log(`Agent deployed: ${agent.id}`);
// → Agent deployed: mb-a7x9f2
```
4. Run a task
```js
const result = await agent.run({
  prompt: 'Review this PR and identify bugs',
  context: { pr_url: 'https://github.com/...' },
});

console.log(result.output);
```
Core Concepts
🤖 Agents
An agent is a persistent process running on a dedicated GPU VM. Each agent has its own memory, tool access, and model configuration. Agents run 24/7 and can be given recurring tasks or respond to triggers.
🌐 Swarms
A swarm is a coordinated group of agents working on a shared objective. MoltBot's orchestrator automatically assigns subtasks, manages dependencies, and aggregates results. Available on Swarm Matrix tier.
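MoltBot's scheduler is internal, but the core idea of dependency-managed subtasks can be sketched as a simple dependency-ordering pass. The task names and object shapes below are hypothetical, purely to illustrate how an orchestrator decides what can run next:

```javascript
// Conceptual sketch: order subtasks so every task runs only after its
// dependencies. Task names are made up; MoltBot's real scheduler is internal.
function orderByDependencies(tasks) {
  const done = new Set();
  const ordered = [];
  let progressed = true;
  while (ordered.length < tasks.length && progressed) {
    progressed = false;
    for (const task of tasks) {
      if (done.has(task.name)) continue;
      if (task.deps.every((d) => done.has(d))) {
        ordered.push(task.name);
        done.add(task.name);
        progressed = true;
      }
    }
  }
  if (ordered.length < tasks.length) throw new Error('dependency cycle');
  return ordered;
}

const plan = orderByDependencies([
  { name: 'summarize', deps: ['fetch-docs'] },
  { name: 'fetch-docs', deps: [] },
  { name: 'report', deps: ['summarize', 'review'] },
  { name: 'review', deps: ['fetch-docs'] },
]);
console.log(plan);
```

Tasks whose dependencies are all satisfied can be dispatched to agents in parallel; here `fetch-docs` unblocks both `summarize` and `review`, and `report` runs last.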
🧠 Memory
Each agent has persistent vector memory powered by ChromaDB. Memory survives reboots and can be shared across agents in the same organization. Semantic search enables agents to recall relevant context from past runs.
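The "recall relevant context" step boils down to nearest-neighbor search over embeddings. The toy 3-dimensional vectors below are invented for illustration; in a real deployment ChromaDB stores model-generated embeddings, but the ranking principle (cosine similarity) is the same:

```javascript
// Illustration of semantic recall: rank stored memory entries by cosine
// similarity to a query embedding. Vectors here are made up; real agents
// use ChromaDB with high-dimensional model embeddings.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function recall(memory, queryVec, k = 2) {
  return memory
    .map((m) => ({ text: m.text, score: cosine(m.vec, queryVec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((m) => m.text);
}

const memory = [
  { text: 'PR #14 fixed a null-pointer bug', vec: [0.9, 0.1, 0.0] },
  { text: 'Weekly report sent to #general', vec: [0.0, 0.2, 0.9] },
  { text: 'Lint rules updated for TypeScript', vec: [0.7, 0.6, 0.1] },
];
// Query vector standing in for an embedding of "bug fixes"
const top = recall(memory, [1, 0, 0]);
```

The two code-related memories score highest against the code-flavored query, so they are returned ahead of the unrelated status update.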
⚡ Auto Model Routing
MoltBot's arbitrage engine automatically selects the cheapest model capable of completing your task. Set `model: 'auto'` to save up to 90% on inference costs without sacrificing quality.
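"Cheapest capable model" can be pictured as a filter-then-minimize step. The prices and capability tags below are illustrative assumptions, not MoltBot's actual routing table:

```javascript
// Sketch of cost-based routing: filter to models that cover the task's
// required capabilities, then pick the cheapest. All prices and capability
// tags are hypothetical examples.
const MODELS = [
  { name: 'qwen-3.5-free', pricePerMTok: 0.0, caps: ['chat', 'summarize'] },
  { name: 'kimi-k2', pricePerMTok: 0.6, caps: ['chat', 'summarize', 'multilingual'] },
  { name: 'gpt-5.4', pricePerMTok: 5.0, caps: ['chat', 'summarize', 'code'] },
  { name: 'claude-opus-4.6', pricePerMTok: 15.0, caps: ['chat', 'summarize', 'code', 'reasoning'] },
];

function route(requiredCaps) {
  const capable = MODELS.filter((m) =>
    requiredCaps.every((c) => m.caps.includes(c))
  );
  if (capable.length === 0) throw new Error('no capable model');
  return capable.reduce((best, m) =>
    m.pricePerMTok < best.pricePerMTok ? m : best
  ).name;
}

console.log(route(['summarize']));         // free tier covers it
console.log(route(['code']));              // cheapest code-capable model
console.log(route(['code', 'reasoning'])); // only the top model qualifies
```

Simple tasks fall through to the free model, while demanding capability sets escalate to pricier models only when required.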
Supported Models
| Model | Provider | Context | Best For |
|---|---|---|---|
| Claude Opus 4.6 | Anthropic | 200K | Complex reasoning |
| GPT-5.4 | OpenAI | 128K | Code generation |
| Gemini 2.5 Pro | Google | 1M | Long context analysis |
| Qwen 3.5 Free | NVIDIA NIM | 32K | Fast, cost-free routing |
| Kimi k2 | Moonshot | 128K | Multilingual tasks |
| Grok 3 | xAI | 131K | Real-time search |
| DeepSeek R2 | DeepSeek | 128K | Mathematics, STEM |
GPU Tiers
Base — RTX 4090 (24GB VRAM) · $49/mo
1 persistent agent. Ideal for code review, content generation, data analysis, and single-threaded automation.
- 24GB VRAM, 128GB system RAM
- Qwen 3.5 local + cloud model routing
- 10GB persistent storage
Swarm Matrix — A100 80GB · $149/mo
3 concurrent agents with full model access. Smart routing across all 7 providers. Best for teams and complex pipelines.
- 80GB VRAM, 256GB RAM
- All 7 AI models
- 50GB persistent storage + ChromaDB memory
Enterprise — 4× H100 Cluster · $299/mo
10 concurrent agents, Omnisphere pipeline builder, priority support, and custom SLA.
- 4× H100 80GB cluster, 512GB RAM
- 200GB storage + dedicated vector DB
- Visual pipeline builder, webhook triggers, custom models
Plans & Billing
MoltBot uses monthly subscription billing via Stripe. Usage is metered for API calls above your plan's included allocation. You can upgrade, downgrade, or cancel at any time from the customer portal.
Check your current usage via the API:

```http
GET /api/usage
Authorization: Bearer YOUR_API_KEY
```

```json
{
  "plan": "swarm",
  "period_start": "2026-04-01",
  "api_calls_used": 12840,
  "api_calls_included": 50000,
  "overage_cost": "$0.00"
}
```