5 million AI requests included at $49/mo. One simple rate if you go over. Budget caps so you never get a surprise bill.
Every plan includes all AI providers with automatic failover.
| Feature | Starter | Growth | Scale |
|---|---|---|---|
| Base price | $49/mo | $149/mo | Custom |
| Included tokens | 5M | 20M | Volume |
| Overage rate | $3/M tokens | $2.50/M tokens | $2/M tokens |
| Projects | 1 | 5 | Unlimited |
| All AI providers | Yes | Yes | Yes |
| Smart routing | Basic | Yes | Dedicated |
| Budget caps | Yes | Yes | Yes |
| Per-client tracking | -- | Yes | Yes |
| Cost dashboards | -- | Yes | Yes |
| White-label | -- | -- | Yes |
| PII scrubbing | -- | -- | Yes |
| SLA | -- | -- | 99.9% |
| Log retention | 30 days | 60 days | 1 year |
| Support | Priority email | Dedicated + Slack |
Everything you need to know about AI Gateway billing.
Every plan includes a base monthly fee and a generous bundle of included tokens. Starter includes 5M tokens, Growth includes 20M tokens. Once you exceed your included tokens, you are billed at a simple per-million-token overage rate. No separate input/output billing -- one rate for all tokens. You can monitor your usage in real-time from your dashboard.
We keep it simple. Other platforms charge different rates for input and output tokens, making costs hard to predict. We charge one flat rate per million tokens. Smart routing means most of your tokens go through cost-effective models like Gemini Flash and Haiku, so your effective cost stays low while quality stays high.
Your service continues without interruption. Additional tokens are billed at your plan's simple overage rate -- $3/M tokens on Starter, $2.50/M on Growth, or $2/M on Scale. You will receive alerts at 80% and 95% of your included allocation so there are no surprises. You can also set hard budget caps to prevent overage entirely.
At $49/mo with 5M tokens included, Gateway is cheaper than Portkey ($49/mo + you still pay provider costs separately) and Helicone ($60/mo + provider costs). With direct API access you pay per-call with no routing optimization. Our smart routing automatically sends requests to the most cost-effective model for each task, saving you 60-80% compared to using premium models for everything.
Yes. Upgrades take effect immediately and you pay the prorated difference. Downgrades take effect at the start of your next billing cycle. You can change plans from your dashboard or by contacting us. No cancellation fees.
No, included tokens reset each billing cycle. This keeps pricing simple and predictable. If you consistently use fewer tokens than your plan includes, consider a lower tier and rely on the usage-based overage pricing for occasional spikes.
All major providers: OpenAI (GPT-4, GPT-4o, GPT-4o-mini), Anthropic (Claude Sonnet, Opus, Haiku), Google (Gemini Pro, Flash), Groq, Together AI, and more. One API endpoint, one bill, automatic failover between providers.
Growth and Scale plans include per-client attribution. Tag each API request with a client identifier and our dashboard breaks down token usage, cost, and model selection per client. Perfect for agencies billing clients for AI usage.
Budget caps let you set a hard spending limit so you never get surprised by overage charges. When you hit your cap, requests are paused until the next billing cycle or until you raise the limit. Every plan includes budget caps as a standard feature.
We offer a 14-day trial on all plans. Book a discovery call and we will get you set up. No credit card required to start the conversation.
Start with 5M tokens at $49/mo. No input/output complexity.
Book a 30-minute discovery call and we will find the right plan.