Frequently Asked Questions

Everything you need to know about AI Gateway billing.

How does usage-based billing work?

Every plan includes a base monthly fee and a bundle of included tokens: 5M on Starter and 20M on Growth. Once you exceed your included tokens, you are billed at a flat per-million-token overage rate. Input and output tokens are not billed separately; one rate applies to all tokens. You can monitor your usage in real time from your dashboard.
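As a rough illustration of the arithmetic, the sketch below estimates a monthly bill from the published Starter and Growth figures. The function and dictionary names are made up for this example, and it assumes overage is charged proportionally per token rather than rounded up to whole millions, which this page does not specify.

```python
# Plan figures copied from the plan comparison on this page.
# Scale is left out because its base price is custom.
PLANS = {
    "starter": {"base_usd": 49.0, "included_tokens": 5_000_000, "overage_usd_per_million": 3.00},
    "growth": {"base_usd": 149.0, "included_tokens": 20_000_000, "overage_usd_per_million": 2.50},
}


def monthly_bill(plan: str, tokens_used: int) -> float:
    """Estimate one month's bill: base fee plus overage on tokens beyond the bundle."""
    p = PLANS[plan]
    overage_tokens = max(0, tokens_used - p["included_tokens"])
    # Assumption: overage is prorated per token, not rounded up to the next million.
    overage_usd = overage_tokens / 1_000_000 * p["overage_usd_per_million"]
    return round(p["base_usd"] + overage_usd, 2)


print(monthly_bill("starter", 8_000_000))  # 3M tokens over the bundle at $3/M
print(monthly_bill("growth", 10_000_000))  # within the bundle, base fee only
```

For example, 8M tokens on Starter comes to the $49 base fee plus 3M tokens of overage at $3/M, or $58.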

Why is there no separate input vs output token rate?

We keep it simple. Other platforms charge different rates for input and output tokens, making costs hard to predict. We charge one flat rate per million tokens. Smart routing means most of your tokens go through cost-effective models like Gemini Flash and Haiku, so your effective cost stays low while quality stays high.

What happens if I exceed my included tokens?

Your service continues without interruption. Additional tokens are billed at your plan's overage rate: $3/M tokens on Starter, $2.50/M on Growth, or $2/M on Scale. You will receive alerts at 80% and 95% of your included allocation so there are no surprises. You can also set hard budget caps to prevent overage entirely.
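If you want to mirror the 80% and 95% alerts in your own monitoring, a minimal sketch might look like the following. It is not how the Gateway itself sends alerts; the function name and the idea of comparing usage before and after a request are assumptions for illustration.

```python
# Alert points stated in the answer above, as whole percentages.
ALERT_PERCENTAGES = (80, 95)


def crossed_alerts(included_tokens: int, used_before: int, used_after: int) -> list[int]:
    """Return the alert percentages crossed when usage moves from used_before to used_after."""
    # Integer arithmetic avoids floating-point edge cases exactly at a threshold.
    return [
        pct
        for pct in ALERT_PERCENTAGES
        if used_before * 100 < pct * included_tokens <= used_after * 100
    ]


# Starter bundle (5M tokens): moving from 3.9M to 4.1M used crosses the 80% mark.
print(crossed_alerts(5_000_000, 3_900_000, 4_100_000))
```

Checking the transition rather than the current level means each alert fires once, when the threshold is first crossed.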

How does this compare to Portkey, Helicone, or direct API access?

At $49/mo with 5M tokens included, Gateway is cheaper than Portkey ($49/mo, with provider costs billed separately) and Helicone ($60/mo plus provider costs). With direct API access you pay per call with no routing optimization. Our smart routing automatically sends each request to the most cost-effective model for the task, saving you 60-80% compared to using premium models for everything.

Can I upgrade or downgrade at any time?

Yes. Upgrades take effect immediately and you pay the prorated difference. Downgrades take effect at the start of your next billing cycle. You can change plans from your dashboard or by contacting us. No cancellation fees.

Do unused tokens roll over?

No, included tokens reset each billing cycle. This keeps pricing simple and predictable. If you consistently use fewer tokens than your plan includes, consider a lower tier and rely on the usage-based overage pricing for occasional spikes.
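To see where a lower tier stops being cheaper, the sketch below compares Starter and Growth on price alone, using the figures in the plan comparison. The helper names are hypothetical, overage is assumed to be prorated per token, and the comparison ignores the features that differ between plans.

```python
def bill(base_usd: float, included_m: float, rate_usd_per_m: float, used_m: float) -> float:
    """Monthly cost for used_m million tokens: base fee plus prorated overage."""
    return base_usd + max(0.0, used_m - included_m) * rate_usd_per_m


def starter(used_m: float) -> float:
    return bill(49.0, 5.0, 3.00, used_m)


def growth(used_m: float) -> float:
    return bill(149.0, 20.0, 2.50, used_m)


# Below 20M tokens Growth is a flat $149, and Starter would only reach $149
# at about 38.3M tokens, so the two lines cross while both are in overage.
# Solve 49 + 3.00 * (m - 5) = 149 + 2.50 * (m - 20) for m:
break_even_m = ((149.0 - 2.50 * 20.0) - (49.0 - 3.00 * 5.0)) / (3.00 - 2.50)

print(break_even_m)                          # break-even volume in millions of tokens
print(starter(30.0), growth(30.0))           # cost of each plan at 30M tokens
```

Under these assumptions the two plans cost the same at 130M tokens per month; below that, Starter plus overage is the lower bill, so the choice between them turns mainly on the Growth features rather than on token price.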

Which AI providers are supported?

All major providers: OpenAI (GPT-4, GPT-4o, GPT-4o-mini), Anthropic (Claude Sonnet, Opus, Haiku), Google (Gemini Pro, Flash), Groq, Together AI, and more. One API endpoint, one bill, automatic failover between providers.

How do I track usage per client?

Growth and Scale plans include per-client attribution. Tag each API request with a client identifier and our dashboard breaks down token usage, cost, and model selection per client. Perfect for agencies billing clients for AI usage.
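The dashboard does this breakdown for you, but the idea can be sketched in a few lines if you export usage and want to allocate a bill yourself. The record fields (`client_id`, `tokens`) and function names here are hypothetical and are not the Gateway's actual request or export format.

```python
from collections import defaultdict


def usage_by_client(records: list[dict]) -> dict[str, int]:
    """Sum tokens per client tag; requests without a tag are grouped as 'untagged'."""
    totals: dict[str, int] = defaultdict(int)
    for record in records:
        totals[record.get("client_id") or "untagged"] += record["tokens"]
    return dict(totals)


def split_bill(total_usd: float, totals: dict[str, int]) -> dict[str, float]:
    """Allocate a bill across clients in proportion to their token usage."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {client: 0.0 for client in totals}
    return {
        client: round(total_usd * tokens / grand_total, 2)
        for client, tokens in totals.items()
    }


# Hypothetical usage records, one per tagged request.
records = [
    {"client_id": "acme", "tokens": 3_000_000},
    {"client_id": "globex", "tokens": 1_000_000},
    {"client_id": "acme", "tokens": 1_000_000},
]
totals = usage_by_client(records)
print(totals)
print(split_bill(149.0, totals))
```

Because each share is rounded to cents, the allocated amounts may differ from the bill total by a cent or two on uneven splits.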

What are budget caps?

Budget caps let you set a hard spending limit so you never get surprised by overage charges. When you hit your cap, requests are paused until the next billing cycle or until you raise the limit. Every plan includes budget caps as a standard feature.
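When picking a cap, it can help to translate a dollar limit into the token volume it allows. The sketch below assumes the cap applies to total monthly spend including the base fee and that overage is prorated per token; this page does not confirm either detail, and the function name is made up for the example.

```python
def tokens_allowed_under_cap(
    base_usd: float,
    included_tokens: int,
    overage_usd_per_million: float,
    cap_usd: float,
) -> int:
    """Tokens usable in a cycle before a total-spend cap would pause requests."""
    # Assumption: the cap covers the base fee plus overage, so it cannot be below the base fee.
    if cap_usd < base_usd:
        raise ValueError("cap must be at least the base monthly fee")
    overage_budget_usd = cap_usd - base_usd
    overage_tokens = int(overage_budget_usd / overage_usd_per_million * 1_000_000)
    return included_tokens + overage_tokens


# Starter ($49 base, 5M included, $3/M overage) with a $100 cap.
print(tokens_allowed_under_cap(49.0, 5_000_000, 3.00, 100.0))
```

Setting the cap equal to the base fee leaves no overage budget, which matches the option of preventing overage entirely.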

Is there a free trial?

We offer a 14-day trial on all plans. Book a discovery call and we will get you set up. No credit card required to start the conversation.

Compare Plans

Feature	Starter	Growth	Scale
Base price	$49/mo	$149/mo	Custom
Included tokens	5M	20M	Volume
Overage rate	$3/M tokens	$2.50/M tokens	$2/M tokens
Projects	1	5	Unlimited
All AI providers	Yes	Yes	Yes
Smart routing	Basic	Yes	Dedicated
Budget caps	Yes	Yes	Yes
Per-client tracking	--	Yes	Yes
Cost dashboards	--	Yes	Yes
White-label	--	--	Yes
PII scrubbing	--	--	Yes
SLA	--	--	99.9%
Log retention	30 days	60 days	1 year
Support	Email	Priority email	Dedicated + Slack