Founding Partner pricing available — Limited spots
Join 25+ Founding Partners

Your AI bill shouldn't be a mystery

One API key. All providers. Flat monthly pricing. Built for agencies and builders tired of unpredictable AI costs.

OpenAI bill jumped from $47 to $312 — One viral feature, no warning, margin gone
Managing 4 different API keys — OpenAI, Claude, Gemini... each with separate billing
Guessing what to charge clients — No per-client tracking = awkward invoice conversations

The fix

$99/mo flat — Know costs before you ship
One API key — OpenAI + Claude + Gemini
Per-client tracking — Export and invoice in seconds

Join as a Founding Partner

$499/year (58% off $1,188) — Price locked forever

No credit card required • 14-day free trial


What is an LLM Gateway?

An LLM Gateway is a unified API layer that routes requests to multiple AI providers (OpenAI, Anthropic, Google) through a single endpoint. Unlike direct API access, a gateway provides cost controls, usage tracking, automatic failover, and intelligent routing that can reduce costs by 40-50% while maintaining response quality.

What is Intelligent LLM Routing?

Intelligent LLM routing automatically analyzes each request and selects the most cost-effective model capable of handling it. Simple tasks like classification route to GPT-4o-mini ($0.15/1M tokens), while complex reasoning routes to Claude Sonnet ($3/1M tokens). This achieves 48.7% cost savings at 100% quality parity.

What is AI Bill Shock?

AI bill shock occurs when automation workflows consume unexpected amounts of LLM API tokens, resulting in surprise charges. Common causes include runaway loops, verbose prompts, and lack of usage monitoring. A single misconfigured Make.com scenario can drain $200+ in minutes without proper safeguards.

Frequently Asked Questions

How do I prevent AI bill shock?
AI Gateway prevents bill shock through budget alerts, spending caps, and automatic kill switches. Set a monthly limit, get notified at 80% usage, and automatically pause requests at 100%. You can also set per-project limits to prevent any single automation from draining your budget.
What's the cheapest LLM for classification tasks?
GPT-4o-mini is currently the most cost-effective at $0.15/1M input tokens and $0.60/1M output tokens. For classification, categorization, and simple extraction tasks, it performs at near-GPT-4o quality while being roughly 17x cheaper ($0.15 vs $2.50 per 1M input tokens). AI Gateway can automatically route these tasks to the cheapest suitable model.
How much does GPT-4o cost per 1 million tokens?
GPT-4o costs $2.50 per 1M input tokens and $10.00 per 1M output tokens. For comparison, GPT-4o-mini is $0.15/$0.60, and Claude 3.5 Sonnet is $3.00/$15.00. AI Gateway's intelligent routing can reduce your total costs by 40-50% by using cheaper models for simpler tasks.
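Those per-1M-token prices translate into per-request costs with simple arithmetic; a quick sketch using the rates quoted above:

```python
# Per-1M-token prices quoted above: (input $/1M, output $/1M).
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3.5-sonnet": (3.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single request."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A 2,000-token prompt with a 500-token reply:
#   GPT-4o:      2000 * 2.50/1M + 500 * 10.00/1M = $0.0100
#   GPT-4o-mini: 2000 * 0.15/1M + 500 *  0.60/1M = $0.0006
```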
Is Claude or GPT-4 cheaper for summarization?
GPT-4o is cheaper for summarization at $2.50/1M input tokens vs Claude Sonnet at $3/1M. However, Claude often requires fewer tokens to achieve the same quality summary. AI Gateway automatically tests both and routes to whichever provides the best cost-quality ratio for your specific use case.
How do I track AI costs per client?
AI Gateway includes built-in per-client tracking. Tag each request with a client ID or project name, and the dashboard shows usage and costs broken down by client. Export to CSV for invoicing, set per-client budgets, and see which clients are most/least profitable.
What happens if I exceed my token budget?
Your requests automatically pause to prevent surprise bills. You get email alerts at 80% and 90% usage. At 100%, requests return a budget_exceeded error instead of consuming more tokens. You can upgrade your plan anytime or purchase additional tokens at overage rates.
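Client code can treat a budget_exceeded error as a pause signal rather than a hard failure. A minimal sketch — the exact JSON error shape here is an assumption, not a documented AI Gateway response format:

```python
# Hypothetical error handling; assumes the gateway returns a JSON body
# with an "error" object carrying a "code" field on budget exhaustion.
def handle_response(resp: dict):
    """Pause the workflow on budget exhaustion instead of dropping work."""
    err = resp.get("error", {})
    if err.get("code") == "budget_exceeded":
        return ("paused", "Monthly budget reached; upgrade or wait for reset.")
    return ("ok", resp.get("content", ""))
```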
How does intelligent routing reduce LLM costs?
Intelligent routing analyzes each request's complexity and automatically selects the cheapest model that can handle it. Simple tasks (classification, extraction) use GPT-4o-mini, while complex tasks (reasoning, code generation) use more powerful models. This achieves 40-50% cost savings while maintaining quality.
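The routing idea can be sketched in a few lines. The task categories, prompt-length threshold, and model choices below are illustrative assumptions, not AI Gateway's actual rules:

```python
# Illustrative complexity-based routing; thresholds are assumptions.
SIMPLE_TASKS = {"classification", "extraction", "categorization"}

def route(task_type: str, prompt: str) -> str:
    """Pick the cheapest model likely to handle the request well."""
    if task_type in SIMPLE_TASKS and len(prompt) < 4000:
        return "gpt-4o-mini"        # $0.15/1M input tokens
    return "claude-3.5-sonnet"      # $3.00/1M input tokens
```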
What's the ROI of using an LLM gateway?
Most customers see 10X ROI within 90 days. The Pro plan is $99/month with 3M tokens included, and the savings stack: intelligent routing (40-50% off provider costs), bill shock prevention, and the hours saved managing multiple provider accounts and invoices. Typical ROI is 10-15X in the first quarter.
What is an LLM gateway?
An LLM gateway is a unified API that sits between your application and multiple AI providers (OpenAI, Anthropic, Google). It provides a single endpoint for all providers, automatic failover, cost tracking, budget controls, and intelligent routing. Think of it as a "reverse proxy" for LLM APIs.
How do I switch from OpenAI to AI Gateway?
Change two lines of code: replace your OpenAI API key with your Gateway key, and set the base_url to gateway.resultantai.com/v1. Your existing code works unchanged—same function calls, same response format. Migration takes 5 minutes. No code rewrite needed.
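With OpenAI's official Python SDK, the two-line switch might look like this — a configuration sketch, assuming the gateway endpoint is OpenAI-compatible as described:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GATEWAY_KEY",                     # line 1: was your OpenAI key
    base_url="https://gateway.resultantai.com/v1",  # line 2: was the OpenAI default
)

# Everything below is unchanged from existing OpenAI code.
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```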
Does AI Gateway work with Make.com?
Yes. In Make.com, use the HTTP module to call AI Gateway's API instead of OpenAI's API directly. You get the same responses, but with built-in budget controls, per-scenario tracking, and automatic failover. We also provide Make.com blueprint templates to get started faster.
Is AI Gateway compatible with the OpenAI SDK?
Yes, 100% compatible. AI Gateway implements the OpenAI API specification, so any code using the official OpenAI Python/Node.js SDK works by just changing the base_url. This includes streaming, function calling, vision, and all other OpenAI features.
What happens when OpenAI goes down?
AI Gateway automatically fails over to Anthropic (Claude) or Google (Gemini) within milliseconds. Your requests continue working with zero downtime. The failover is transparent—same API call, same response format, just a different underlying provider. You never notice the switch.
How does automatic failover work?
When a provider returns an error (downtime, rate limits, timeout), AI Gateway immediately retries with an equivalent model from a different provider. GPT-4o fails over to Claude Sonnet, GPT-4o-mini to Claude Haiku. Response format stays the same, so your code doesn't break.
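The retry-with-fallback behavior described above can be sketched as follows. The FALLBACKS table and `call_provider` are stand-ins for illustration, not AI Gateway internals:

```python
# Conceptual failover sketch; `call_provider` stands in for the real
# HTTP request to a provider.
FALLBACKS = {
    "gpt-4o": "claude-3.5-sonnet",
    "gpt-4o-mini": "claude-3-haiku",
}

class ProviderError(Exception):
    """Raised on provider downtime, rate limits, or timeouts."""

def call_with_failover(model, prompt, call_provider):
    """Retry once with an equivalent model from another provider."""
    try:
        return call_provider(model, prompt)
    except ProviderError:
        fallback = FALLBACKS.get(model)
        if fallback is None:
            raise
        return call_provider(fallback, prompt)
```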
What's the latency overhead of using a gateway?
Typically 15-30ms overhead for routing logic. For a GPT-4o request that takes 800ms total, that's ~2-4% slower. The trade-off: automatic failover means your requests never fail due to provider downtime (which costs you hours, not milliseconds). For most use cases, the reliability is worth the tiny latency.
Can I use my own API keys with AI Gateway?
On Enterprise plans, yes. You can provide your own OpenAI/Anthropic/Google keys and AI Gateway handles routing, monitoring, and failover while billing goes to your provider accounts. This is useful for companies with existing Enterprise agreements or credit commitments.
Is GPT-4o-mini good enough for customer support?
Yes, for 70-80% of customer support queries. GPT-4o-mini handles FAQ responses, simple troubleshooting, and information retrieval excellently at 1/20th the cost of GPT-4o. For complex technical support or escalated issues, route to GPT-4o. AI Gateway can automatically decide which to use based on query complexity.
Which LLM is best for code generation?
Claude 3.5 Sonnet currently leads for code generation, especially for full-stack applications. GPT-4o is slightly better for Python data science. For simple code snippets or refactoring, GPT-4o-mini works at 1/10th the cost. AI Gateway can route based on code complexity automatically.
What model should I use for data extraction?
GPT-4o-mini is perfect for structured data extraction from documents, emails, or web pages. It's 95%+ accurate for extraction tasks at $0.15/1M tokens (vs $2.50 for GPT-4o). Only use more expensive models if extraction requires deep reasoning or ambiguous interpretation.
How do agencies bill clients for AI usage?
Most agencies either include AI in their retainer (predict avg usage) or bill actual costs plus markup (typically 20-50%). AI Gateway's per-client tracking makes this easy: export monthly usage by client, apply your markup, add to invoice. Some agencies bundle "X tokens included" in packages.
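The cost-plus approach is simple arithmetic once per-client costs are exported; a minimal sketch with an assumed 30% markup:

```python
def client_invoice(actual_cost: float, markup: float = 0.30) -> float:
    """Return the amount to invoice for one client's monthly AI usage."""
    return round(actual_cost * (1 + markup), 2)

# $42.10 of tracked usage with a 30% markup becomes a $54.73 invoice line.
```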
Can I set different models for different tasks?
Yes. Tag requests with task type (classification, generation, etc.), and AI Gateway routes accordingly. Or use intelligent routing to automatically select the best model based on request content. You can also manually specify model per request while keeping budget controls and tracking.
What's the difference between AI Gateway and Portkey?
AI Gateway includes tokens ($99/mo gets you 3M tokens). Portkey charges a $49/mo platform fee, and you pay provider costs separately. For 3M tokens, that's roughly $49 platform + $75+ provider = $125+ total, versus AI Gateway's $99 all-in. AI Gateway is better for predictable pricing.
Is AI Gateway better than using OpenAI directly?
If you need budget controls, cost tracking, or use multiple providers—yes. OpenAI direct is simpler for single-provider use under 1M tokens/month. But once you hit bill shock, need per-client tracking, or want automatic failover, a gateway pays for itself. Most customers switch after their first surprise $300+ bill.
How does AI Gateway compare to Helicone?
Helicone focuses on observability and logging (~$60/mo). AI Gateway focuses on cost management and includes tokens. If you mainly need analytics, Helicone. If you need budget controls + token inclusions + intelligent routing, AI Gateway. Some customers use both (AI Gateway for routing, Helicone for logging).
Should I use LiteLLM or AI Gateway?
LiteLLM (open source) is free but requires self-hosting, DevOps, and monitoring. AI Gateway is fully managed—no servers to maintain. If you have a DevOps team and want full control, use LiteLLM. If you want a managed service with included tokens and zero ops overhead, use AI Gateway.

Still comparing solutions?

See how AI Gateway stacks up against Portkey, Helicone, LiteLLM, and direct API access.

View All Comparisons →