Guaranteed 30% off your current AI inference bill for teams spending $200 or more per month.

Book a call →

Flat monthly pricing for AI inference

One fixed monthly price. No billing surprises. No usage calculations.
Start free and scale when you're ready.

Free

For developers getting started with Oxlo.ai.

$0/month
Limit:
  • 60 requests / day
  • Requests may be queued behind paid plans
Get started for free

What you get:
  • Access to 12+ open-source models
  • Clear usage limits
  • No credit card required

Pro

For developers building and shipping AI-powered products.

$14.90/month (was $35/month)
Limit:
  • 300 requests / day
  • Optimized models under 8B parameters
Subscribe now

7-day free trial

Everything in Free, plus:
  • Faster request handling
  • Access to optimized models for development and prototyping
  • Higher throughput for development workloads

Premium

For teams running production workloads.

$49.90/month (was $80/month)
Limit:
  • 2,000 requests / day
  • Production-grade performance
Subscribe now

7-day free trial

Everything in Pro, plus:
  • Priority execution
  • Higher, more consistent throughput
  • All large reasoning models, including DeepSeek R1 and Kimi K2

FOR HIGH-VOLUME TEAMS

Enterprise

For teams ready to cut their AI infrastructure costs significantly.

OUR COMMITMENT

30% off your current AI bill.

Guaranteed. For teams spending $200 or more per month on AI inference with any provider.

Book a Call →

No commitment. 30-minute conversation.

Everything in Premium, plus:
  • Custom usage limits
  • Dedicated support
  • Tailored deployment options

Compare the plans

All plans use request-based pricing. No token calculations.

Feature                              | Free        | Pro        | Premium             | Enterprise

Usage & Limits
Requests included                    | 60 / day    | 300 / day  | High request limits | Custom
Burst rate limit                     | 5 / min     | 30 / min   | 120 / min (tunable) | Custom
Monthly request cap                  | Yes         | Yes        | None                | Custom
Request priority level               | Lowest      | Standard   | High                | Dedicated

Models & Performance
Optimized models over 8B             | No          | Limited    | Yes                 | Yes
Production-grade inference           | No          | No         | Yes                 | Yes
Priority execution                   | Lowest      | Medium     | Highest             | Optional
Average response latency             | ≤ 7 seconds | ≤ 1 second | ≤ 100 ms            | Tunable

Request & Context Limits
(Caps are for safety and performance, not billing)
Input tokens / request               | Up to 8K    | Up to 16K  | Up to 32K           | Custom (up to 128K)
Output tokens / request              | Up to 2K    | Up to 4K   | Up to 8K            | Custom (up to 128K)

Pricing & Billing
Request-based pricing                | Yes         | Yes        | Yes                 | Yes
Token-based billing                  | No          | No         | No                  | No
Fixed monthly limits                 | Yes         | Yes        | Yes                 | Custom
Usage limits visible upfront         | Yes         | Yes        | Yes                 | Yes

Developer Experience
Open-source models                   | Yes         | Yes        | Yes                 | Yes
Simple API integration               | Yes         | Yes        | Yes                 | Yes
Model-agnostic pricing               | Yes         | Yes        | Yes                 | Yes
Support level                        | Community   | Community  | Priority            | Dedicated

Infrastructure & Technical Differentiation
Gateway-level request metering       | Yes         | Yes        | Yes                 | Yes
Pricing independent of prompt length | Yes         | Yes        | Yes                 | Yes
Traffic prioritization by plan       | No          | Yes        | Yes                 | Yes
Async and batch-friendly workloads   | Yes         | Yes        | Yes                 | Yes

Based on current platform design and publicly available competitor offerings. Features may evolve over time.
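
Because each plan lists a burst rate limit per minute, a client may want to pace its own calls so it stays under that limit. The sketch below is one possible client-side approach using a sliding window. It is not part of any Oxlo.ai SDK; `BurstThrottle`, `BURST_LIMITS`, and the 60-second window are hypothetical names and assumptions made for illustration, and how the gateway actually enforces the limit is not described on this page.

```python
import time
from collections import deque

# Burst limits per minute as listed in the plan comparison.
# The dictionary itself is a hypothetical helper, not an official constant.
BURST_LIMITS = {"free": 5, "pro": 30, "premium": 120}


class BurstThrottle:
    """Client-side sliding window: at most `limit` calls per `window` seconds."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the class can be tested
        self.sent = deque()     # timestamps of calls inside the window

    def wait_time(self):
        """Seconds to wait before the next call fits in the window."""
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self):
        """Note that a call was just sent."""
        self.sent.append(self.clock())
```

A caller would check `wait_time()`, sleep for that long if it is non-zero, send the request, and then call `record()`.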

Pricing FAQ

With Oxlo.ai's request-based pricing, you pay a flat monthly subscription that includes a set number of API requests per day. Each request costs the same regardless of how many tokens are in your prompt or response. A 100-token prompt costs the same as a 50,000-token prompt. This is fundamentally different from token-based pricing used by OpenAI, Together AI, Fireworks AI, OpenRouter, and Replicate.

For teams running long-context or reasoning-model workloads, Oxlo.ai is typically the cheaper option. Together AI, Fireworks AI, and OpenRouter all charge per token, so costs scale linearly with prompt length. Running 500 API calls per day with 3,000-token prompts costs approximately $40-60/month on these providers versus $49.90/month on Oxlo.ai Premium, so at that prompt size the two are roughly comparable. As prompt length increases beyond 10,000 tokens, Oxlo.ai can be 10-100x cheaper, since every request costs the same flat rate.
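
The comparison above comes down to simple arithmetic, sketched here in Python. The per-million-token price is a placeholder chosen to land inside the $40-60 range quoted above, not a real rate from any provider. The calculation also assumes a 30-day month and counts prompt tokens only.

```python
FLAT_PREMIUM_USD = 49.90          # Oxlo.ai Premium price from this page
ASSUMED_USD_PER_MILLION = 1.00    # hypothetical token price, for illustration only


def token_based_monthly_cost(calls_per_day, tokens_per_call,
                             usd_per_million_tokens, days=30):
    """Monthly cost when billing scales with the number of tokens sent."""
    total_tokens = calls_per_day * tokens_per_call * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


for tokens in (3_000, 10_000, 50_000):
    cost = token_based_monthly_cost(500, tokens, ASSUMED_USD_PER_MILLION)
    ratio = cost / FLAT_PREMIUM_USD
    print(f"{tokens:>6} tokens/call: ${cost:,.2f} token-based, "
          f"{ratio:.1f}x the flat price")
```

Under these assumptions the 3,000-token workload comes to $45 per month, close to the flat price, while the 50,000-token workload comes to $750. The exact multiple depends entirely on the token price you plug in.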

New users get a 7-day free trial with full access to all 40+ models, including Qwen 3 32B, Llama 3.3 70B, DeepSeek R1, and premium image generation. No credit card is required to start. The Free tier (60 requests/day, 16+ models) is available permanently.

When you reach your daily request limit, additional requests are queued until the next day, or you can upgrade your plan for higher limits. There are no overage charges, so your costs are always predictable and fixed. This is unlike token-based providers, where a single runaway prompt can spike your bill.
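
Since requests beyond the daily limit are queued rather than billed, an application may prefer to track its own usage and decide locally what to defer. The following is a minimal sketch of such a counter. `DailyQuota` is a hypothetical helper, and resetting on the local calendar date is an assumption; this page does not say at what time the daily limit resets.

```python
from datetime import date


class DailyQuota:
    """Counts requests against a per-day limit and resets when the date changes."""

    def __init__(self, limit, today=date.today):
        self.limit = limit
        self.today = today      # injectable so the class can be tested
        self.day = today()
        self.used = 0

    def _roll_over(self):
        # Assumption: the quota resets on the local calendar date.
        current = self.today()
        if current != self.day:
            self.day = current
            self.used = 0

    def remaining(self):
        self._roll_over()
        return self.limit - self.used

    def try_consume(self):
        """Return True and count the request if quota is left, else False."""
        self._roll_over()
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


quota = DailyQuota(limit=300)   # Pro plan: 300 requests / day
```

When `try_consume()` returns False, the application can hold the work until the next day or surface an upgrade prompt, which mirrors the two options described above.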

You can upgrade or downgrade your plan at any time. When upgrading, you get immediate access to the higher plan's limits. All plans are billed monthly with no long-term contracts required.

Teams currently spending $200 or more per month on AI inference with providers like Together AI, Fireworks AI, or OpenRouter are eligible for our Enterprise plan, which guarantees a minimum 30% reduction on their current monthly bill. Contact us at hello@oxlo.ai to discuss your current usage.