Integrating Oxlo.ai with Other AI Tools and Services: A Step-by-Step Guide

Most production AI stacks are not monoliths. They are pipelines that route prompts across models, vector stores, automation platforms, and monitoring layers. The friction in building these pipelines usually comes from incompatible APIs, hidden token costs, and cold starts that break synchronous workflows. Oxlo.ai removes that friction by offering a fully OpenAI-compatible inference platform with request-based pricing, a broad model catalog, and zero cold starts on popular models. Because the Oxlo.ai API follows the same schema as OpenAI, you can integrate it into existing toolchains with a configuration change rather than a code rewrite: point the client at the Oxlo.ai base URL, then supply an Oxlo.ai API key and model name.

Drop-in Replacement with the OpenAI SDK

The fastest way to integrate Oxlo.ai is through the official OpenAI SDKs. Oxlo.ai exposes the standard chat/completions, embeddings, images/generations, audio/transcriptions, and audio/speech endpoints at https://api.oxlo.ai/v1. This means any Python, Node.js, or cURL script written for OpenAI works without structural changes.

Here is a minimal Python example that routes requests to Llama 3.3 70B:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="YOUR_OXLO_API_KEY"
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Explain request-based pricing."}],
    stream=False
)
print(response.choices[0].message.content)
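For anything beyond a quick test, the key is better kept out of source code. The following is a minimal sketch, assuming the key is stored in an environment variable named OXLO_API_KEY; the helper name oxlo_client_kwargs is hypothetical and not part of any SDK.

```python
import os

OXLO_BASE_URL = "https://api.oxlo.ai/v1"  # base URL given in this guide


def oxlo_client_kwargs(env=None):
    """Build keyword arguments for OpenAI(...) from environment variables.

    The helper name and the OXLO_API_KEY variable name are illustrative
    assumptions, not part of the OpenAI SDK or a documented Oxlo.ai API.
    """
    env = os.environ if env is None else env
    api_key = env.get("OXLO_API_KEY")
    if not api_key:
        raise RuntimeError("Set OXLO_API_KEY before creating the client")
    return {"base_url": OXLO_BASE_URL, "api_key": api_key}


# Usage (requires the openai package and network access):
#   from openai import OpenAI
#   client = OpenAI(**oxlo_client_kwargs())
```

Failing fast on a missing key tends to produce a clearer error than the authentication failure that would otherwise surface on the first request.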

In Node.js, the pattern is the same. This example also enables streaming:

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.oxlo.ai/v1',
  apiKey: process.env.OXLO_API_KEY,
});

const stream = await client.chat.completions.create({
  model: 'deepseek-r1-671b',
  messages: [{ role: 'user', content: 'Write a React hook.' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Because Oxlo.ai supports streaming, JSON mode, function calling, vision, and multi-turn conversations out of the box, you do not lose functionality when you switch the base URL.
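Streaming works the same way from Python. The sketch below shows one way to assemble streamed deltas into a full string, assuming the chunks have the OpenAI chat-completions shape (choices[0].delta.content). The helper names collect_stream and fake_chunk are hypothetical, and the fake chunks exist only so the snippet runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Concatenate the text deltas from an iterable of streamed chat chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices at all
            continue
        delta = chunk.choices[0].delta
        # content can be None on role-only or tool-call deltas
        parts.append(getattr(delta, "content", None) or "")
    return "".join(parts)


def fake_chunk(text):
    """Stand-in for a streamed chunk, used only to exercise the helper offline."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo"), SimpleNamespace(choices=[])]
print(collect_stream(demo))  # prints "Hello"
```

With a real client, pass the object returned by client.chat.completions.create(..., stream=True) in place of demo.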

Connecting Orchestration Frameworks

Frameworks like LangChain and LlamaIndex ship OpenAI-style integrations. You can point them to Oxlo.ai by overriding the base URL and API key in the constructor.

With LangChain in Python:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="qwen3-32b",
    base_url="https://api.oxlo.ai/v1",
    api_key="YOUR_OXLO_API_KEY",
    temperature=0.2
)

response = llm.invoke("Draft an email to the engineering team.")
print(response.content)  # the reply text is on the returned message's .content

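For LlamaIndex, use the OpenAI-compatible LLM class, OpenAILike, which is installed separately as llama-index-llms-openai-like. The stock OpenAI class looks up context sizes by model name and can reject names it does not recognize:

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="kimi-k2-6",
    api_key="YOUR_OXLO_API_KEY",
    api_base="https://api.oxlo.ai/v1",
    is_chat_model=True  # route calls through the chat/completions endpoint
)

response = llm.complete("Summarize the following legal text...")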

This compatibility extends to retrieval-augmented generation chains, agents, and evaluation pipelines. You can mix Oxlo.ai models for different subtasks, such as routing simple queries to DeepSeek V4 Flash and complex reasoning to DeepSeek R1 671B MoE, all within the same framework.
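The routing idea can be sketched as a small function that picks a model ID before the request is made. This is only an illustrative heuristic, not a recommended policy: the function name pick_model, the keyword list, and the word threshold are all assumptions, and "deepseek-v4-flash" is an assumed ID for DeepSeek V4 Flash (only "deepseek-r1-671b" appears elsewhere in this guide).

```python
import re

# "deepseek-r1-671b" is used earlier in this guide;
# "deepseek-v4-flash" is an ASSUMED model ID, check the catalog for the real one.
FAST_MODEL = "deepseek-v4-flash"
REASONING_MODEL = "deepseek-r1-671b"

# Whole-word matches only, so "improve" does not trigger on "prove".
REASONING_PATTERN = re.compile(r"\b(prove|derive|step by step)\b")


def pick_model(prompt, max_fast_words=40):
    """Return a model ID using a crude heuristic: long prompts, or prompts
    containing a reasoning keyword, go to the larger model."""
    text = prompt.lower()
    if len(text.split()) > max_fast_words:
        return REASONING_MODEL
    if REASONING_PATTERN.search(text):
        return REASONING_MODEL
    return FAST_MODEL
```

The returned ID can be passed as the model argument to ChatOpenAI, to a LlamaIndex LLM, or directly to client.chat.completions.create. In practice a classifier or a cheap first-pass model call may route better than keywords.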

Building Agentic Workflows with Function Calling

Modern agents rely on tool-use loops. Each iteration can inflate the prompt, because prior tool outputs are fed back into context. On token-based providers, this makes agentic workloads expensive and hard to predict. Oxlo.ai uses request-based pricing: one flat cost per API request regardless of prompt length. That can make long-context agent loops cheaper, and their cost easier to forecast, than on providers that bill per token, such as Together AI, Fireworks AI, OpenRouter, Replicate, or Anyscale.
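To see why prompt growth matters, the arithmetic can be sketched as follows. The model is deliberately simplified: it counts prompt tokens only, assumes every tool output adds the same number of tokens, and takes all prices as inputs. No figure here is an actual price from Oxlo.ai or any other provider, and the function names are hypothetical.

```python
def total_prompt_tokens(steps, base_tokens, tokens_per_tool_output):
    """Prompt tokens summed over an agent loop in which step i resends the
    base prompt plus the outputs of the i previous tool calls."""
    return sum(base_tokens + i * tokens_per_tool_output for i in range(steps))


def token_billed_cost(steps, base_tokens, tokens_per_tool_output, usd_per_million_tokens):
    """Cost of the loop under per-token billing (prompt tokens only)."""
    tokens = total_prompt_tokens(steps, base_tokens, tokens_per_tool_output)
    return tokens * usd_per_million_tokens / 1_000_000


def request_billed_cost(steps, usd_per_request):
    """Cost of the same loop under flat per-request billing."""
    return steps * usd_per_request


def break_even_request_price(steps, base_tokens, tokens_per_tool_output, usd_per_million_tokens):
    """Per-request price at which both billing models cost the same."""
    return token_billed_cost(
        steps, base_tokens, tokens_per_tool_output, usd_per_million_tokens
    ) / steps


# A 10-step loop with a 2,000-token base prompt and 1,500 tokens per tool output:
print(total_prompt_tokens(10, 2000, 1500))  # prints 87500
```

Because the token total grows roughly with the square of the number of steps while the request count grows linearly, the break-even per-request price rises as loops get longer. Plug in real prices from each provider's pricing page before drawing conclusions.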

Oxlo.ai supports function calling across its chat models. Below is a Python pattern that registers a weather tool:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

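# The source listing is cut off after "role"; the user message below is an illustrative placeholder.
response = client.chat.completions.create(
    model="glm-5",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
)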
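When the model answers with a tool call instead of text, the application has to run the function and send the result back. A minimal sketch of that dispatch step follows, assuming tool calls arrive in the OpenAI chat-completions shape (id, function.name, function.arguments as a JSON string). The get_weather body is a stub, and TOOL_REGISTRY and run_tool_calls are hypothetical names.

```python
import json
from types import SimpleNamespace


def get_weather(location):
    """Hypothetical local implementation of the tool; returns stub data."""
    return {"location": location, "forecast": "unknown (stub)"}


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls):
    """Execute each requested tool and return one role="tool" message per call."""
    messages = []
    for call in tool_calls:
        func = TOOL_REGISTRY.get(call.function.name)
        if func is None:
            result = {"error": f"unknown tool {call.function.name}"}
        else:
            # Arguments arrive as a JSON-encoded string
            args = json.loads(call.function.arguments or "{}")
            result = func(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,  # ties the result to the originating call
            "content": json.dumps(result),
        })
    return messages


# Offline stand-in for response.choices[0].message.tool_calls
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"location": "Paris"}'),
)
tool_messages = run_tool_calls([fake_call])
```

With a live response, append the assistant message that contains the tool calls to the conversation, then these tool messages, and call client.chat.completions.create again so the model can write its final answer.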
