
We are building a document intelligence agent that turns raw, unstructured text into structured JSON covering sentiment, entities, urgency, and a summary. If you process support tickets, user feedback, or news clips, this replaces a patchwork of single-purpose NLP pipelines with one consistent API call.
What you'll need
- Python 3.10 or newer
- The OpenAI SDK: pip install openai
- An Oxlo.ai API key from https://portal.oxlo.ai
Step 1: Set up the Oxlo.ai client
I always start by verifying the endpoint. Because Oxlo.ai is fully OpenAI SDK compatible, I do not need a custom library. I point the base URL to https://api.oxlo.ai/v1, plug in my key, and ask Llama 3.3 70B for a quick sanity check. I like this model as a baseline because it balances speed and instruction adherence for English text. If the sanity check returns a completion, the pipeline is live and I can move on to the prompt.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ.get("OXLO_API_KEY")
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Reply with 'Connection OK' and nothing else."},
    ],
)
print(response.choices[0].message.content)
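One small guard I would consider here: os.environ.get returns None when the variable is unset, so a missing key tends to surface later as a less obvious client or authentication error. The sketch below fails fast instead. The helper name require_api_key is my own and not part of any SDK.

```python
import os


def require_api_key(var_name: str = "OXLO_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    Hypothetical helper for illustration; the variable name matches the one
    used in the client setup above.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set or is empty.")
    return key
```

With this in place, the client could be constructed with api_key=require_api_key() so a misconfigured environment stops the script before any request is sent.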
Step 2: Define the system prompt
The system prompt is the schema contract. I keep it strict: exact field names, allowed values, and no markdown fences. This lets me parse the output with standard JSON libraries instead of regex. I also include a fallback rule for ambiguous text so the model has no reason to invent a sentiment label outside the allowed set. I keep the prompt concise out of habit, though Oxlo.ai does not charge by the token, so I do not need to micro-optimize length for cost reasons.
SYSTEM_PROMPT = """You are an NLP processing engine. Analyze the user's text and return a single JSON object with exactly these keys:
- sentiment: one of [positive, negative, neutral]
- entities: list of objects, each with keys "name" and "type"
- summary: one sentence, maximum 20 words
- urgency: integer from 1 (low) to 5 (critical)
- topics: list of up to 3 descriptive strings
Rules:
1. Do not wrap the JSON in markdown code blocks.
2. Do not add explanation before or after the JSON.
3. If no entities are present, return an empty list for entities.
4. If the text is ambiguous, use your best judgment and set sentiment to neutral."""
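A prompt is only a request, so it can help to check the same contract on the client side. The following is a minimal sketch under my own assumptions: the function name validate_analysis and the wording of the messages are hypothetical, and the checks simply mirror the field rules listed in the prompt.

```python
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def validate_analysis(result: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the result conforms."""
    problems = []

    if result.get("sentiment") not in ALLOWED_SENTIMENTS:
        problems.append("sentiment must be positive, negative, or neutral")

    entities = result.get("entities")
    if not isinstance(entities, list) or not all(
        isinstance(e, dict) and {"name", "type"}.issubset(e) for e in entities
    ):
        problems.append("entities must be a list of objects with name and type")

    summary = result.get("summary")
    if not isinstance(summary, str) or len(summary.split()) > 20:
        problems.append("summary must be a string of at most 20 words")

    urgency = result.get("urgency")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(urgency, bool) or not isinstance(urgency, int) or not 1 <= urgency <= 5:
        problems.append("urgency must be an integer from 1 to 5")

    topics = result.get("topics")
    if (
        not isinstance(topics, list)
        or len(topics) > 3
        or not all(isinstance(t, str) for t in topics)
    ):
        problems.append("topics must be a list of up to 3 strings")

    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a single response.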
Step 3: Build the analysis function
Now I wrap the call in a reusable function. I set response_format to json_object so the model knows I want valid JSON, and I pin the model to llama-3.3-70b because it follows instruction formatting reliably at this size. If I were processing multilingual documents, I would switch to qwen-3-32b without changing any other code. I import json outside the function so I am not re-importing inside the loop later.
import json

def analyze_text(text: str) -> dict:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
    )
    raw = response.choices[0].message.content
    return json.loads(raw)
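If a model ever wraps its answer in a code fence or adds a stray sentence despite the rules, json.loads on the raw string will fail. As a fallback, and only as a sketch, the parse can be limited to the outermost braces without reaching for regex. The name extract_json_object is hypothetical.

```python
import json


def extract_json_object(raw: str) -> dict:
    """Parse the outermost {...} span in raw, ignoring any text around it.

    Raises ValueError if no braces are found or the span is not valid JSON
    (json.JSONDecodeError is a subclass of ValueError).
    """
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(raw[start : end + 1])
```

This could stand in for the json.loads(raw) call in analyze_text. It is deliberately naive: surrounding text that itself contains braces would still break the parse.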
Step 4: Harden for production
Real pipelines fail on network blips or unexpected formatting. I add exception handling around the API call and the JSON parse so one bad document does not kill the batch. I validate that every required key is present before returning the dict. If validation fails, I raise a ValueError that names the missing keys; the except block catches it along with every other failure and returns an error dict instead of crashing. I also truncate the raw input to 200 characters in the error payload so I can debug while limiting how much sensitive data ends up in it. In a real deployment, you might log the full input to an internal observability stack.
def analyze_text_safe(text: str) -> dict:
    try:
        response = client.chat.completions.create(
            model="llama-3.3-70b",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": text},
            ],
            response_format={"type": "json_object"},
        )
        raw = response.choices[0].message.content
        result = json.loads(raw)
        required = {"sentiment", "entities", "summary", "urgency", "topics"}
        if not required.issubset(result.keys()):
            raise ValueError(f"Missing keys: {required - set(result.keys())}")
        return result
    except Exception as exc:
        return {"error": str(exc), "raw_input": text[:200]}
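For transient failures such as network blips, a retry with exponential backoff is a common complement to the error dict. The sketch below is generic and makes assumptions of its own: the helper name with_retries, the default of three attempts, and the one-second base delay are illustrative choices, not values from this pipeline.

```python
import time
from typing import Callable


def with_retries(
    fn: Callable[[], dict],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fn until it succeeds, waiting base_delay * 2**n between failed tries.

    The last exception is re-raised once all attempts are used up. The sleep
    parameter exists so tests can replace time.sleep with a stub.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_exc: Exception | None = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            # Do not sleep after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2**attempt)
    assert last_exc is not None
    raise last_exc


# Example (assumes analyze_text from Step 3 is defined):
# result = with_retries(lambda: analyze_text(ticket))
```

Because analyze_text_safe swallows exceptions and returns an error dict, the retry has to wrap a function that raises, such as analyze_text from Step 3. Retrying on every exception also repeats the call after a malformed JSON response, which may or may not be what you want.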
Step 5: Batch process documents
For my use case, I need to process a list of support tickets. Because Oxlo.ai uses flat per-request pricing, I do not pay extra for long tickets or large system prompts: a verbose ticket with a full stack trace costs the same as a one-line message. This keeps the architecture simple: one ticket, one request, one structured result. I process tickets sequentially here. If you need higher throughput, you can run the calls in a ThreadPoolExecutor, or switch to the SDK's AsyncOpenAI client and use asyncio.gather (the synchronous client would block the event loop). Oxlo.ai has no cold starts on popular models, so concurrent requests do not trigger warmup latency. If you need deeper reasoning on complex legal or medical text, swapping the model ID to kimi-k2.6 or deepseek-v3.2 is a single-line change.
tickets = [
    "I love the new dashboard, but the export button disappears on mobile. Please fix it soon.",
    "URGENT: Our production API key stopped working and all integrations are down. We are losing revenue.",
    "Thanks for the quick turnaround on my billing question. Great support experience.",
]

results = []
for ticket in tickets:
    result = analyze_text_safe(ticket)
    results.append(result)
    print(json.dumps(result, indent=2))
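The thread pool option mentioned above can be sketched in a few lines. This is an illustration rather than a tuned implementation: the function name analyze_batch and the default of four workers are my own assumptions, and the analysis function is passed in as a parameter so the sketch does not depend on the client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def analyze_batch(
    texts: list[str],
    analyze: Callable[[str], dict],
    max_workers: int = 4,
) -> list[dict]:
    """Run analyze over texts in a thread pool and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which call finishes first.
        return list(pool.map(analyze, texts))


# Example (assumes analyze_text_safe and tickets from the steps above):
# results = analyze_batch(tickets, analyze_text_safe)
```

Passing analyze_text_safe is a natural fit because it never raises, so one failed ticket shows up as an error dict in its slot instead of aborting the whole map. Any rate limits on your plan would still apply, so max_workers is worth keeping modest.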
Run it
Putting it all together, here is the complete script and the output I get when I run it against the sample tickets. Execution takes under a second per ticket on Llama 3.3 70B with no cold starts. The output below is representative. Your exact JSON may vary slightly in entity names or topic wording, but the schema and types will remain consistent.
import os
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ.get("OXLO_API_KEY")
)

SYSTEM_PROMPT = """You are an NLP processing engine. Analyze the user's text and return a single JSON object with exactly these keys:
- sentiment: one of [positive, negative, neutral]
- entities: list of objects, each with keys "name" and "type"
- summary: one sentence, maximum 20 words
- urgency: integer from 1 (low) to 5 (critical)
- topics: list of up to 3 descriptive strings
Rules:
1. Do not wrap the JSON in markdown code blocks.
2. Do not add explanation before or after the JSON.
3. If no entities are present, return an empty list for entities.
4. If the text is ambiguous, use your best judgment and set sentiment to neutral."""

def analyze_text_safe(text: str) -> dict:
    try:
        response = client.chat.completions.create(
            model="llama-3.3-70b",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": text},
            ],
            response_format={"type": "json_object"},
        )
        raw = response.choices[0].message.content
        result = json.loads(raw)
        required = {"sentiment", "entities", "summary", "urgency", "topics"}
        if not required.issubset(result.keys()):
            raise ValueError(f"Missing keys: {required - set(result.keys())}")
        return result
    except Exception as exc:
        return {"error": str(exc), "raw_input": text[:200]}

if __name__ == "__main__":
    tickets = [
        "I love the new dashboard, but the export button disappears on mobile. Please fix it soon.",
        "URGENT: Our production API key stopped working and all integrations are down. We are losing revenue.",
        "Thanks for the quick turnaround on my billing question. Great support experience.",
    ]
    for ticket in tickets:
        out = analyze_text_safe(ticket)
        print(json.dumps(out, indent=2))
Example output:
{
  "sentiment": "neutral",
  "entities": [
    {"name": "dashboard", "type": "product_feature"},
    {"name": "export button", "type": "ui_element"},
    {"name": "mobile", "type": "platform"}
  ],
  "summary": "User reports missing export button on mobile despite liking the dashboard.",
  "urgency": 3,
  "topics": ["ui_bug", "mobile", "export"]
}
{
  "sentiment": "negative",
  "entities": [
    {"name": "production API key", "type": "credential"},
    {"name": "integrations", "type": "system"}
  ],
  "summary": "Production API key failure has caused complete integration outage.",
  "urgency": 5,
  "topics": ["outage", "api_key", "production"]
}
{
  "sentiment": "positive",
  "entities": [
    {"name": "billing question", "type": "support_topic"}
  ],
  "summary": "User expresses satisfaction with fast billing support response.",
  "urgency": 1,
  "topics": ["billing", "support", "feedback"]
}
Wrap-up and next steps
This agent replaces five separate NLP microservices with a single request to Oxlo.ai. Because the pricing is request-based rather than token-based, my costs stay flat even when users paste hundred-line log fragments into their tickets. For exact plan details, see the Oxlo.ai pricing page. Two concrete next steps: expose analyze_text_safe through a FastAPI endpoint so your CRM can POST tickets directly, or add a streaming loop that feeds results into a SQLite database for analytics. Both take less than fifty lines of code. If you want to experiment with reasoning models, try switching the model ID to deepseek-v3.2 for coding-heavy tickets or kimi-k2.6 for vision-enabled attachments. Both are available on the same endpoint.
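As a starting point for the SQLite idea, here is a minimal sketch using only the standard library. The table name ticket_analysis, its columns, and the function name save_results are assumptions of mine; adjust them to whatever your analytics queries need.

```python
import json
import sqlite3


def save_results(conn: sqlite3.Connection, rows: list[tuple[str, dict]]) -> int:
    """Store (ticket, result) pairs and return how many rows were written.

    Results that carry an "error" key (the shape returned by
    analyze_text_safe on failure) are skipped.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS ticket_analysis (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ticket TEXT NOT NULL,
               sentiment TEXT,
               urgency INTEGER,
               payload TEXT NOT NULL
           )"""
    )
    saved = 0
    for ticket, result in rows:
        if "error" in result:
            continue
        conn.execute(
            "INSERT INTO ticket_analysis (ticket, sentiment, urgency, payload) "
            "VALUES (?, ?, ?, ?)",
            (ticket, result.get("sentiment"), result.get("urgency"), json.dumps(result)),
        )
        saved += 1
    conn.commit()
    return saved


# Example (assumes tickets and results from Step 5):
# conn = sqlite3.connect("tickets.db")
# save_results(conn, list(zip(tickets, results)))
```

Keeping sentiment and urgency as their own columns makes simple aggregate queries cheap, while the full JSON payload stays available for anything the columns do not cover.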

