
Today we are building a support ticket classifier that categorizes incoming messages into priority levels using few-shot prompting. No fine-tuning, no labeled dataset, just a handful of examples embedded in the system prompt. If you run a support queue and need something running in the next hour, this is for you.
What you'll need
You need Python 3.10 or newer, the OpenAI SDK, and an API key from Oxlo.ai. Because Oxlo.ai charges a flat rate per request instead of per token, long system prompts filled with examples do not increase your cost. That makes few-shot learning economically viable in production, especially when you want twenty examples instead of two.
- Python 3.10+
- The OpenAI SDK: pip install openai
- An Oxlo.ai API key from https://portal.oxlo.ai
Step 1: Set up the Oxlo.ai client
The OpenAI SDK works as a drop-in client for Oxlo.ai. I instantiate it once with the Oxlo.ai base URL and my API key. I keep this at module level so the classifier function can reuse the same connection. On Oxlo.ai there are no cold starts on popular models, so the first request after import returns just as fast as the tenth.
from openai import OpenAI
client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")
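Hard-coding the key is fine for a demo, but a script that ends up in version control usually reads it from the environment instead. A minimal sketch, assuming an environment variable named OXLO_API_KEY (a name chosen here for illustration, not one the SDK or Oxlo.ai defines):

```python
import os


def load_api_key(env_var: str = "OXLO_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is missing.

    OXLO_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage with the client from this step (uncomment in your script):
# client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=load_api_key())
```

Failing at startup with a clear message tends to be easier to debug than an authentication error on the first request.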
Step 2: Curate the examples
Few-shot learning lives or dies by the examples you choose. I picked four real support scenarios that cover the full severity spectrum, from a total outage to a password-reset question. Each example includes the raw ticket text, the correct priority label, and a short reasoning sentence. The reasoning helps the model generalize to edge cases because it exposes the logic, not just the label.
I deliberately kept the examples short. Long examples eat prompt space and can confuse the model. I also avoided ambiguous tickets that could reasonably be two different priorities. If your domain has nuance, add more examples, but start with four. Remember, on Oxlo.ai the cost is per request, so expanding the system prompt with ten more examples does not change your unit economics the way it would on a token-based provider.
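One way to keep the examples easy to edit is to hold them as data and render them into prompt text. The sketch below is an assumption about how you might organize this, not how the prompt in Step 3 was produced; EXAMPLES and render_examples are hypothetical names, and only two of the four examples are shown to keep it short.

```python
# Hypothetical structure: each example carries the ticket, label, and reasoning.
EXAMPLES = [
    {
        "ticket": "Our API returns 500 errors for all users since 9 AM. Revenue dashboard is down.",
        "priority": "P1",
        "reasoning": "Complete production outage affecting all users and business metrics.",
    },
    {
        "ticket": "How do I reset my password?",
        "priority": "P4",
        "reasoning": "General usage question covered by existing documentation.",
    },
]


def render_examples(examples: list[dict]) -> str:
    """Format examples as Ticket / Priority / Reasoning blocks, one per example."""
    blocks = []
    for ex in examples:
        blocks.append(
            f'Ticket: "{ex["ticket"]}"\n'
            f'Priority: {ex["priority"]}\n'
            f'Reasoning: {ex["reasoning"]}'
        )
    # Blank line between blocks so each example is visually separate.
    return "\n\n".join(blocks)


print(render_examples(EXAMPLES))
```

Keeping the examples in a list makes it straightforward to append tickets from your own archive later without hand-editing a long string.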
Step 3: Design the system prompt
This prompt is the entire classifier. It contains the instructions, the priority definitions, the four curated examples, and a strict JSON output schema. I place everything in the system message so it acts as persistent context for every incoming ticket. You can edit the examples below to match your own product terminology.
SYSTEM_PROMPT = '''You are a support ticket classifier. Read the ticket and output valid JSON with exactly two keys: priority and reasoning.
Priority definitions:
- P1: Production outage, security incident, or confirmed data loss.
- P2: Bug blocking workflow, or severe performance degradation.
- P3: Feature request, or minor bug with a documented workaround.
- P4: General question, documentation request, or account inquiry.
Examples:
Ticket: "Our API returns 500 errors for all users since 9 AM. Revenue dashboard is down."
Priority: P1
Reasoning: Complete production outage affecting all users and business metrics.
Ticket: "The export button is grayed out when using Firefox. Works in Chrome."
Priority: P2
Reasoning: Blocking workflow for a subset of users with no immediate workaround.
Ticket: "Can you add dark mode to the admin panel?"
Priority: P3
Reasoning: New feature request with no impact on current functionality.
Ticket: "How do I reset my password?"
Priority: P4
Reasoning: General usage question covered by existing documentation.
Output ONLY valid JSON in this exact format:
{"priority": "P?", "reasoning": "..."}'''
Step 4: Build the classifier function
I wrap the API call in a small function called classify_ticket. It takes raw ticket text, prepends the word "Ticket:" to match the few-shot format, and sends everything to Llama 3.3 70B on Oxlo.ai. Llama 3.3 70B follows structured instructions well, but you can swap the model string to qwen-3-32b for multilingual queues, kimi-k2.6 for deeper reasoning, or deepseek-v3.2 if you want to experiment on a free tier. The rest of the code stays identical.
I strip markdown fences because some models occasionally wrap JSON in triple backticks. A simple line-based cleaner handles it without importing heavy dependencies.
import json

def classify_ticket(ticket_text: str) -> dict:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Ticket: {ticket_text}"},
        ],
    )
    raw = response.choices[0].message.content.strip()
    # Strip markdown code fences if the model adds them
    if raw.startswith("```"):
        lines = raw.splitlines()
        if lines[0].startswith("```"):
            lines = lines[1:]
        if lines and lines[-1].startswith("```"):
            lines = lines[:-1]
        raw = "\n".join(lines).strip()
    return json.loads(raw)
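json.loads only guarantees that the reply is JSON, not that it follows the schema. A small validation step, sketched here under the assumption that you want to reject anything outside P1 to P4, can catch malformed replies before they reach your queue. validate_label is a hypothetical helper, not part of the function above.

```python
import json

# Tuple rather than set so that an unhashable value (e.g. a list) fails cleanly.
VALID_PRIORITIES = ("P1", "P2", "P3", "P4")


def validate_label(raw_json: str) -> dict:
    """Parse a model reply and check it matches the two-key schema."""
    label = json.loads(raw_json)
    if not isinstance(label, dict):
        raise ValueError("Expected a JSON object")
    if set(label) != {"priority", "reasoning"}:
        raise ValueError(f"Unexpected keys: {sorted(label)}")
    if label["priority"] not in VALID_PRIORITIES:
        raise ValueError(f"Unknown priority: {label['priority']!r}")
    return label


print(validate_label('{"priority": "P2", "reasoning": "Blocks a workflow."}'))
```

In classify_ticket you could return validate_label(raw) instead of json.loads(raw), so a reply like "P5" raises instead of silently entering your data.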
Step 5: Process a batch of tickets
In production this function hooks into a webhook, email listener, or ticketing API. For the demo I use a Python list of strings. I loop through them, classify each one, and collect the results. Because the few-shot examples live in the system prompt, every call carries that context automatically. On Oxlo.ai this is ideal because request-based pricing means your cost per classification is predictable whether the system prompt is two hundred tokens or two thousand.
tickets = [
    "All database writes are failing. Customers cannot check out and we are losing orders.",
    "The new reporting page loads in 12 seconds. It used to load in under 2 seconds.",
    "Do you offer annual billing instead of monthly?",
    "Would be great if CSV exports included a timestamp column by default.",
    "Unauthorized users can view admin settings by modifying the URL parameter.",
]
results = []
for t in tickets:
    try:
        label = classify_ticket(t)
        results.append(label)
        print(f"Ticket: {t[:55]}... -> {label}")
    except Exception as e:
        print(f"Failed on ticket: {e}")
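The loop above gives up on a ticket after one failure. For transient errors such as a timeout or an unparseable reply, a retry with a short backoff is often enough. This is a generic sketch: with_retries is a hypothetical helper, and the attempt count and delays are arbitrary assumptions to tune for your queue.

```python
import time
from typing import Callable


def with_retries(
    fn: Callable[[str], dict],
    ticket_text: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fn(ticket_text), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn(ticket_text)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

In the loop, label = with_retries(classify_ticket, t) would replace the direct call. The sleep parameter exists only so the helper can be tested without waiting.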
Run it
Running the script produces structured output immediately. Here is what I see:
Ticket: All database writes are failing. Customers cannot check... -> {'priority': 'P1', 'reasoning': 'Production outage preventing core functionality and revenue.'}
Ticket: The new reporting page loads in 12 seconds. It used to ... -> {'priority': 'P2', 'reasoning': 'Severe performance degradation blocking normal workflow.'}
Ticket: Do you offer annual billing instead of monthly?... -> {'priority': 'P4', 'reasoning': 'General account and billing inquiry.'}
Ticket: Would be great if CSV exports included a timestamp colu... -> {'priority': 'P3', 'reasoning': 'Feature request for enhanced export functionality.'}
Ticket: Unauthorized users can view admin settings by modifying... -> {'priority': 'P1', 'reasoning': 'Security incident exposing restricted admin functionality.'}
The model correctly escalated the security ticket to P1 even though none of the few-shot examples mentioned URL parameter tampering. It generalized from the P1 definition. If you need even deeper chain-of-thought reasoning, switch the model to kimi-k2.6 or deepseek-r1-671b on Oxlo.ai without touching the prompt.
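Since only the model string changes between these options, it can help to build the request arguments in one place. A minimal sketch, assuming the model strings named in this tutorial; the profile keys and build_request are hypothetical names introduced for illustration.

```python
# Model strings as named in this tutorial; the keys are hypothetical labels.
MODELS = {
    "default": "llama-3.3-70b",
    "multilingual": "qwen-3-32b",
    "reasoning": "kimi-k2.6",
}


def build_request(ticket_text: str, system_prompt: str, profile: str = "default") -> dict:
    """Return keyword arguments for client.chat.completions.create."""
    return {
        "model": MODELS[profile],
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": f"Ticket: {ticket_text}"},
        ],
    }


request = build_request("Login page is blank.", "You are a classifier.", profile="reasoning")
print(request["model"])
```

With the client from Step 1, the call would then look like client.chat.completions.create(**build_request(t, SYSTEM_PROMPT, "reasoning")).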
Wrap-up
You now have a working few-shot classifier running on Oxlo.ai with no training pipeline and no token-based cost surprises. Two concrete next steps: integrate the classify_ticket function into a Slack incoming webhook so P1 alerts auto-post to an on-call channel, or expand the system prompt with historical examples from your own ticket archive. If you want to see how the flat per-request pricing scales for agentic or high-volume classification workloads, check https://oxlo.ai/pricing.

