LLM for Natural Language Processing Tasks: A Comprehensive Guide

We are building a document intelligence agent that turns raw, unstructured text into structured JSON covering sentiment, entities, topics, urgency, and a one-sentence summary. If you process support tickets, user feedback, or news clips, this replaces a patchwork of single-purpose NLP pipelines with one consistent API call.

What you'll need

Python 3.10 or newer
The OpenAI SDK: pip install openai
An Oxlo.ai API key from https://portal.oxlo.ai

Step 1: Set up the Oxlo.ai client

I always start by verifying the endpoint. Because Oxlo.ai is fully OpenAI SDK compatible, I do not need a custom library. I point the base URL to https://api.oxlo.ai/v1, plug in my key, and ask Llama 3.3 70B for a quick sanity check. I like this model as a baseline because it balances speed and instruction adherence for English text. If the sanity check returns a completion, the pipeline is live and I can move on to the prompt.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    # Index os.environ directly so a missing key fails fast with a KeyError.
    api_key=os.environ["OXLO_API_KEY"],
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Reply with 'Connection OK' and nothing else."},
    ],
)

print(response.choices[0].message.content)

Step 2: Define the system prompt

The system prompt is the schema contract. I keep it strict: exact field names, allowed values, and no markdown fences. This lets me parse the output with standard JSON libraries instead of regex. I also include a fallback rule for ambiguous text so the model never invents a sentiment label outside the allowed set. I keep the prompt concise out of habit, though Oxlo.ai does not charge by the token, so I do not need to micro-optimize length for cost reasons.

SYSTEM_PROMPT = """You are an NLP processing engine. Analyze the user's text and return a single JSON object with exactly these keys:

- sentiment: one of [positive, negative, neutral]
- entities: list of objects, each with keys "name" and "type"
- summary: one sentence, maximum 20 words
- urgency: integer from 1 (low) to 5 (critical)
- topics: list of up to 3 descriptive strings

Rules:
1. Do not wrap the JSON in markdown code blocks.
2. Do not add explanation before or after the JSON.
3. If no entities are present, return an empty list for entities.
4. If the text is ambiguous, use your best judgment and set sentiment to neutral."""
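The prompt states the contract, but nothing on the client side enforces it. A minimal sketch of a value-level check follows, assuming the parsed response is a plain dict. The names validate_result and ALLOWED_SENTIMENTS are hypothetical helpers introduced here for illustration and are not part of the Oxlo.ai API or the OpenAI SDK.

```python
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def validate_result(result: dict) -> list[str]:
    """Return a list of problems; an empty list means the result matches the prompt's schema."""
    problems = []

    sentiment = result.get("sentiment")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        problems.append(f"sentiment must be one of {sorted(ALLOWED_SENTIMENTS)}")

    entities = result.get("entities")
    if not isinstance(entities, list) or not all(
        isinstance(e, dict) and {"name", "type"}.issubset(e.keys()) for e in entities
    ):
        problems.append("entities must be a list of objects with 'name' and 'type'")

    summary = result.get("summary")
    if not isinstance(summary, str) or len(summary.split()) > 20:
        problems.append("summary must be a string of at most 20 words")

    urgency = result.get("urgency")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(urgency, int) or isinstance(urgency, bool) or not 1 <= urgency <= 5:
        problems.append("urgency must be an integer from 1 to 5")

    topics = result.get("topics")
    if not isinstance(topics, list) or len(topics) > 3 or not all(
        isinstance(t, str) for t in topics
    ):
        problems.append("topics must be a list of up to 3 strings")

    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a single response.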

Step 3: Build the analysis function

Now I wrap the call in a reusable function. I set response_format to json_object so the model knows I want valid JSON, and I pin the model to llama-3.3-70b because it follows instruction formatting reliably at this size. If I were processing multilingual documents, I would switch to qwen-3-32b without changing any other code. I import json at module level so every function in the script shares the same import.

import json

def analyze_text(text: str) -> dict:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
    )

    raw = response.choices[0].message.content
    return json.loads(raw)

Step 4: Harden for production

Real pipelines fail on network blips or unexpected formatting. I add exception handling around the API call and the JSON parse so one bad document does not kill the batch. I validate that every required key is present before returning the dict. If validation fails, I raise a ValueError that names the missing keys, and the same except block converts it into an error dict, so callers always get a dict back and should check it for an "error" key. I also truncate the raw input to 200 characters in the error payload so I can debug while limiting how much sensitive text ends up in logs. In a real deployment, you might log the full input to an internal observability stack.

def analyze_text_safe(text: str) -> dict:
    try:
        response = client.chat.completions.create(
            model="llama-3.3-70b",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": text},
            ],
            response_format={"type": "json_object"},
        )
        raw = response.choices[0].message.content
        result = json.loads(raw)

        required = {"sentiment", "entities", "summary", "urgency", "topics"}
        if not required.issubset(result.keys()):
            raise ValueError(f"Missing keys: {required - set(result.keys())}")

        return result

    except Exception as exc:
        return {"error": str(exc), "raw_input": text[:200]}
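Catching an exception keeps the batch alive, but a transient network blip is often worth a second attempt. The sketch below shows one common approach, exponential backoff with jitter. The helper name call_with_retries and its parameters are hypothetical, and the default of retrying on any Exception is an assumption you would likely narrow in practice.

```python
import random
import time


def call_with_retries(fn, *args, max_attempts: int = 3, base_delay: float = 1.0,
                      retry_on: tuple = (Exception,), sleep=time.sleep):
    """Call fn(*args), retrying on the given exception types with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args)
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the original error
            # Delay doubles each attempt (1x, 2x, 4x base_delay), plus random jitter
            # so that many workers do not retry at the same instant.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

Because analyze_text_safe swallows exceptions and never raises, a wrapper like this would need to go around analyze_text from Step 3, for example call_with_retries(analyze_text, ticket). The OpenAI SDK also accepts a max_retries argument on the client, which may already be enough for simple cases.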

Step 5: Batch process documents

For my use case, I need to process a list of support tickets. Because Oxlo.ai uses flat per-request pricing, I do not pay extra for long tickets or large system prompts. Stuffing a verbose ticket with a full stack trace into the same request costs the same as a one-line message. This makes the architecture simple: one ticket, one request, one structured result.

I process tickets sequentially here. If you need higher throughput, you can run the calls through a ThreadPoolExecutor, or use asyncio.gather with the SDK's AsyncOpenAI client. Oxlo.ai has no cold starts on popular models, so concurrent requests do not trigger warmup latency. If you need deeper reasoning on complex legal or medical text, swapping the model ID to kimi-k2.6 or deepseek-v3.2 is a single-line change.

tickets = [
    "I love the new dashboard, but the export button disappears on mobile. Please fix it soon.",
    "URGENT: Our production API key stopped working and all integrations are down. We are losing revenue.",
    "Thanks for the quick turnaround on my billing question. Great support experience.",
]

results = []
for ticket in tickets:
    result = analyze_text_safe(ticket)
    results.append(result)
    print(json.dumps(result, indent=2))
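For higher throughput, the sequential loop can be swapped for a thread pool. This is a minimal sketch using the standard library. The function name analyze_batch and the default of four workers are assumptions for illustration, and the analysis function is passed in as a parameter so the helper does not depend on any particular client.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_batch(texts, analyze_fn, max_workers: int = 4) -> list:
    """Run analyze_fn over texts concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which request finishes first.
        return list(pool.map(analyze_fn, texts))


# Hypothetical usage with the function from Step 4:
# results = analyze_batch(tickets, analyze_text_safe)
```

Passing analyze_text_safe works well here because it returns an error dict instead of raising, so one failed ticket does not interrupt the rest of the batch. Check your plan's rate limits before raising max_workers.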

Run it

Putting it all together, here is the complete script and the output I get when I run it against the sample tickets. In my runs, each ticket takes under a second on Llama 3.3 70B with no cold starts. The output below is representative. Your exact JSON may vary slightly in entity names or topic wording, but the keys and value types should match the schema in the system prompt.

import os
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    # Index os.environ directly so a missing key fails fast with a KeyError.
    api_key=os.environ["OXLO_API_KEY"],
)

SYSTEM_PROMPT = """You are an NLP processing engine. Analyze the user's text and return a single JSON object with exactly these keys:

- sentiment: one of [positive, negative, neutral]
- entities: list of objects, each with keys "name" and "type"
- summary: one sentence, maximum 20 words
- urgency: integer from 1 (low) to 5 (critical)
- topics: list of up to 3 descriptive strings

Rules:
1. Do not wrap the JSON in markdown code blocks.
2. Do not add explanation before or after the JSON.
3. If no entities are present, return an empty list for entities.
4. If the text is ambiguous, use your best judgment and set sentiment to neutral."""

def analyze_text_safe(text: str) -> dict:
    try:
        response = client.chat.completions.create(
            model="llama-3.3-70b",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": text},
            ],
            response_format={"type": "json_object"},
        )
        raw = response.choices[0].message.content
        result = json.loads(raw)

        required = {"sentiment", "entities", "summary", "urgency", "topics"}
        if not required.issubset(result.keys()):
            raise ValueError(f"Missing keys: {required - set(result.keys())}")

        return result
    except Exception as exc:
        return {"error": str(exc), "raw_input": text[:200]}

if __name__ == "__main__":
    tickets = [
        "I love the new dashboard, but the export button disappears on mobile. Please fix it soon.",
        "URGENT: Our production API key stopped working and all integrations are down. We are losing revenue.",
        "Thanks for the quick turnaround on my billing question. Great support experience.",
    ]

    for ticket in tickets:
        out = analyze_text_safe(ticket)
        print(json.dumps(out, indent=2))

Example output:

{
  "sentiment": "neutral",
  "entities": [
    {"name": "dashboard", "type": "product_feature"},
    {"name": "export button", "type": "ui_element"},
    {"name": "mobile", "type": "platform"}
  ],
  "summary": "User reports missing export button on mobile despite liking the dashboard.",
  "urgency": 3,
  "topics": ["ui_bug", "mobile", "export"]
}
{
  "sentiment": "negative",
  "entities": [
    {"name": "production API key", "type": "credential"},
    {"name": "integrations", "type": "system"}
  ],
  "summary": "Production API key failure has caused complete integration outage.",
  "urgency": 5,
  "topics": ["outage", "api_key", "production"]
}
{
  "sentiment": "positive",
  "entities": [
    {"name": "billing question", "type": "support_topic"}
  ],
  "summary": "User expresses satisfaction with fast billing support response.",
  "urgency": 1,
  "topics": ["billing", "support", "feedback"]
}

Wrap-up and next steps

This agent replaces five separate NLP microservices with a single request to Oxlo.ai. Because the pricing is request-based rather than token-based, my costs stay flat even when users paste hundred-line log fragments into their tickets. For exact plan details, see the Oxlo.ai pricing page. Two concrete next steps: expose analyze_text_safe through a FastAPI endpoint so your CRM can POST tickets directly, or add a streaming loop that feeds results into a SQLite database for analytics. Both take less than fifty lines of code. If you want to experiment with reasoning models, try switching the model ID to deepseek-v3.2 for coding-heavy tickets or kimi-k2.6 for vision-enabled attachments. Both are available on the same endpoint.
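As a starting point for the SQLite idea, here is a minimal sketch using the standard library's sqlite3 module. The table name ticket_analysis, its columns, and the helper save_results are all hypothetical choices made for this example. It stores the full result as JSON text and copies sentiment and urgency into their own columns so they are easy to query.

```python
import json
import sqlite3


def save_results(conn: sqlite3.Connection, pairs) -> int:
    """Insert (ticket_text, result_dict) pairs and return the number of rows written."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS ticket_analysis (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ticket TEXT NOT NULL,
               sentiment TEXT,
               urgency INTEGER,
               payload TEXT NOT NULL
           )"""
    )
    rows = [
        # Error dicts have no sentiment or urgency, so those columns become NULL.
        (ticket, result.get("sentiment"), result.get("urgency"), json.dumps(result))
        for ticket, result in pairs
    ]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO ticket_analysis (ticket, sentiment, urgency, payload) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
    return len(rows)
```

With a connection from sqlite3.connect("tickets.db") (a hypothetical file name), a call such as save_results(conn, zip(tickets, results)) would persist a batch, and a query like SELECT ticket FROM ticket_analysis WHERE urgency >= 4 would surface the critical ones.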
