A Practical Guide to Using LLMs in Social Science

In this guide I build a thematic coding assistant for social science survey research. It reads open-ended responses, applies a structured codebook, and returns JSON that you can drop straight into a pandas dataframe or R script. This can cut manual coding time from days to minutes without sacrificing methodological transparency, because the full codebook is visible in the prompt.

What you'll need

  • Python 3.10 or newer
  • pip install openai pandas
  • An Oxlo.ai API key from https://portal.oxlo.ai
  • A CSV of survey responses, or use the sample data I provide below

Unlike token-based providers, Oxlo.ai does not scale cost with input length. A long response plus a detailed codebook costs the same as a short ping, which keeps pilot budgets predictable. If you want to test without spending quota, DeepSeek V3.2 is available on the free tier. See https://oxlo.ai/pricing for full plan details.

Step 1: Configure the Oxlo.ai client

I start by loading the OpenAI SDK and pointing it at Oxlo.ai. A quick ping confirms the endpoint and key are live. Because Oxlo.ai is fully OpenAI-compatible, the only difference from a standard script is the base URL.

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.getenv("OXLO_API_KEY", "YOUR_OXLO_API_KEY"),
)

# Verify connectivity
ping = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Say OK"}],
    max_tokens=10,
)
# Case-insensitive check, since the model may reply "Ok." or "Okay"
assert "ok" in (ping.choices[0].message.content or "").lower()
print("Oxlo.ai client ready")

Step 2: Define the coding schema and system prompt

The system prompt is the entire methodology. I treat it exactly like a codebook I would hand to a human research assistant. It lists valid codes, disambiguation rules, and the required JSON schema. Storing the codebook in version-controlled code rather than a GUI means every methodological tweak is diffable, which makes replication straightforward.

SYSTEM_PROMPT = """You are a qualitative research assistant coding open-ended survey responses.

CODEBOOK:
- ECON: respondent mentions economic factors such as jobs, wages, inflation, or cost of living
- HEALTH: respondent mentions healthcare access, public health, or medical services
- ENV: respondent mentions environment, climate, pollution, or natural resources
- INST: respondent mentions trust or distrust in institutions such as government, media, or corporations
- OTHER: none of the above categories apply

RULES:
1. Assign exactly one primary code.
2. If multiple topics appear, choose the dominant theme.
3. Write a 1-sentence justification memo.
4. Rate confidence from 1 (low) to 5 (high).

OUTPUT FORMAT: Return strict JSON with exactly these keys:
- primary_code: string, one of ECON, HEALTH, ENV, INST, OTHER
- confidence: integer from 1 to 5
- memo: string explaining the choice
- keywords: list of strings, important words from the response
"""

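One way to make codebook changes even easier to diff is to keep the codes in a dictionary and render the prompt from it, so the code list and the allowed values for primary_code cannot drift apart. The following is a minimal sketch of that idea, not part of the original script: the names CODEBOOK and build_system_prompt are hypothetical, and the rendered text is intended to match the prompt layout used in this step.

```python
# Illustrative sketch: CODEBOOK and build_system_prompt are hypothetical
# names, not part of the original script.
CODEBOOK = {
    "ECON": "respondent mentions economic factors such as jobs, wages, inflation, or cost of living",
    "HEALTH": "respondent mentions healthcare access, public health, or medical services",
    "ENV": "respondent mentions environment, climate, pollution, or natural resources",
    "INST": "respondent mentions trust or distrust in institutions such as government, media, or corporations",
    "OTHER": "none of the above categories apply",
}


def build_system_prompt(codebook: dict[str, str]) -> str:
    """Render a codebook dict into the prompt layout used in this step."""
    code_lines = "\n".join(f"- {code}: {desc}" for code, desc in codebook.items())
    valid_codes = ", ".join(codebook)  # dicts preserve insertion order
    return (
        "You are a qualitative research assistant coding open-ended survey responses.\n\n"
        f"CODEBOOK:\n{code_lines}\n\n"
        "RULES:\n"
        "1. Assign exactly one primary code.\n"
        "2. If multiple topics appear, choose the dominant theme.\n"
        "3. Write a 1-sentence justification memo.\n"
        "4. Rate confidence from 1 (low) to 5 (high).\n\n"
        "OUTPUT FORMAT: Return strict JSON with exactly these keys:\n"
        f"- primary_code: string, one of {valid_codes}\n"
        "- confidence: integer from 1 to 5\n"
        "- memo: string explaining the choice\n"
        "- keywords: list of strings, important words from the response\n"
    )


SYSTEM_PROMPT = build_system_prompt(CODEBOOK)
```

With this layout, adding or renaming a code is a one-line change to the dictionary, and the diff shows exactly which definition moved.
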
Step 3: Build the coding function with JSON mode

I wrap the API call in a small function that requests JSON mode. This makes parseable output far more likely on every request, which matters when you are batch-processing hundreds of responses overnight. JSON mode does not enforce the schema from the prompt, though, and a reply cut off at max_tokens can still fail to parse, so keep the error handling in Step 4. A temperature of 0.2 keeps output stable across runs without making it fully deterministic, and JSON mode removes the need for fragile regex parsing. I use Llama 3.3 70B because it follows structured instructions reliably, but Oxlo.ai also hosts Qwen 3 32B and Kimi K2.6 if you need multilingual or advanced reasoning capabilities later.

import json

def code_response(response_text: str) -> dict:
    """Send a single survey response to Oxlo.ai for thematic coding."""
    resp = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Survey response:\n{response_text}"},
        ],
        response_format={"type": "json_object"},
        temperature=0.2,
        max_tokens=512,
    )
    raw = resp.choices[0].message.content
    return json.loads(raw)
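
Because JSON mode only promises syntactically valid JSON, it can help to check each parsed record against the schema in the system prompt before it reaches the dataframe. Below is a minimal sketch of such a check. The names VALID_CODES and validate_coding are hypothetical additions, and the rules simply restate the OUTPUT FORMAT section of the prompt.

```python
# Illustrative sketch: VALID_CODES and validate_coding are hypothetical
# names, not part of the original script.
VALID_CODES = {"ECON", "HEALTH", "ENV", "INST", "OTHER"}


def validate_coding(result: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the record is usable."""
    problems = []

    code = result.get("primary_code")
    if not isinstance(code, str) or code not in VALID_CODES:
        problems.append(f"unexpected primary_code: {code!r}")

    confidence = result.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (
        isinstance(confidence, bool)
        or not isinstance(confidence, int)
        or not 1 <= confidence <= 5
    ):
        problems.append(f"confidence is not an integer from 1 to 5: {confidence!r}")

    memo = result.get("memo")
    if not isinstance(memo, str) or not memo.strip():
        problems.append("memo is missing or empty")

    keywords = result.get("keywords")
    if not isinstance(keywords, list) or not all(isinstance(k, str) for k in keywords):
        problems.append("keywords is not a list of strings")

    return problems
```

A record that returns a non-empty list can be logged and sent to human review rather than silently entering the frequency table.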

Step 4: Batch process the dataset

With the single-response function working, I loop over the responses and collect the results into a dataframe. I keep the sample data inline so you can run this immediately without hunting for files. The half-second sleep is conservative. In production you can tune it or add exponential backoff if you hit rate limits, though Oxlo.ai has no cold starts so the queue clears quickly. Failed rows are printed and skipped, so check the log before trusting the counts.

import pandas as pd
import time

sample_data = [
    {"id": "R001", "text": "I am worried about rising rent and grocery bills every month."},
    {"id": "R002", "text": "The local hospital wait times are unacceptable and nurses are overworked."},
    {"id": "R003", "text": "I do not trust the city council to handle the flooding issues near the river."},
    {"id": "R004", "text": "My main concern is air quality since the factory expansion started."},
]

records = []
for row in sample_data:
    try:
        result = code_response(row["text"])
        result["response_id"] = row["id"]
        records.append(result)
        time.sleep(0.5)
    except Exception as exc:
        print(f"Failed on {row['id']}: {exc}")

df = pd.DataFrame(records)
print(df[["response_id", "primary_code", "confidence", "memo"]])
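
If rate limits do become a problem, the fixed sleep can be replaced with exponential backoff. The sketch below is one possible shape, under stated assumptions: with_backoff is a hypothetical helper, the retry counts and delays are arbitrary starting values, and it retries on any exception for simplicity. In real use you would probably narrow the except clause to the rate-limit error raised by your SDK version.

```python
import random
import time


# Illustrative sketch: with_backoff is a hypothetical helper, not part of
# the original script. The sleep parameter exists so tests can avoid waiting.
def with_backoff(func, *args, max_attempts=4, base_delay=0.5, sleep=time.sleep, **kwargs):
    """Call func, retrying with exponential backoff plus jitter on any exception."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == max_attempts:
                raise  # out of retries: surface the last error to the caller
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            # Jitter spreads out retries when several workers fail together.
            sleep(delay + random.uniform(0, delay / 2))
```

Inside the batch loop, the call would then read result = with_backoff(code_response, row["text"]), with the existing try/except still catching rows that fail after the final attempt.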

Step 5: Aggregate and audit the results

Finally, I tabulate the codes and flag any low-confidence assignments for human review. This mirrors a standard qualitative audit, with the flagging step automated. Export the audit subset to CSV, review the memos, and refine the system prompt if you notice a systematic bias.

# Frequency table
print(df["primary_code"].value_counts())

# Audit queue: anything scored 2 or lower
audit = df[df["confidence"] <= 2]
print(f"\nAudit queue: {len(audit)} item(s)")
if not audit.empty:
    print(audit[["response_id", "primary_code", "memo", "confidence"]])
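
The export step can be sketched without pandas, working directly on the records list from Step 4. This is a hedged example rather than the original method: build_audit_queue and write_audit_csv are hypothetical names, and the sketch assumes that a confidence value which cannot be read as a number should always go to review, which also covers a model that returns "2" as a string.

```python
import csv


# Illustrative sketch: both function names are hypothetical, not part of
# the original script.
def build_audit_queue(records, threshold=2):
    """Return records whose confidence is at or below threshold, or unreadable."""
    queue = []
    for rec in records:
        try:
            confidence = int(rec.get("confidence"))
        except (TypeError, ValueError):
            confidence = None  # unreadable confidence always goes to review
        if confidence is None or confidence <= threshold:
            queue.append(rec)
    return queue


def write_audit_csv(queue, handle):
    """Write the audit queue to an open text file handle as CSV."""
    fields = ["response_id", "primary_code", "confidence", "memo"]
    # extrasaction="ignore" drops keys such as keywords that are not exported.
    writer = csv.DictWriter(handle, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(queue)
```

To write a file, open it with open("audit_queue.csv", "w", newline="", encoding="utf-8") and pass the handle to write_audit_csv. The file name is only a placeholder.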

Run it

Here is the complete entrypoint. I execute the pipeline against the four synthetic responses and print the structured output.

if __name__ == "__main__":
    for row in sample_data:
        coded = code_response(row["text"])
        print(f"{row['id']} -> {coded['primary_code']} (confidence {coded['confidence']})")
        print(f"   Memo: {coded['memo']}")
        print(f"   Keywords: {', '.join(coded['keywords'])}")

Example output (memos, keywords, and confidence scores may differ between runs):

R001 -> ECON (confidence 5)
   Memo: Respondent explicitly cites rent and grocery bills as monthly economic worries.
   Keywords: rent, grocery bills, worried
R002 -> HEALTH (confidence 5)
   Memo: Focuses on hospital wait times and nursing workload.
   Keywords: hospital, wait times, nurses, overworked
R003 -> INST (confidence 4)
   Memo: Expresses distrust in city council regarding infrastructure management.
   Keywords: city council, trust, flooding, river
R004 -> ENV (confidence 5)
   Memo: Identifies air quality decline linked to industrial expansion.
   Keywords: air quality, factory expansion, concern

Wrap-up and next steps

This pipeline gives you a reproducible, version-controlled coder that runs on Oxlo.ai's flat per-request pricing. Because cost does not scale with prompt length, you can stuff long responses and detailed codebooks into every call without surprise bills.

Two concrete moves from here. First, run an inter-coder reliability pass by swapping the model to Qwen 3 32B or DeepSeek V3.2 and computing Cohen's kappa across the two automated coders. Second, scale up by pointing the script at a real CSV and upgrading to Oxlo.ai's Pro or Premium tiers for higher daily request limits. Both paths keep the same API shape, so the only change is the model string or plan level.
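
For the reliability pass, Cohen's kappa compares observed agreement between two coders with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it in plain Python from two lists of primary codes aligned by response. The function name cohens_kappa and the demo lists are illustrative assumptions, not output from any model.

```python
from collections import Counter


# Illustrative sketch: cohens_kappa is a hypothetical helper, and the two
# lists below are made-up codes, not real model output.
def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(codes_a)
    # Observed agreement: share of items where both coders chose the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of each coder's marginal proportion per code.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


coder_a = ["ECON", "ECON", "HEALTH", "ENV", "INST", "OTHER"]
coder_b = ["ECON", "HEALTH", "HEALTH", "ENV", "INST", "INST"]
kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.3f}")
```

In practice the two lists would come from running the same responses through two model strings and sorting both result sets by response_id before comparing.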
