
We are going to build a command-line Socratic tutor that reads a student's written explanation of a topic, diagnoses misconceptions, and asks targeted follow-up questions. It is designed for teachers who want to scale written feedback or for students who need a guided self-check before an exam. The entire tool is a single Python script that calls Oxlo.ai through the OpenAI SDK.
What you'll need
We will use Python 3.10 or newer for its improved typing and asyncio support, which you will want if you later scale this to a full class roster. You also need the OpenAI SDK, installed with pip install openai. Oxlo.ai exposes a fully compatible API, so this is the only client library required. Grab an API key from the Oxlo.ai portal at https://portal.oxlo.ai. The free tier includes 60 requests per day across 16 models, including DeepSeek V3.2, which is enough to prototype and test with a small group of students.
When you move beyond the prototype, Oxlo.ai's request-based pricing becomes a major advantage for education workloads. Unlike token-based platforms such as Together AI, Fireworks AI, or OpenRouter, the cost per call is flat regardless of prompt length. That means a ten-word quiz response and a ten-paragraph essay each cost exactly one request. For details on plans, see https://oxlo.ai/pricing.
Step 1: Verify the connection to Oxlo.ai
Before writing any tutor logic, I verify that the client can authenticate and reach the endpoint. There is nothing worse than debugging a prompt when the issue is a stale key or a typo in the base URL. Oxlo.ai exposes a standard OpenAI-compatible endpoint at https://api.oxlo.ai/v1, so the only structural difference from a typical OpenAI script is the base_url argument, along with the Oxlo.ai key and model name. I also like to confirm that Llama 3.3 70B is responding with no cold start, which Oxlo.ai guarantees on its popular models.
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Say 'Connection OK'"}],
)
print(response.choices[0].message.content)
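Hard-coding the key is fine for a first smoke test, but a script that gets shared with colleagues is safer if it reads the key from the environment. Here is a minimal sketch of that idea. The variable name OXLO_API_KEY and the helper load_api_key are assumptions made for illustration, not something Oxlo.ai prescribes.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env_var: str = "OXLO_API_KEY",  # hypothetical variable name, pick your own
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key from the environment, or fail with a readable message."""
    source = os.environ if environ is None else environ
    key = source.get(env_var, "").strip()
    # Reject the placeholder from the snippet above as well as a missing value.
    if not key or key == "YOUR_OXLO_API_KEY":
        raise RuntimeError(f"Set {env_var} to your Oxlo.ai API key before running.")
    return key
```

With this in place, you could pass api_key=load_api_key() to the OpenAI constructor instead of the literal string.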
Step 2: Craft the system prompt
The system prompt is the entire product. It encodes the pedagogy. I want the model to act like a veteran teacher who never gives away the answer but instead spots the exact moment a student's mental model breaks. The prompt below enforces three constraints: identify specific misconceptions, explain the error in one sentence, and ask a guiding question. I also add an escape hatch for fully correct answers so the tool does not force false critiques. Keeping this prompt in a dedicated variable makes it easy for a subject matter expert to edit without reading Python.
SYSTEM_PROMPT = """You are a Socratic tutor evaluating a student's written explanation of a scientific concept.
Your task:
1. Identify up to three specific misconceptions or gaps in the student's explanation.
2. For each issue, write one concise sentence explaining what the student seems to misunderstand.
3. Ask one or two follow-up questions that guide the student to discover the correct understanding on their own.
Rules:
- Do not give the correct answer directly.
- Keep your total response under 200 words.
- Use a supportive tone.
- If the explanation is fully correct, confirm this and ask a deeper extension question.
"""
Step 3: Prepare a sample student explanation
To test the pipeline, we need a realistic student submission. I wrote the paragraph below to mimic a common middle-school misconception about photosynthesis: the idea that plants create energy rather than converting it, and that oxygen is produced primarily for animals. It is short enough to fit in a single API call but rich enough to surface multiple gaps. In a real deployment, you might feed entire lab reports or essay drafts. Because Oxlo.ai uses request-based pricing rather than token-based billing, those longer inputs do not inflate the cost, which matters when student writing is long form and unpredictable in length.
STUDENT_EXPLANATION = """Plants do photosynthesis to make food from sunlight. The leaves suck up sunlight and turn it into glucose, which gives the plant energy. The plant uses this energy to grow and make oxygen for us to breathe. The sunlight is basically the plant's battery."""
Step 4: Build the evaluation function
Next, we wrap the call in a function that accepts raw student text and returns formatted feedback. I set temperature to 0.4 so the tone stays consistent across different submissions, and max_tokens to 512 to leave headroom for detailed questions without truncation. Notice that we do not need to count tokens or truncate the student text to control cost: Oxlo.ai charges one flat fee per request, so we can pass the full essay string directly, as long as it fits in the model's context window. If you are prototyping on a budget, you can change the model to deepseek-v3.2, which is one of the 16 models on Oxlo.ai's free tier. For production, Llama 3.3 70B offers a strong balance of speed and reasoning quality.
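def evaluate_explanation(student_text: str) -> str:
    """Send one student explanation to the model and return the tutor's feedback."""
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Student explanation:\n{student_text}"},
        ],
        temperature=0.4,  # low temperature keeps the tone consistent
        max_tokens=512,  # headroom for the follow-up questions
    )
    # The SDK types message.content as optional, so fall back to an empty string.
    return response.choices[0].message.content or ""


if __name__ == "__main__":
    feedback = evaluate_explanation(STUDENT_EXPLANATION)
    print(feedback)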
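A single network hiccup should not cost a student their feedback, so it can help to retry a failed call a few times before giving up. The wrapper below is a generic sketch of retrying with exponential backoff. The name with_retries and its parameters are hypothetical, and it catches every exception for brevity, whereas in practice you would likely narrow that to the connection and rate-limit errors your client raises. The OpenAI SDK also retries some failures on its own, so treat this as an option for when you want explicit control.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying with exponential backoff if it raises."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries, surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")
```

A call would then look like with_retries(lambda: evaluate_explanation(STUDENT_EXPLANATION)). The sleep parameter is injectable only so the behavior can be tested without real delays.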
Step 5: Batch process a class roster
Teachers rarely have just one essay to review. The function below reads a CSV with two columns, student_id and explanation, and writes each row back out with the feedback in a new column. We sleep for half a second between calls to avoid sending requests in a burst. The pause does not help with the free tier's daily cap, though: at 60 requests per day, a roster of more than 60 students needs a paid plan or has to be spread over several days. Because Oxlo.ai has no cold starts, the loop runs steadily without the lag spikes you sometimes see on serverless inference platforms.
import csv
import time


def process_roster(input_path: str, output_path: str) -> None:
    """Evaluate every row of the input CSV and write it out with a feedback column."""
    with open(input_path, newline="", encoding="utf-8") as f_in, \
         open(output_path, "w", newline="", encoding="utf-8") as f_out:
        reader = csv.DictReader(f_in)
        # fieldnames is None for an empty file, so guard before appending the new column.
        fieldnames = list(reader.fieldnames or []) + ["feedback"]
        writer = csv.DictWriter(f_out, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            row["feedback"] = evaluate_explanation(row["explanation"])
            writer.writerow(row)
            time.sleep(0.5)  # pause between calls
    print(f"Wrote results to {output_path}")


if __name__ == "__main__":
    process_roster("student_explanations.csv", "feedback_results.csv")
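Since every row costs one request, it may be worth validating the roster before the loop starts. The sketch below checks that the two expected columns exist and drops blank submissions. The helper name load_roster is hypothetical, and it reads from any open text handle so it can be tried on an in-memory sample.

```python
import csv
import io
from typing import Dict, List, TextIO

REQUIRED_COLUMNS = {"student_id", "explanation"}


def load_roster(handle: TextIO) -> List[Dict[str, str]]:
    """Read the roster and return only rows that have a non-empty explanation."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"CSV is missing columns: {sorted(missing)}")
    # Skip blank submissions so they do not consume a request.
    return [row for row in reader if (row["explanation"] or "").strip()]


# In-memory sample: the second student submitted nothing.
sample = io.StringIO("student_id,explanation\ns1,Plants eat sunlight.\ns2,\n")
rows = load_roster(sample)
print([row["student_id"] for row in rows])
```

Comparing len(rows) against the 60-request daily cap of the free tier before starting tells you up front whether the whole roster fits in one day.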
Run it
Executing the script with the sample photosynthesis explanation produces the output below. The model correctly flags the misconception about sunlight acting as a battery and challenges the teleological claim that plants make oxygen for humans. Your exact wording may vary slightly between runs, but the structure should remain stable because of the strict system prompt.
1. Misconception: You describe sunlight as a "battery" that the plant stores directly. Sunlight is not stored as light inside the plant.
Question: What actually happens to the light energy after the leaf absorbs it, and what molecule stores the chemical energy that is produced?
2. Misconception: You say the plant makes oxygen "for us to breathe." The plant does not produce oxygen for animals as a purpose.
Question: Why does the plant really release oxygen, and what does that tell you about where the oxygen atoms come from?
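The rules in the system prompt (under 200 words, at most three issues, at least one guiding question) are simple enough to check mechanically on output like this. The function below is a rough sketch of such a check. The name check_feedback is hypothetical, and the heuristics (counting question marks and numbered lines) are assumptions that only hold if the model keeps to the numbered layout shown here.

```python
import re
from typing import List


def check_feedback(feedback: str, max_words: int = 200) -> List[str]:
    """Return a list of rule violations; an empty list means the reply looks fine."""
    problems = []
    word_count = len(feedback.split())
    if word_count >= max_words:
        problems.append(f"too long: {word_count} words")
    if feedback.count("?") == 0:
        problems.append("no follow-up question found")
    # Rough heuristic: more than three numbered items suggests too many issues.
    numbered_items = re.findall(r"^\s*\d+\.", feedback, flags=re.MULTILINE)
    if len(numbered_items) > 3:
        problems.append(f"too many issues listed: {len(numbered_items)}")
    return problems
```

A non-empty result could trigger a retry or flag the row for manual review. It says nothing about whether the pedagogy is sound, only whether the reply has the expected shape.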
Next steps
This script is already useful as a local grading assistant, but its real value appears when you integrate it into a live workflow. Here are two concrete ways to ship it.
First, turn this script into a microservice. You can wrap the evaluate_explanation function in a FastAPI POST endpoint and host it on any small VPS. Point your LMS webhook, a Google Form, or a simple React front end at it. Because Oxlo.ai does not charge by the token, you can accept multi-page lab reports or essay drafts without watching a usage meter climb. For long-context education workloads, that flat per-request model can be significantly cheaper than token-based inference providers such as Together AI, Fireworks AI, OpenRouter, Replicate, or Anyscale. You can explore the exact tiers at https://oxlo.ai/pricing.
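Whatever web framework ends up hosting the endpoint, the request handling itself can be kept as a plain function that the route calls. The sketch below shows one way to do that. The payload field name explanation, the MAX_CHARS cap, and the chosen status codes are all assumptions for illustration, and the evaluator is passed in so the logic can be exercised without a network call.

```python
from typing import Callable, Dict, Tuple

MAX_CHARS = 20_000  # hypothetical cap, tune to your model's context window


def handle_submission(
    payload: Dict[str, object],
    evaluate: Callable[[str], str],
) -> Tuple[int, Dict[str, str]]:
    """Validate a JSON payload and return (HTTP status, response body)."""
    text = payload.get("explanation")
    if not isinstance(text, str) or not text.strip():
        return 400, {"error": "field 'explanation' must be a non-empty string"}
    if len(text) > MAX_CHARS:
        return 413, {"error": f"explanation exceeds {MAX_CHARS} characters"}
    try:
        return 200, {"feedback": evaluate(text)}
    except Exception:
        # Hide upstream details from the client; log them server-side instead.
        return 502, {"error": "tutor model is unavailable"}
```

A FastAPI POST route would then reduce to calling handle_submission(body, evaluate_explanation) and returning the body with the given status code. Even with flat per-request pricing, a length cap is still useful because the model's context window is finite.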
Second, treat model selection as a curriculum choice. If you teach in a multilingual environment, swap the model string to qwen-3-32b for robust reasoning across languages. If your course involves advanced mathematics, computer science, or detailed chain-of-thought analysis, try kimi-k2.6. Every model on Oxlo.ai, from vision to embeddings, shares the same base URL and SDK, so you can A/B test different pedagogical styles with a one-line change and zero infrastructure rework.
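To make that A/B comparison concrete, the sketch below sends one explanation to several models and collects the replies side by side. It assumes a hypothetical callable ask(model, text), for example a variant of evaluate_explanation that takes the model string as a parameter. The model identifiers are the ones named in this article and may not match what is available on your plan.

```python
from typing import Callable, Dict, Iterable

# Model identifiers mentioned in this article; availability may change.
CANDIDATE_MODELS = ["llama-3.3-70b", "deepseek-v3.2", "qwen-3-32b", "kimi-k2.6"]


def compare_models(
    student_text: str,
    ask: Callable[[str, str], str],
    models: Iterable[str] = CANDIDATE_MODELS,
) -> Dict[str, str]:
    """Collect feedback on one explanation from each model, keyed by model name."""
    results = {}
    for model in models:
        try:
            results[model] = ask(model, student_text)
        except Exception as exc:
            # Keep going so one unavailable model does not sink the comparison.
            results[model] = f"[error: {exc}]"
    return results
```

Each model in the list consumes one request, so a four-way comparison of a single essay uses four of the free tier's 60 daily requests.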

