Building a Language Translation Tool with LLMs: A Step-by-Step Guide

I recently needed to translate a batch of Markdown documentation for an internal CLI tool. Instead of copying text into a browser tab, I built a small Python script that calls an LLM through Oxlo.ai and writes translated files locally. This guide walks through that exact script, from a single API call to a batch-processing tool you can point at a folder.

What You'll Need

You need three things to follow along.

Python 3.10 or newer.
The OpenAI Python SDK. Install it with pip install openai.
An Oxlo.ai API key. Create one at https://portal.oxlo.ai.

Oxlo.ai exposes a fully OpenAI-compatible API, so the official SDK is all you need. One note on cost: Oxlo.ai charges a flat rate per API request, not per token, so translating a long document costs the same as translating a short one. For batch jobs with big files, that predictability matters. You can see plan details at https://oxlo.ai/pricing.

Step 1: Configure the Client

I start with a smoke test to verify the API key, base URL, and model name before I add any logic. I use qwen-3-32b because Oxlo.ai documents it as a strong multilingual model for reasoning and agent workflows. Oxlo.ai carries over 45 models, but I reach for this one because it handles nuance across high-resource and low-resource languages without the overhead of calling a 70B+ parameter model. If the key or endpoint is wrong, I want to know immediately from a one-line request rather than after I have written a full file parser.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

response = client.chat.completions.create(
    model="qwen-3-32b",
    messages=[
        {"role": "user", "content": "Translate 'Hello, world' to Spanish."},
    ],
)

print(response.choices[0].message.content)

Run this script. If you see a Spanish translation and no errors, the plumbing is solid.
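
Hardcoding the key is fine for a smoke test, but it is easy to commit by accident. Here is a minimal sketch of reading it from the environment instead. The variable name OXLO_API_KEY and the helper load_api_key are my own choices for illustration, not something Oxlo.ai prescribes.

```python
import os


def load_api_key(env=None, var_name: str = "OXLO_API_KEY") -> str:
    """Return the API key from the environment, failing early if it is missing.

    OXLO_API_KEY is a hypothetical variable name chosen for this guide.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage with the client from this step (requires the openai package):
# client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key=load_api_key())
```

Failing at startup with a clear message is friendlier than an authentication error halfway through a batch.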

Step 2: Write the System Prompt

Raw LLM outputs vary. To get consistent, bare translations without explanations, I isolate the system prompt in its own constant. This separation of concerns means I can tune tone and formatting rules without touching the Python logic. I explicitly forbid explanations because many LLMs default to adding a note like "Here is the translation:" which breaks automation pipelines. I also keep the instructions short and imperative, which the model on Oxlo.ai follows reliably.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are a professional translator.
- Translate the user's text into the requested target language.
- Preserve all Markdown formatting, code blocks, and front matter exactly.
- Do not add explanations, notes, or preambles.
- If the source language is marked 'auto', detect it automatically."""

response = client.chat.completions.create(
    model="qwen-3-32b",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Translate to French: The quick brown fox jumps over the lazy dog."},
    ],
)

print(response.choices[0].message.content)

The output should be pure French text with no surrounding chat.
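
Even with a strict prompt, it can be worth checking the output before trusting it in a pipeline. The sketch below flags responses that open with chat-style filler. The phrase list and the name has_preamble are illustrative assumptions, and you would likely extend the list for your own model and languages.

```python
# Hypothetical phrase list: extend it with whatever your model tends to emit.
PREAMBLE_MARKERS = (
    "here is the translation",
    "here's the translation",
    "sure,",
    "translation:",
)


def has_preamble(output: str) -> bool:
    """Return True if the first non-empty line looks like chat filler."""
    for line in output.splitlines():
        if line.strip():
            first = line.strip().lower()
            return any(first.startswith(marker) for marker in PREAMBLE_MARKERS)
    return False  # empty output has no preamble


print(has_preamble("Here is the translation:\nLe renard brun..."))  # True
print(has_preamble("Le renard brun rapide saute par-dessus le chien paresseux."))  # False
```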

Step 3: Create the Translation Function

Hardcoded strings are not reusable. I wrap the call in a function that accepts the text, a target language, and an optional source language, with type hints so my editor catches mistakes. An f-string builds the user message, so the prompt structure stays consistent no matter what content I pass in. I call .strip() on the return value to remove any leading or trailing whitespace the model might add, which keeps diffs clean when I commit the translations to git.

from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are a professional translator.
- Translate the user's text into the requested target language.
- Preserve all Markdown formatting, code blocks, and front matter exactly.
- Do not add explanations, notes, or preambles.
- If the source language is marked 'auto', detect it automatically."""

def translate_text(text: str, target_lang: str, source_lang: str = "auto") -> str:
    user_message = f"Source language: {source_lang}\nTarget language: {target_lang}\n\n{text}"
    response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    sample = "Upload the file to the /tmp directory."
    result = translate_text(sample, target_lang="German")
    print(result)

The function returns a clean string. I run this to confirm German output with no extra commentary.
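
If you want the message format to be testable without calling the API, one option is to pull it into its own function. This is only a sketch, and build_user_message is a name I am introducing here, not part of the script above.

```python
def build_user_message(text: str, target_lang: str, source_lang: str = "auto") -> str:
    """Format the user message the same way translate_text does."""
    return f"Source language: {source_lang}\nTarget language: {target_lang}\n\n{text}"


message = build_user_message("Upload the file to the /tmp directory.", "German")
print(message)
# Source language: auto
# Target language: German
#
# Upload the file to the /tmp directory.
```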

Step 4: Add File and CLI Support

Now I add file I/O and argparse so I can run the tool from the terminal against real documents. I read UTF-8 text explicitly to avoid Windows encoding issues, call translate_text, and write the result to a new file. Using argparse instead of hardcoded paths means I can wire this script into a Makefile or CI job later without editing source code. The output filename defaults to the original stem plus the lowercased target language name, so readme.md becomes readme.spanish.md when I pass --target Spanish.

import argparse
from pathlib import Path
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are a professional translator.
- Translate the user's text into the requested target language.
- Preserve all Markdown formatting, code blocks, and front matter exactly.
- Do not add explanations, notes, or preambles.
- If the source language is marked 'auto', detect it automatically."""

def translate_text(text: str, target_lang: str, source_lang: str = "auto") -> str:
    user_message = f"Source language: {source_lang}\nTarget language: {target_lang}\n\n{text}"
    response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content.strip()

def main():
    parser = argparse.ArgumentParser(description="Translate a file using Oxlo.ai")
    parser.add_argument("input", type=Path, help="Path to input file")
    parser.add_argument("--target", required=True, help="Target language, e.g. Spanish")
    parser.add_argument("--source", default="auto", help="Source language (default: auto)")
    parser.add_argument("--output", type=Path, help="Output path (default: INPUT.LANG.md)")
    args = parser.parse_args()

    text = args.input.read_text(encoding="utf-8")
    translated = translate_text(text, args.target, args.source)

    out_path = args.output or args.input.with_suffix(f".{args.target.lower()}.md")
    out_path.write_text(translated, encoding="utf-8")
    print(f"Wrote translation to {out_path}")

if __name__ == "__main__":
    main()

I can now run python translate.py readme.md --target Japanese and get readme.japanese.md. The script sends each file in a single request, so it works for any document that fits within the model's context window and output limit. Because Oxlo.ai pricing is per request, a long README costs the same as a short one.
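
The default output name comes from Path.with_suffix, which replaces only the final suffix of the path. Here is a quick sketch of what that produces. The helper default_output_path is hypothetical and simply mirrors the expression used in main().

```python
from pathlib import Path


def default_output_path(input_path: Path, target_lang: str) -> Path:
    """Mirror the script's default: swap the last suffix for .<language>.md."""
    return input_path.with_suffix(f".{target_lang.lower()}.md")


print(default_output_path(Path("readme.md"), "Japanese"))    # readme.japanese.md
print(default_output_path(Path("guide.v2.md"), "Spanish"))   # guide.v2.spanish.md
```

A target such as "Brazilian Portuguese" would put a space in the filename, so you may prefer to pass --output explicitly in that case.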

Step 5: Batch Process a Directory

For my docs folder, I want to point the tool at a directory and generate translated copies for every Markdown file. I use pathlib.Path.rglob to walk the tree recursively, skip files that already look like translations, and mirror the directory structure in an output folder. This keeps the translated site structure identical to the source, which matters if you are feeding the results into a static site generator. Because Oxlo.ai bills per request rather than per token, translating a 2,000-word architecture guide costs the same as translating a one-line title. That flat pricing makes batch jobs easy to budget.

import argparse
from pathlib import Path
from openai import OpenAI

client = OpenAI(base_url="https://api.oxlo.ai/v1", api_key="YOUR_OXLO_API_KEY")

SYSTEM_PROMPT = """You are a professional translator.
- Translate the user's text into the requested target language.
- Preserve all Markdown formatting, code blocks, and front matter exactly.
- Do not add explanations, notes, or preambles.
- If the source language is marked 'auto', detect it automatically."""

def translate_text(text: str, target_lang: str, source_lang: str = "auto") -> str:
    user_message = f"Source language: {source_lang}\nTarget language: {target_lang}\n\n{text}"
    response = client.chat.completions.create(
        model="qwen-3-32b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content.strip()

def main():
    parser = argparse.ArgumentParser(description="Batch translate Markdown files")
    parser.add_argument("input_dir", type=Path, help="Input directory")
    parser.add_argument("--target", required=True, help="Target language")
    parser.add_argument("--source", default="auto", help="Source language")
    parser.add_argument("--output-dir", type=Path, default=Path("translated"), help="Output directory")
    args = parser.parse_args()

    args.output_dir.mkdir(parents=True, exist_ok=True)
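    # Skip files that already look like translations: common language-code
    # suffixes plus the suffix this script writes (for example ".spanish").
    skip_suffixes = [".es", ".fr", ".de", ".ja", ".zh", ".pt", ".it", ".ko", f".{args.target.lower()}"]
    out_root = args.output_dir.resolve()

    # sorted() snapshots the file list before any output is written.
    for src_file in sorted(args.input_dir.rglob("*.md")):
        if any(src_file.name.endswith(s + ".md") for s in skip_suffixes):
            continue
        # Skip earlier output when the output directory sits inside the input tree.
        if out_root in src_file.resolve().parents:
            continue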

        text = src_file.read_text(encoding="utf-8")
        translated = translate_text(text, args.target, args.source)

        rel = src_file.relative_to(args.input_dir)
        out_file = args.output_dir / rel.with_suffix(f".{args.target.lower()}.md")
        out_file.parent.mkdir(parents=True, exist_ok=True)
        out_file.write_text(translated, encoding="utf-8")
        print(f"Translated {src_file} -> {out_file}")

if __name__ == "__main__":
    main()

This walks every .md file recursively. If you have nested folders under docs/, the translated tree under translated/ keeps the same layout.
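
Before spending requests on a large tree, it can help to preview which files would be translated and where they would land. The sketch below does a dry run against a temporary directory. The helper plan_outputs is hypothetical: it reuses the same relative_to and with_suffix logic as the loop above and makes no API calls.

```python
import tempfile
from pathlib import Path


def plan_outputs(input_dir: Path, output_dir: Path, target_lang: str) -> list[tuple[Path, Path]]:
    """Return (source, destination) pairs without translating anything."""
    suffix = f".{target_lang.lower()}.md"
    pairs = []
    for src_file in sorted(input_dir.rglob("*.md")):
        if src_file.name.endswith(suffix):
            continue  # already a translation into this language
        rel = src_file.relative_to(input_dir)
        pairs.append((src_file, output_dir / rel.with_suffix(suffix)))
    return pairs


with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    (docs / "guides").mkdir(parents=True)
    (docs / "index.md").write_text("# Home", encoding="utf-8")
    (docs / "guides" / "setup.md").write_text("# Setup", encoding="utf-8")
    (docs / "guides" / "setup.spanish.md").write_text("# Configuración", encoding="utf-8")

    # Prints one "source -> destination" line per file that would be translated.
    for src, dst in plan_outputs(docs, Path(tmp) / "translated", "Spanish"):
        print(src.relative_to(tmp), "->", dst.relative_to(tmp))
```

The number of pairs is also the number of requests the batch would make, which is handy for budgeting under per-request pricing.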

Run It

Here is a real run against a small file named quickstart.md. I pass Spanish as the target and inspect the output immediately.

Input file content:

# Quick Start

Install the package with pip:

    pip install mytool

Then run the init command:

    mytool init --name demo

Terminal command:

python translate.py quickstart.md --target Spanish --output quickstart.es.md

Output written to quickstart.es.md:

# Inicio rápido

Instala el paquete con pip:

    pip install mytool

Luego ejecuta el comando init:

    mytool init --name demo

Notice that the indented command blocks are preserved exactly. The model only touched the natural language, which is the behavior the system prompt asks for.
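
Eyeballing the result works for one file, but for a batch you may want an automated check. Here is a rough sketch that compares indented code lines between the source and the translation. The helpers code_lines and code_preserved are hypothetical, and the four-space rule only covers indented blocks like the ones in this example, not fenced blocks or nested list content.

```python
def code_lines(markdown: str) -> list[str]:
    """Collect non-blank lines indented by four or more spaces."""
    return [line for line in markdown.splitlines() if line.startswith("    ") and line.strip()]


def code_preserved(source: str, translated: str) -> bool:
    """True when both documents contain the same code lines in the same order."""
    return code_lines(source) == code_lines(translated)


source = "Install the package with pip:\n\n    pip install mytool\n"
good = "Instala el paquete con pip:\n\n    pip install mytool\n"
bad = "Instala el paquete con pip:\n\n    pip instalar mytool\n"

print(code_preserved(source, good))  # True
print(code_preserved(source, bad))   # False
```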

Next Steps

The script works, but three additions make it more robust. First, inject a domain glossary into the system prompt for specialized terms. If your product uses the word "workspace" in a specific technical sense, add a line that says - Always translate "workspace" as "espacio de trabajo" so the model stays consistent across dozens of files. Second, parallelize the loop with concurrent.futures.ThreadPoolExecutor. Oxlo.ai has no cold starts on popular models, so ten concurrent requests start work immediately instead of waiting for model spin-up. Third, add a simple retry loop with exponential backoff around the client call. Oxlo.ai serves requests reliably, but network blips happen and a retry keeps a large batch job from failing halfway through.
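
Those ideas can be sketched together as follows. Everything here is illustrative: with_glossary, with_retries, and translate_many are names I made up, and fake_translate stands in for translate_text so the example runs without network access. In real code you would likely catch only the SDK's connection and rate-limit errors rather than every exception.

```python
import time
from concurrent.futures import ThreadPoolExecutor

BASE_PROMPT = "You are a professional translator."


def with_glossary(base_prompt: str, glossary: dict[str, str]) -> str:
    """Append one 'Always translate' rule per glossary entry."""
    rules = [f'- Always translate "{src}" as "{dst}".' for src, dst in glossary.items()]
    return "\n".join([base_prompt, *rules])


def with_retries(func, *args, attempts: int = 4, base_delay: float = 1.0):
    """Call func(*args); after each failure wait base_delay * 1, 2, 4, ... seconds."""
    for attempt in range(attempts):
        try:
            return func(*args)
        except Exception:  # assumption: narrow this to transient errors in real use
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)


def translate_many(texts: list[str], translate, max_workers: int = 10) -> list[str]:
    """Translate texts concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda t: with_retries(translate, t), texts))


# Stand-in for translate_text so the sketch runs offline.
def fake_translate(text: str) -> str:
    return text.upper()


print(with_glossary(BASE_PROMPT, {"workspace": "espacio de trabajo"}))
print(translate_many(["hola", "mundo"], fake_translate))  # ['HOLA', 'MUNDO']
```

The OpenAI Python SDK also retries some failed requests on its own through the client's max_retries option, so treat the wrapper as a starting point for cases where you want explicit control over the backoff.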
