Leveraging LLM for Data Visualization

We are going to build a lightweight data visualization agent that reads a raw CSV, accepts plain-English instructions, and emits executable Python plotting code. It is useful for data analysts and backend engineers who want to skip repetitive matplotlib boilerplate and iterate on charts through conversation rather than switching between a notebook and an IDE.

What you'll need

You will need an Oxlo.ai API key from https://portal.oxlo.ai, Python 3.10 or newer, and the OpenAI SDK. Install the Python dependencies now:

pip install openai pandas matplotlib

Because Oxlo.ai uses flat per-request pricing instead of token-based metering, you can pass wide CSV previews or detailed system prompts into every call without watching input-length counters drive up cost. That matters here, because sending a data sample plus schema context inside the prompt is exactly how we keep the model from hallucinating column names. On token-based backends that habit becomes expensive quickly, but on Oxlo.ai the cost stays predictable regardless of prompt length. See https://oxlo.ai/pricing for current plan details.

Step 1: Prepare sample data and the Oxlo.ai client

I will use a small sales dataset so the tutorial is fully reproducible without external downloads. We create the CSV on disk, then initialize the OpenAI SDK pointing at Oxlo.ai. I am using Llama 3.3 70B as the workhorse because it handles general-purpose coding tasks reliably and follows formatting instructions closely. If you prefer, you can swap in DeepSeek V3.2 or Qwen 3 32B later to compare code-generation styles, since Oxlo.ai hosts all of them under the same endpoint. Notice that we do not need a special Oxlo.ai SDK. The standard openai package works because the platform is fully compatible with the OpenAI API spec.

import pandas as pd
import matplotlib.pyplot as plt
from openai import OpenAI

# Reproducible sample dataset
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "revenue": [12000, 15000, 13000, 17000, 20000, 22500],
    "cost": [8000, 9000, 8500, 10000, 11000, 11500],
    "region": ["North", "North", "South", "South", "East", "East"]
})
df.to_csv("sales.csv", index=False)

# Oxlo.ai client - drop-in replacement for OpenAI
client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="YOUR_OXLO_API_KEY"  # from https://portal.oxlo.ai
)
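Hard-coding the key is fine for a quick test, but I would not commit it to a repository. Here is a minimal sketch that reads the key from the environment instead. The variable name OXLO_API_KEY and the helper load_api_key are my own choices for illustration, not something Oxlo.ai prescribes.

```python
import os


def load_api_key(env_var: str = "OXLO_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical convention for this tutorial.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

You would then pass api_key=load_api_key() to the OpenAI(...) constructor shown above.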

Step 2: Lock down the system prompt

The system prompt is the only contract between us and the model. We need to suppress markdown formatting, prevent plt.show() from hanging a headless server, and force a saved file so we can verify the result programmatically. I also instruct the model to assume the CSV uses a standard pandas read_csv call and to respect the exact column names it sees in the preview.

SYSTEM_PROMPT = """You are a data visualization assistant.
You receive three things: a CSV file path, a preview of the data, and a user request.
Output only valid Python code that does the following in order:
1. Import pandas as pd and matplotlib.pyplot as plt.
2. Read the CSV file from the exact path provided.
3. Build the chart described in the user request using matplotlib.
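4. Call plt.tight_layout() followed by plt.savefig('chart.png', dpi=150), unless the request names a different output file.
Use only the column names shown in the preview, spelled exactly as they appear.
Do not wrap the code in markdown fences. Do not include explanatory text. Do not call plt.show()."""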
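Since the prompt is a contract, it can help to check a reply against it before running anything. The sketch below is one possible static check built on Python's ast module. check_contract is a hypothetical helper of mine, and it only looks at method names, so treat it as a smoke test rather than a guarantee.

```python
import ast


def check_contract(code: str) -> list[str]:
    """Return a list of contract violations found in the generated code."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        # Markdown fences or stray prose usually end up here
        return [f"not valid Python: {exc.msg} (line {exc.lineno})"]

    # Collect the names of all attribute-style calls, e.g. plt.savefig(...)
    called = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            called.add(node.func.attr)

    problems = []
    if "show" in called:
        problems.append("calls .show(), which can block a headless run")
    if "savefig" not in called:
        problems.append("never calls .savefig(), so no file is written")
    return problems
```

An empty list means the reply parsed as Python, saves a figure, and never calls show(). Anything else is worth sending back to the model before execution.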

Step 3: Generate plotting code

To keep the model grounded, we pass the first five rows of the dataframe as a plain-text preview. A tabular string is more compact than JSON and preserves the exact column names the generated code must use. Note that to_string() does not print dtypes; if the model needs them, append df.dtypes.to_string() to the preview. We compose a single user message containing the path, the preview, and the chart request, then send it to Oxlo.ai. If you need to send more than five rows for a very wide schema, Oxlo.ai's request-based pricing makes that cheap, whereas on token-based providers every extra row would inflate the bill. I also strip markdown fences defensively because even strong models occasionally ignore formatting instructions when they are generating code. The strip is cheap insurance.

def generate_plot_code(user_request: str, csv_path: str) -> str:
    df = pd.read_csv(csv_path)
    preview = df.head().to_string()

    user_content = (
        f"CSV path: {csv_path}\n"
        f"Preview:\n{preview}\n\n"
        f"Request: {user_request}"
    )

    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_content},
        ],
    )

    code = response.choices[0].message.content.strip()
    # Defensive strip in case the model ignores the no-markdown rule
    code = code.removeprefix("```python").removeprefix("```").removesuffix("```").strip()
    return code
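The chained removeprefix call above covers the common case, but it misses replies that use a different language tag (for example ```py, which would leave a stray "py" on the first line) or that put a sentence before the fence. A slightly more tolerant sketch, using a regular expression, looks like this. extract_code is a hypothetical name, and it assumes the first fenced block is the one you want.

```python
import re

# Opening fence with an optional language tag, then the body, then a closing fence
_FENCE_RE = re.compile(r"```[a-zA-Z0-9_+-]*[ \t]*\n(.*?)```", re.DOTALL)


def extract_code(reply: str) -> str:
    """Return the first fenced block if one exists, else the whole reply."""
    match = _FENCE_RE.search(reply)
    if match:
        return match.group(1).strip()
    return reply.strip()
```

You could swap this in for the removeprefix line in both generate_plot_code and any later function that post-processes a model reply.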

Step 4: Execute the generated code safely

Running LLM-generated code is inherently risky, so treat this step with care. We call exec with a fresh globals dictionary that preloads pandas and matplotlib alongside the standard Python builtins, which keeps the generated code from touching our own variables. Be aware that this is not a real sandbox: because the full builtins are exposed, a statement such as import os still succeeds, and the system prompt itself asks the model to import pandas and matplotlib. Any exception the code raises is caught and printed together with the code so you can see what went wrong. For local prototyping on data you control, this is sufficient. If you move this to production, replace exec with a containerized subprocess or a restricted environment such as a Firejail sandbox, but keep the same Oxlo.ai client logic.

import builtins

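def run_generated_code(code: str):
    # Namespace handed to exec: pandas and pyplot preloaded, plus the builtins
    allowed_globals = {
        "__builtins__": builtins,
        "pd": pd,
        "plt": plt,
    }
    try:
        exec(code, allowed_globals)
        print("Success: generated code ran without errors")
    except Exception as e:
        print(f"Execution failed: {type(e).__name__}: {e}")
        print("--- Generated code ---")
        print(code)
    finally:
        # Close all figures so a later run does not draw on top of this chart
        plt.close("all")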
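A step toward the subprocess approach mentioned above is to run the generated code in a separate interpreter with a timeout. This is a sketch under my own assumptions, not a hardened design: run_in_subprocess is a hypothetical helper, and a child process isolates crashes and hangs but can still read files and reach the network. It sets MPLBACKEND=Agg so matplotlib uses a non-interactive backend in the child.

```python
import os
import subprocess
import sys


def run_in_subprocess(code: str, timeout_s: float = 30.0) -> tuple[bool, str]:
    """Run code in a fresh Python process; return (ok, stdout or error text)."""
    # Force a headless matplotlib backend in the child process
    env = dict(os.environ, MPLBACKEND="Agg")
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            env=env,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s} seconds"
    if proc.returncode != 0:
        # stderr carries the traceback from the child interpreter
        return False, proc.stderr
    return True, proc.stdout
```

Because the child starts from an empty namespace, the generated code has to import pandas and matplotlib itself, which is what the system prompt already asks for.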

Step 5: Iterate with follow-up requests

First drafts rarely get colors, labels, and dimensions right. We make the agent stateful by keeping the full message history. The assistant's previous code block stays in context, so a follow-up like "sort by revenue descending" or "use a dark background" builds on the prior logic instead of starting over. Because Oxlo.ai has no cold starts on popular models, the second turn returns just as fast as the first, which keeps the iterative loop feeling responsive. Notice that we append both the user request and the assistant reply to the history list. This mirrors the standard chat completions pattern, so you could persist the list to Redis or SQLite if you want long-lived sessions across restarts.

def revise_plot(user_request: str, history: list) -> tuple[str, list]:
    history.append({"role": "user", "content": user_request})
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=history,
    )
    code = response.choices[0].message.content.strip()
    code = code.removeprefix("```python").removeprefix("```").removesuffix("```").strip()
    history.append({"role": "assistant", "content": code})
    return code, history
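If you want sessions to survive a restart, the history list serializes cleanly to JSON. Below is a minimal sketch of the SQLite option using only the standard library. The table name chat_history, the session_id key, and both helper functions are assumptions of mine for illustration.

```python
import json
import sqlite3


def _ensure_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chat_history ("
        "session_id TEXT PRIMARY KEY, messages TEXT NOT NULL)"
    )


def save_history(conn: sqlite3.Connection, session_id: str, history: list) -> None:
    """Store the whole message list as one JSON document per session."""
    _ensure_table(conn)
    conn.execute(
        "INSERT OR REPLACE INTO chat_history (session_id, messages) VALUES (?, ?)",
        (session_id, json.dumps(history)),
    )
    conn.commit()


def load_history(conn: sqlite3.Connection, session_id: str) -> list:
    """Return the stored message list, or an empty list for a new session."""
    _ensure_table(conn)
    row = conn.execute(
        "SELECT messages FROM chat_history WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

A typical loop would call load_history before revise_plot and save_history right after it, with a connection opened via sqlite3.connect("sessions.db") or any path you prefer.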

Run it

Here is the end-to-end flow. I ask for a grouped bar chart comparing revenue and cost by month, then follow up with a styling tweak. The expected terminal output is described after the script.

if __name__ == "__main__":
    # First pass
    req = "Create a grouped bar chart of revenue and cost by month. Rotate x labels 45 degrees."
    code = generate_plot_code(req, "sales.csv")
    print("First generation:\n", code)
    run_generated_code(code)

    # Build history for multi-turn revision
    df = pd.read_csv("sales.csv")
    history = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"CSV path: sales.csv\nPreview:\n{df.head().to_string()}\n\nRequest: {req}"},
        {"role": "assistant", "content": code},
    ]

    # Second pass
    new_req = "Change the revenue bars to green, cost bars to red, and save as chart_v2.png."
    new_code, history = revise_plot(new_req, history)
    print("\nRevision:\n", new_code)
    run_generated_code(new_code)

When I ran this against Oxlo.ai, Llama 3.3 70B produced clean matplotlib code that read the CSV, plotted revenue and cost side by side for each month, and applied the 45-degree rotation on the first pass. On the second pass it reused the same structure but swapped the color arguments and updated the filename. Both images rendered correctly without human edits to the Python. Model output is not deterministic, so your generated code may differ in detail. The terminal shows the raw Python for each generation, followed by a success message. Open chart.png and chart_v2.png to verify that the color change and filename update were applied. If the model had invented a column name, the generated code would have raised a KeyError, which run_generated_code catches and prints, and we could feed that error back into the conversation as an extra turn.
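The idea of feeding a failure back to the model can be sketched as a small retry loop. Everything here is illustrative: ask_model is a hypothetical callable that takes the message history and returns code (in this tutorial it would wrap client.chat.completions.create), and the loop runs the code with a bare exec purely to keep the example self-contained.

```python
import traceback
from typing import Callable, Optional


def run_and_capture(code: str) -> Optional[str]:
    """Execute code in a fresh namespace; return the traceback text on failure."""
    try:
        exec(code, {})
    except Exception:
        return traceback.format_exc()
    return None


def generate_with_repair(
    ask_model: Callable[[list], str], history: list, max_attempts: int = 3
) -> tuple[str, list]:
    """Ask for code, run it, and send any traceback back as a new user turn."""
    for _ in range(max_attempts):
        code = ask_model(history)
        history.append({"role": "assistant", "content": code})
        error = run_and_capture(code)
        if error is None:
            return code, history
        history.append({
            "role": "user",
            "content": "The code failed with this traceback. "
                       "Return corrected code only.\n" + error,
        })
    raise RuntimeError(f"no working code after {max_attempts} attempts")
```

Capping the attempts matters because each retry is another request, and a model that keeps inventing the same column name will not fix itself by repetition alone.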

Next steps

Two concrete directions to take this next. First, wrap the generator and executor in a FastAPI endpoint that accepts file uploads and returns the rendered PNG. That turns the local script into a service the rest of your stack can call. Second, upgrade the agent with Oxlo.ai function calling so it can query a SQL database or run Pandas aggregations before it decides what to plot. Giving the model tools lets it handle raw data sources instead of requiring a pre-built CSV for every request.
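To make the second direction a little more concrete, here is a sketch of what a tool definition and a local dispatcher might look like. I have not verified which tool-calling fields Oxlo.ai accepts; the dictionary follows the OpenAI chat-completions tools format, which I am assuming carries over. The sum_column tool and dispatch_tool_call helper are hypothetical, and the example uses the csv module instead of pandas to stay dependency-free.

```python
import csv
import json

# Tool description in the OpenAI chat-completions "tools" format (assumed supported)
SUM_COLUMN_TOOL = {
    "type": "function",
    "function": {
        "name": "sum_column",
        "description": "Sum a numeric column of a CSV file.",
        "parameters": {
            "type": "object",
            "properties": {
                "csv_path": {"type": "string"},
                "column": {"type": "string"},
            },
            "required": ["csv_path", "column"],
        },
    },
}


def sum_column(csv_path: str, column: str) -> float:
    """Add up one numeric column, reading the file row by row."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return sum(float(row[column]) for row in csv.DictReader(fh))


LOCAL_TOOLS = {"sum_column": sum_column}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the named local tool and return a JSON string for the tool message."""
    if name not in LOCAL_TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    args = json.loads(arguments_json)
    return json.dumps({"result": LOCAL_TOOLS[name](**args)})
```

In a full loop you would pass tools=[SUM_COLUMN_TOOL] with the request, run dispatch_tool_call for each tool call the model returns, and append the result as a tool message before asking for the plotting code.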
