Guaranteed 15% off your current AI inference bill for team spending up to $20000 / month.

Book a call →
Back to Blogs
Engineering

Building a Business Intelligence Tool with LLM

Business teams waste hours clicking through dashboards to answer simple questions. In this tutorial, I will build a lightweight BI agent that ingests a CSV...

Building a Business Intelligence Tool with LLM

Business teams waste hours clicking through dashboards to answer simple questions. In this tutorial, I will build a lightweight BI agent that ingests a CSV dataset, profiles it, and answers natural-language questions using an LLM for reasoning. We will run inference on Oxlo.ai, taking advantage of flat per-request pricing so that passing wide data contexts to the model never inflates the cost of a single call.

What you'll need

Before starting, grab an API key from the Oxlo.ai portal. You will need Python 3.10 or newer and the OpenAI SDK.

pip install openai pandas

Step 1: Configure the Oxlo.ai Client

I start by instantiating the OpenAI-compatible client. Pointing the base URL at Oxlo.ai is the only change needed from a standard OpenAI setup.

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.getenv("OXLO_API_KEY", "YOUR_OXLO_API_KEY")
)

Step 2: Create a Sample Dataset and Profile

To keep the tutorial self-contained, I generate a small sales CSV in memory. The profile function extracts column types, row counts, nulls, and a few sample rows. This summary is what we will feed to the LLM, rather than dumping the full CSV, which keeps the prompt clean and cheap on Oxlo.ai.

import pandas as pd
from io import StringIO

csv_data = """date,product,region,revenue,cost
2024-01-01,Widget A,North,1200,800
2024-01-02,Widget B,South,950,500
2024-01-03,Widget A,North,1300,800
2024-01-04,Widget C,East,2100,1200
2024-01-05,Widget B,West,1100,600"""

df = pd.read_csv(StringIO(csv_data))

def profile_dataset(dataframe):
    return {
        "columns": dataframe.dtypes.astype(str).to_dict(),
        "row_count": len(dataframe),
        "null_counts": dataframe.isnull().sum().to_dict(),
        "sample": dataframe.head(3).to_dict(orient="records")
    }

profile = profile_dataset(df)

Step 3: Define the System Prompt

The system prompt tells the model it is a BI analyst and gives it the dataset profile. I keep it strict: the model must stay within the available columns and return a short pandas snippet when computation is needed.

SYSTEM_PROMPT = """You are a senior business intelligence analyst.
You have access to a pandas DataFrame with the following profile:

{profile}

Rules:
1. Answer using only the columns described above.
2. If computation is required, write a short pandas snippet inside ```python blocks.
3. After any calculation, provide a concise business insight in 1 to 2 sentences.
4. If the question cannot be answered with the available data, say so clearly."""

Step 4: Build the Query Handler

This function injects the live profile into the system prompt, sends the user question to Llama 3.3 70B on Oxlo.ai, and returns the raw text. I use a low temperature to keep the reasoning deterministic.

import json

def ask_bi_agent(question, profile):
    profile_str = json.dumps(profile, indent=2)
    
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT.format(profile=profile_str)},
            {"role": "user", "content": question},
        ],
        temperature=0.2
    )
    
    return response.choices[0].message.content

Step 5: Execute Generated Code Safely

The LLM sometimes emits pandas code to compute aggregates. I extract any Python blocks, run them in a restricted namespace containing only the dataframe and pandas, and append the results. This grounds the LLM's narrative in real numbers.

import re

def run_bi_query(question, df, profile):
    raw_answer = ask_bi_agent(question, profile)
    code_blocks = re.findall(r'```python\n(.*?)```', raw_answer, re.DOTALL)
    
    computed = []
    for code in code_blocks:
        try:
            local_ns = {"df": df, "pd": pd, "result": None}
            exec(code, local_ns)
            if local_ns.get("result") is not None:
                computed.append(local_ns["result"])
        except Exception as e:
            computed.append(f"Execution error: {e}")
    
    return {"insight": raw_answer, "computed": computed}

Step 6: Format the Final Report

For readability, I wrap the raw LLM output and any computed tables into a simple Markdown report. This makes it easy to drop into a notebook or render in a web UI later.

def format_report(output):
    lines = []
    lines.append("# BI Agent Report\n")
    lines.append("## Insight\n")
    lines.append(output["insight"])
    
    if output["computed"]:
        lines.append("\n## Verified Data\n")
        for idx, item in enumerate(output["computed"], 1):
            lines.append(f"### Result {idx}\n")
            lines.append(f"```\n{item}\n```\n")
    
    return "\n".join(lines)

Run It

Now I ask a real business question: which region performed best? The agent returns both a narrative answer and a verified computation.

if __name__ == "__main__":
    question = "Which region has the highest total revenue, and what is the profit margin?"
    out = run_bi_query(question, df, profile)
    report = format_report(out)
    print(report)

When I run the script, the model first reasons over the schema and then emits a small pandas groupby. My executor runs that code against the live dataframe, so the insight is grounded in actual data rather than hallucinated figures. Example output:

# BI Agent Report

## Insight
North region has the highest total revenue at $2,500.
South has the highest profit margin at approximately 47.4%.

```python
result = df.groupby('region').agg({
    'revenue': 'sum',
    'cost': 'sum'
})
result['margin'] = (result['revenue'] - result['cost']) / result['revenue'] * 100
```

## Verified Data

### Result 1
```
            revenue  cost      margin
region                               
East           2100  1200   42.857143
North          2500  1600   36.000000
South           950   500   47.368421
West           1100   600   45.454545
```

Next Steps

This agent works for ad-hoc CSV analysis, but you can extend it in two directions. First, replace the in-memory CSV with a SQLAlchemy connection so the agent queries live warehouses instead of static files. Second, add a Streamlit frontend so non-technical users can ask questions without touching Python. If you need deeper reasoning for complex multi-table joins, swap the model string to deepseek-v3.2 or kimi-k2.6 on Oxlo.ai without changing any client code. Because Oxlo.ai charges a flat rate per request, you can pass wide table schemas or long multi-turn conversation histories to the model without watching token costs accumulate on every single call. For details on plans and request limits, see the Oxlo.ai pricing page.

Ready to build with Oxlo.ai?

Get started building high-performance AI inference applications today.

Get started
Ox Assistant
Online
OxBot
OxBot

Hi there! Try our cost calculator to see what you'd save with Oxlo.ai.