A Beginner's Guide to Using LLMs for Art Generation

We are going to build a command-line art director that turns a rough concept into a production-ready image. It pairs an LLM for prompt expansion with an image model for rendering, which saves hours of manual tweaking and fits neatly into any Python automation pipeline.

What you'll need

Python 3.10 or newer installed on your machine.
An Oxlo.ai API key. Create one at https://portal.oxlo.ai. The free tier gives you 60 requests per day, which is plenty to test this pipeline.
The OpenAI Python SDK. Install it with pip install openai.
Optionally python-dotenv if you prefer loading keys from a file. I will use os.environ to keep dependencies minimal.

Step 1: Bootstrap the client

I always start by pointing the OpenAI SDK at Oxlo.ai. Because the platform is OpenAI API compatible, this is a drop-in replacement with no wrapper classes required. I store the key in an environment variable instead of hard-coding it, since this script will likely end up in version control. I then run a quick smoke test with llama-3.3-70b to confirm the key is active and the network path is clean. There are no cold starts on popular models, so the response should come back quickly.

from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    # Index os.environ directly so a missing key raises KeyError right here
    api_key=os.environ["OXLO_API_KEY"],
)

# quick smoke test
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Say OK"}],
    max_tokens=5
)
assert "OK" in response.choices[0].message.content
print("Oxlo.ai client ready")
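
A missing or empty OXLO_API_KEY tends to surface as a less obvious error at client construction or on the first request. As a minimal sketch, a small guard can turn that into a readable message. The helper name get_api_key is hypothetical and not part of the OpenAI SDK; it only reads a mapping such as os.environ.

```python
import os


def get_api_key(env=os.environ, name: str = "OXLO_API_KEY") -> str:
    """Return the API key from the environment or raise a readable error."""
    # Treat a missing variable and a whitespace-only value the same way.
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it before running, e.g. export {name}=..."
        )
    return key
```

The result could then be passed as api_key=get_api_key() when constructing the client.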

Step 2: Craft the system prompt

The raw ideas people provide are usually vague. "A cyberpunk cat" does not give an image model enough context to produce something consistent. The system prompt forces the LLM to act as a specialist photographer and art director. It must inject details about lighting, composition, color grading, and style references. I also explicitly ban markdown, because stray asterisks or quotation marks can leak into the prompt and be passed to the image model as literal text. Keeping the output to a single paragraph makes parsing trivial.

ART_DIRECTOR_PROMPT = """You are an expert prompt engineer for text-to-image models.
Your job is to take a rough user idea and expand it into a single, detailed paragraph
optimized for image generation. Include style, lighting, mood, camera angle, and color palette.
Output only the final prompt. Do not add explanations or markdown formatting."""
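
Even with the instruction above, a model may occasionally return markdown or wrapping quotes. A defensive cleanup pass could look like the sketch below. The function clean_prompt is a hypothetical helper, and the set of characters it strips is an assumption about what usually leaks, not an exhaustive list.

```python
import re


def clean_prompt(text: str) -> str:
    """Strip common markdown debris and collapse the text into one paragraph."""
    # Remove heading hashes at the start of any line.
    text = re.sub(r"^\s*#+\s*", "", text, flags=re.MULTILINE)
    # Remove emphasis asterisks and backticks wherever they appear.
    text = re.sub(r"[*`]+", "", text)
    # Collapse newlines and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Drop one pair of wrapping quotation marks, if present.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        text = text[1:-1].strip()
    return text
```

It would be applied to the LLM output before the text is handed to the image model.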

Step 3: Generate the art prompt

With the system prompt in place, I wrap the user's rough idea and send it to the chat endpoint. I set temperature to 0.7 so the model stays creative but does not drift into unrelated concepts. I cap max_tokens at 300 because longer image prompts show diminishing returns: in my experience, prompts beyond roughly 250 tokens start to override each other in the image model's conditioning, causing detail loss. I use llama-3.3-70b because it follows instructions reliably. You could swap in qwen-3-32b if your source ideas are multilingual, or kimi-k2.6 if you want the model to reason about spatial composition before it writes the prompt.

def refine_prompt(idea: str) -> str:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": ART_DIRECTOR_PROMPT},
            {"role": "user", "content": idea},
        ],
        temperature=0.7,
        max_tokens=300,
    )
    return response.choices[0].message.content.strip()

idea = "a cyberpunk street market at night"
detailed_prompt = refine_prompt(idea)
print(detailed_prompt)
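
The max_tokens cap can cut the output mid-sentence, so it may help to trim the refined prompt at a sentence boundary as well. The sketch below uses a whitespace word count as a rough stand-in for tokens. The helper trim_prompt and the 180-word default are assumptions chosen to sit a little under the 250-token figure mentioned above, not measured values.

```python
import re


def trim_prompt(prompt: str, max_words: int = 180) -> str:
    """Keep whole sentences until the word budget is used up."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", prompt.strip())
    kept, used = [], 0
    for sentence in sentences:
        words = sentence.split()
        if used + len(words) > max_words:
            # If even the first sentence is too long, hard-cut it by words.
            if not kept:
                kept.append(" ".join(words[:max_words]))
            break
        kept.append(sentence)
        used += len(words)
    return " ".join(kept)
```

A call such as trim_prompt(detailed_prompt) leaves short prompts unchanged and drops trailing sentences from long ones.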

Step 4: Generate the image

Now I hand the refined prompt to the image pipeline. Oxlo.ai hosts several image models through the same API key, so switching between them is a one-line change. I default to oxlo.ai-image-pro for crisp detail, but you could substitute flux.1 or stable-diffusion-3.5 depending on the style you want. The OpenAI SDK's client.images.generate method works out of the box. I request response_format="url" so I get a direct download link. If you prefer base64, change that field and decode the result instead. I keep the resolution at 1024x1024 for standard square assets.

import urllib.request

def generate_image(prompt: str, filename: str = "art.png") -> str:
    # Ask the image endpoint for a single 1024x1024 render, returned as a URL
    image_resp = client.images.generate(
        model="oxlo.ai-image-pro",
        prompt=prompt,
        size="1024x1024",
        response_format="url",
        n=1,
    )
    url = image_resp.data[0].url

    # The URL is temporary, so download it to disk right away
    urllib.request.urlretrieve(url, filename)
    return os.path.abspath(filename)
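
For the base64 route mentioned above, the decoding step could look like the following sketch. In the OpenAI SDK the corresponding request value is response_format="b64_json" and the payload is read from image_resp.data[0].b64_json; whether the Oxlo.ai image endpoint honors that option is an assumption here. The helper save_b64_image is hypothetical.

```python
import base64
import os


def save_b64_image(b64_data: str, filename: str = "art.png") -> str:
    """Decode a base64-encoded image payload and write it to disk."""
    image_bytes = base64.b64decode(b64_data)
    # Write in binary mode so the image bytes are stored unchanged.
    with open(filename, "wb") as fh:
        fh.write(image_bytes)
    return os.path.abspath(filename)
```

This avoids the second HTTP request for the download, at the cost of a larger API response.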

Step 5: Save and display

The URL returned by the image endpoint is temporary, so generate_image downloads it to local storage as soon as the response arrives. I use the standard library's urllib.request to avoid extra dependencies. Returning os.path.abspath gives a full path that is clickable in most terminals. If you plan to run this inside a Docker container, mount a volume for the output directory so the art persists after the container stops. At this point, the pipeline is complete: idea in, image out.

if __name__ == "__main__":
    import sys

    if len(sys.argv) < 2:
        print("Usage: python art_director.py 'your idea here'")
        sys.exit(1)

    user_idea = sys.argv[1]
    print(f"Refining idea: {user_idea}")

    final_prompt = refine_prompt(user_idea)
    print(f"Final prompt: {final_prompt[:200]}...")

    out = generate_image(final_prompt)
    print(f"Image saved to {out}")

Run it

Here is the complete script assembled into art_director.py. I pass a rough concept on the command line and the script handles the rest. The LLM call takes a moment while it expands the prompt, but because there is no cold start on llama-3.3-70b, the response comes back quickly. The image endpoint then returns a URL that typically remains valid for a few minutes, which is why I download it straight away.

from openai import OpenAI
import os
import sys
import urllib.request

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key=os.environ["OXLO_API_KEY"]
)

ART_DIRECTOR_PROMPT = """You are an expert prompt engineer for text-to-image models.
Your job is to take a rough user idea and expand it into a single, detailed paragraph
optimized for image generation. Include style, lighting, mood, camera angle, and color palette.
Output only the final prompt. Do not add explanations or markdown formatting."""

def refine_prompt(idea: str) -> str:
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": ART_DIRECTOR_PROMPT},
            {"role": "user", "content": idea},
        ],
        temperature=0.7,
        max_tokens=300,
    )
    return response.choices[0].message.content.strip()

def generate_image(prompt: str, filename: str = "art.png"):
    image_resp = client.images.generate(
        model="oxlo.ai-image-pro",
        prompt=prompt,
        size="1024x1024",
        response_format="url",
        n=1,
    )
    url = image_resp.data[0].url
    urllib.request.urlretrieve(url, filename)
    return os.path.abspath(filename)

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python art_director.py 'your idea here'")
        sys.exit(1)

    user_idea = sys.argv[1]
    print(f"Refining idea: {user_idea}")

    final_prompt = refine_prompt(user_idea)
    print(f"Final prompt: {final_prompt[:200]}...")

    out = generate_image(final_prompt)
    print(f"Image saved to {out}")

Example terminal session:

$ export OXLO_API_KEY="sk-oxlo.ai-..."
$ python art_director.py "a retro-futuristic diner on Mars at sunset"
Refining idea: a retro-futuristic diner on Mars at sunset
Final prompt: A retro-futuristic diner on the surface of Mars during golden hour, neon signage casting long shadows on red dust, chrome trim reflecting the pale sun, cinematic wide-angle shot, warm orange and teal color grading, highly detailed, 8k concept art...
Image saved to /home/dev/projects/art_director/art.png

Wrap-up

This script is a minimal viable pipeline, but it is easy to extend. One concrete next step is to wrap it in a FastAPI endpoint so a frontend or game engine can request assets on demand. Another is to add a second LLM pass that critiques the generated prompt before it goes to the image model, creating a self-correcting pipeline. You could even use the vision models on Oxlo.ai to evaluate the resulting image and score how well it matched the original idea, giving you a closed feedback loop. Because Oxlo.ai uses flat request-based pricing, each extra reasoning step costs one more request regardless of prompt length, rather than scaling with token count as it would on token-metered providers. For the latest plan details, see https://oxlo.ai/pricing.
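
The second critique pass could be sketched roughly as follows. The wording of CRITIC_PROMPT, the lower temperature, and the helper critique_prompt are assumptions for illustration; the function takes the client as an argument and uses the same chat.completions.create call as refine_prompt.

```python
CRITIC_PROMPT = (
    "You review prompts for text-to-image models. If the prompt is vague, "
    "contradictory, or missing lighting, composition, or palette details, "
    "rewrite it. Output only the improved prompt as a single paragraph."
)


def critique_prompt(client, prompt: str, model: str = "llama-3.3-70b") -> str:
    """Ask the chat model to review a draft prompt; fall back to the draft."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": CRITIC_PROMPT},
            {"role": "user", "content": prompt},
        ],
        # A low temperature keeps the review pass conservative.
        temperature=0.2,
        max_tokens=300,
    )
    revised = (response.choices[0].message.content or "").strip()
    # Keep the original draft if the model returns nothing usable.
    return revised or prompt
```

In the main script this would sit between refine_prompt and generate_image, at the cost of one additional request per image.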
