The Role of LLMs in Biology: Current Trends and Future Directions

Biomedical literature is growing faster than most labs can manually review. Last quarter I needed to screen hundreds of PubMed abstracts for a gene-disease mapping project, and reading each full text was not an option. I built a small agent that ingests biology abstracts, extracts entities like genes, proteins, and chemical compounds, and proposes testable hypotheses. I run the pipeline on Oxlo.ai because biology text is naturally token-dense, full of long chemical names, gene symbols, and lengthy methods sections. Flat per-request pricing means I can pass complete abstracts without truncation, and the cost stays the same whether the input is two hundred or two thousand tokens.

What you'll need

  • An Oxlo.ai API key from https://portal.oxlo.ai
  • Python 3.10 or newer
  • The OpenAI SDK installed with pip install openai

You do not need a biology degree to follow this tutorial, but having a real abstract on hand will make the final output more interesting. If you do not have one, I provide a sample TP53 abstract in the run section below.

Step 1: Configure the Oxlo.ai client

Oxlo.ai exposes a fully OpenAI-compatible endpoint, so existing Python scripts need only a new base URL and model name. This drop-in compatibility means I can reuse my old tooling, logging wrappers, and retry logic without rewriting them. I picked Llama 3.3 70B as the default model because it handles dense scientific prose accurately, and Oxlo.ai serves it with no cold starts so my iterative prompt testing does not stall.

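from openai import OpenAI
import json

# Point the standard OpenAI client at Oxlo.ai's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="YOUR_OXLO_API_KEY",  # replace with the key from the Oxlo.ai portal
)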
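
If you would rather not paste the key into the script, one option is to read it from an environment variable. The helper below is a minimal sketch, not part of the Oxlo.ai tooling: the function name `load_api_key` and the variable name `OXLO_API_KEY` are assumptions made for illustration.

```python
import os


def load_api_key(env=None, var_name="OXLO_API_KEY"):
    """Return the API key from the environment, failing loudly if it is unset.

    OXLO_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    # Accept an explicit mapping so the helper can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

With this in place, the client could be built with `api_key=load_api_key()` instead of a literal string, which keeps the key out of version control.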

Step 2: Write the biology system prompt

The system prompt is the contract between me and the model. I need the agent to behave like a computational biologist, not a general assistant, so I explicitly request official HGNC symbols, flag causal language, and forbid markdown wrappers. Keeping the prompt in its own variable makes it easy to version control and iterate without touching the API logic. I also remind the model to hallucinate as little as possible when it encounters ambiguous gene aliases.

SYSTEM_PROMPT = """You are a precise computational biology assistant.
Read the provided biomedical abstract and extract the following fields:
- genes_and_proteins: list of official HGNC symbols
- diseases: list of diseases or phenotypes mentioned
- chemicals: list of drugs, compounds, or small molecules
- processes: list of biological processes such as apoptosis or phosphorylation
- relationships: list of objects describing how entities interact
- causal_claims: any statements implying one entity directly affects another

If a gene alias is ambiguous, keep the name as written in the text instead of guessing a symbol.
If the text implies a hypothesis, include it in a hypotheses field.
Respond with valid JSON only. Do not wrap the output in markdown."""
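
Because the prompt acts as a contract, it can help to check the other side of it. The following is a minimal sketch that reports which of the requested fields are absent from a parsed response. The field names come from the prompt above, while `REQUIRED_FIELDS` and `missing_fields` are hypothetical names introduced here. The optional hypotheses field is deliberately left out of the required set.

```python
# Field names copied from the system prompt; "hypotheses" is optional there,
# so it is not treated as required in this sketch.
REQUIRED_FIELDS = (
    "genes_and_proteins",
    "diseases",
    "chemicals",
    "processes",
    "relationships",
    "causal_claims",
)


def missing_fields(extraction: dict) -> list[str]:
    """Return the required field names that are absent from a parsed response."""
    return [name for name in REQUIRED_FIELDS if name not in extraction]
```

An empty list means the response at least has the expected shape. It says nothing about whether the extracted symbols are correct.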

Step 3: Build the extraction function

I wrap the API call in a small, reusable function so I can later batch-process a directory of text files. I set response_format to JSON mode because downstream Python scripts expect a dictionary, not prose or JSON wrapped in triple backticks. Temperature stays at 0.2 to keep gene symbol extraction conservative. The most important practical detail here is that I do not truncate the abstract. On token-based providers I would have to worry about methods sections inflating the bill, but Oxlo.ai charges per request, so the cost is flat regardless of input length.

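def analyze_abstract(text: str) -> dict:
    """Extract entities from one abstract and return them as a dictionary."""
    response = client.chat.completions.create(
        model="llama-3.3-70b",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},  # JSON mode
        temperature=0.2,  # low temperature keeps symbol extraction conservative
    )
    raw = response.choices[0].message.content
    # Raises json.JSONDecodeError if the reply is not valid JSON.
    return json.loads(raw)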
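
JSON mode should return a bare object, but a defensive parser is cheap insurance if a model wraps its reply in a code fence anyway. The sketch below is one way to tolerate that, under the assumption that the only damage is a single surrounding fence. `parse_model_json` and `FENCE` are hypothetical names, not part of any SDK.

```python
import json

# Build the fence marker programmatically so this listing contains no literal fence.
FENCE = "`" * 3


def parse_model_json(raw: str) -> dict:
    """Parse a model reply as a JSON object, tolerating one surrounding code fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, which may carry a language tag such as "json".
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Drop the closing fence if it is present.
        if text.rstrip().endswith(FENCE):
            text = text.rstrip()[: -len(FENCE)]
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level.")
    return data
```

Swapping this in for the plain `json.loads(raw)` call would be a one-line change. Anything it cannot repair still raises, so bad replies are not silently accepted.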

Step 4: Add the hypothesis generation layer

Entity extraction gives me structured data, but the real research value comes from reasoning across those entities. In this second pass I feed the extracted JSON back into a different model and ask for a concrete experiment. I switch to Kimi K2.6 because its advanced reasoning and long-context window handle multi-step biological synthesis well. By passing the previous JSON instead of the raw text, I reduce token volume and keep the model focused on relationships rather than re-reading fluff.

def hypothesize(entity_json: dict) -> str:
    """Ask a second model for one hypothesis and one follow-up experiment."""
    # Pass the structured extraction, not the raw abstract.
    context = json.dumps(entity_json, indent=2)
    prompt = (
        "You are a senior molecular biologist reviewing structured extractions "
        "from a recent paper. Propose one concrete, testable hypothesis and "
        "one follow-up experiment based on the entities and relationships below. "
        "Name specific assays, cell lines, or datasets where possible.\n\n"
        + context
    )

    response = client.chat.completions.create(
        model="kimi-k2.6",
        messages=[
            {"role": "system", "content": "You are a senior molecular biologist."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.4,  # slightly higher than extraction to allow some variety
        max_tokens=1024,  # cap the length of the free-text answer
    )
    return response.choices[0].message.content
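
If the goal is to keep the second prompt small, the extraction can be trimmed further before it is serialized. This is an optional variation, sketched under the assumption that empty fields carry no signal for the hypothesis step. `compact_entities` is a hypothetical helper name.

```python
import json


def compact_entities(entity_json: dict) -> str:
    """Serialize extracted entities compactly, skipping fields with no content."""
    kept = {
        key: value
        for key, value in entity_json.items()
        if value not in ([], {}, "", None)
    }
    # Compact separators drop the whitespace that indent=2 would add.
    return json.dumps(kept, separators=(",", ":"), sort_keys=True)
```

The trade-off is readability: the indented form is easier to eyeball in logs, while the compact form sends fewer characters to the model.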

Run it

Below is the entry point that ties the previous steps together. Appended to the code from Steps 1 through 4, it analyzes a sample abstract about the tumor suppressor TP53. The first call extracts entities, and the second call generates a hypothesis.

if __name__ == "__main__":
    abstract = (
        "The tumor suppressor p53 is frequently mutated in human cancers. "
        "In this study, we demonstrate that phosphorylation of p53 at Serine 15 "
        "by ATM kinase enhances its transcriptional activity toward the CDKN1A promoter, "
        "leading to cell cycle arrest in response to DNA damage. "
        "Disruption of this phosphorylation site attenuates the DNA damage response "
        "and increases cellular sensitivity to ionizing radiation."
    )

    print("=== Extracted Entities ===")
    entities = analyze_abstract(abstract)
    print(json.dumps(entities, indent=2))

    print("\n=== Proposed Hypothesis ===")
    print(hypothesize(entities))

Running this against Oxlo.ai produced a clean JSON object. The genes_and_proteins array contained TP53 and ATM, while chemicals remained empty and processes listed phosphorylation, transcriptional regulation, cell cycle arrest, and DNA damage response. The relationships array captured that ATM phosphorylates TP53 at Serine 15, which enhances transcriptional activation of CDKN1A. In the hypothesis step, Kimi K2.6 suggested a luciferase reporter assay to measure TP53 binding affinity at the CDKN1A promoter under radiation-induced stress, and recommended generating a non-phosphorylatable TP53 S15A mutant via CRISPR to confirm the causal mechanism.

Next steps

This agent is already useful for triage, but two extensions move it closer to production-ready for a lab environment. First, wrap the functions in a loop that reads a CSV of abstracts. Because Oxlo.ai uses flat per-request pricing, your monthly bill is easy to forecast even when some papers contain lengthy methods sections. Second, add an embeddings layer to cluster similar findings across hundreds of papers. You can generate those vectors on Oxlo.ai as well through the BGE-Large endpoint, keeping the entire stack on one platform.
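
The CSV loop could look roughly like the sketch below. It assumes a header row with `pmid` and `abstract` columns, which are hypothetical names, and it takes the analysis function as a parameter so that `analyze_abstract` from Step 3 can be passed in. Failures are recorded per row rather than aborting the whole batch.

```python
import csv
from typing import Callable, Iterable


def analyze_rows(
    rows: Iterable[dict],
    analyze: Callable[[str], dict],
    text_column: str = "abstract",  # assumed column name
    id_column: str = "pmid",  # assumed column name
) -> list[dict]:
    """Run `analyze` on each row, recording failures instead of stopping."""
    results = []
    for row in rows:
        record = {"id": row.get(id_column), "entities": None, "error": None}
        text = (row.get(text_column) or "").strip()
        if not text:
            record["error"] = "empty abstract"
        else:
            try:
                record["entities"] = analyze(text)
            except Exception as exc:  # keep the batch going past one bad paper
                record["error"] = f"{type(exc).__name__}: {exc}"
        results.append(record)
    return results


def analyze_csv(path: str, analyze: Callable[[str], dict]) -> list[dict]:
    """Read a CSV file with a header row and analyze every abstract in it."""
    with open(path, newline="", encoding="utf-8") as handle:
        return analyze_rows(csv.DictReader(handle), analyze)
```

A call such as `analyze_csv("abstracts.csv", analyze_abstract)` would then return one record per paper, where the file name is only an example.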

If you want to see how the pricing works for high-volume literature mining, check https://oxlo.ai/pricing. For researchers processing long biology texts daily, the flat per-request model removes the penalty on token-heavy scientific language.
