Structured LLM Output Using Ollama
Control your model responses effectively
With version 0.5, Ollama released a significant enhancement to its LLM API: structured outputs. Ollama can now constrain a model’s output to a specific format defined by a JSON schema. On the Python side, this is typically done with Pydantic: you define the shape you want as a Pydantic model, pass its JSON schema to Ollama, and use the same model to validate and parse the response.
Structured output solves a nagging problem many developers face when another system or process takes the output from an LLM for further processing. That downstream system needs to “know” what format to expect so it can process its input accurately, with repeatable results each time.
Likewise, you want model output presented to users in the same format every time, to avoid confusion and errors.
Until now, ensuring consistent output formats from most models has been a pain, but the new functionality from Ollama makes doing so quite easy, as I hope to show in my example code snippets.
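To give a flavour of how this works, here is a minimal sketch (the Country model below is just an illustrative example, not part of the code we’ll run later). You describe the shape you want as a Pydantic model, and the JSON schema derived from that model is what Ollama uses to constrain the output.
from pydantic import BaseModel

# Describe the desired output shape as a Pydantic model.
class Country(BaseModel):
    name: str
    capital: str

# The JSON schema derived from the model is what gets passed to
# Ollama's chat() call via its `format` parameter, as we'll see below.
print(Country.model_json_schema())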
Before that, though, you need to install the latest version of Ollama. This isn’t a tutorial on Ollama or how to run it. If you want that information, click my article below, where I go through all that good stuff.
Introduction to Ollama — Part 1
Suffice it to say that Ollama runs on Windows, Linux, and macOS, and you can install the latest version on Windows or macOS by navigating to https://ollama.com/ and clicking on the big black download button you’ll see onscreen. I’ll be using a Linux system, and for this, you can install it by running this command:
$ curl -fsSL https://ollama.com/install.sh | sh
On Windows or macOS, run the installer once the download has finished; the Linux script above installs Ollama directly. Next, we need to set up our development environment.
Setting up our dev environment
Before coding, I always create a separate Python development environment where I can install any needed software. Now, anything I do in this environment is siloed and will not impact my other projects.
I use Miniconda for this, but you can use whatever method you know and that suits you best.
If you want to go down the Miniconda route and don’t already have it, you must install Miniconda first. Get it using this link,
Miniconda - Anaconda documentation
1/ Create our new dev environment and install the required libraries
(base) $ conda create -n ollama_test python=3.12 -y
(base) $ conda activate ollama_test
(ollama_test) $ pip install ollama --upgrade
(ollama_test) $ pip install pydantic bs4
# Check the installed version is >= 0.5
(ollama_test) $ ollama --version
ollama version is 0.5.1
(ollama_test) $
2/ Decide what model to use with Ollama
Ollama has access to hundreds of open-source models. Choose which one(s) you want to use and pull them from Ollama. Meta recently released its latest Llama model (version 3.3), so I will use that. Also, as I’ll be trying out an image-based task, I’ll use Meta’s Llama 3.2 Vision model.
(ollama_test) $ ollama pull llama3.2-vision
(ollama_test) $ ollama pull llama3.3
I normally code my examples in a Jupyter Notebook. However, there is currently an issue when running the latest versions of Jupyter alongside Ollama: each expects a different version of the same third-party library.
So, this time, I’m simply saving my code in a Python file and running it with Python on the command line.
Example code 1 — Image interpretation
For this example, I’m asking the model to identify the different animal types in a PNG image. Here is that image.
Here is the code. It’s heavily commented and short, so I won’t go into the details of what it’s doing.
from ollama import chat
from pydantic import BaseModel

# Define a Pydantic model for representing a single animal with its type.
class Animal(BaseModel):
    animal: str

# Define a Pydantic model for representing a list of animals.
# This model contains a list of Animal objects.
class AnimalList(BaseModel):
    animals: list[Animal]

# Function to analyze an image and identify all animals present in it.
# Uses the Ollama `chat` function to interact with a vision-based model (`llama3.2-vision`).
# Returns the results as an AnimalList object.
def analyze_animals_in_image(image_path: str) -> AnimalList:
    # Call the `chat` function with the specified model, format, and parameters.
    response = chat(
        model='llama3.2-vision',
        format=AnimalList.model_json_schema(),
        messages=[
            {
                'role': 'user',
                'content': '''Analyze this image and identify all animals present. For each animal, provide:
- The type of animal
Return information for ALL animal types visible in the image.''',
                'images': [image_path],
            },
        ],
        options={'temperature': 0}  # Ensure deterministic output by setting temperature to 0
    )
    # Validate and parse the response JSON into an AnimalList object.
    animals_data = AnimalList.model_validate_json(response.message.content)
    return animals_data

# Main block to execute the script.
if __name__ == "__main__":
    # Path to the image to be analyzed.
    image_path = "D:/photos/2024/animals.png"

    # Print an initial message before starting the analysis.
    print("\nAnalyzing image for animals...")

    # Call the function to analyze the image and get the results.
    animals_result = analyze_animals_in_image(image_path)

    # Print the analysis results.
    print("Animal Analysis Results:")
    print(f"Found {len(animals_result.animals)} animals in the image:")

    # Loop through the list of animals and print details for each one.
    for i, animal in enumerate(animals_result.animals, 1):
        print(f"Animal #{i}:")
        print(animal.model_dump_json())  # Call the method (note the parentheses) to get JSON
This produced the following output.
Analyzing image for animals...
Animal Analysis Results:
Found 5 animals in the image:
Animal #1:
{"animal":"Walrus"}
Animal #2:
{"animal":"Elephant Seal"}
Animal #3:
{"animal":"Zebra"}
Animal #4:
{"animal":"Elephants"}
Animal #5:
{"animal":"Kittens"}
That’s not too bad at all. The model may have gotten confused by the top-left image; I’m unsure whether it shows a walrus or an elephant seal. The former, I think.
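Since the result is an ordinary Pydantic object, you can also dump the whole validated list as JSON in one go if you prefer; something like this small variation on the print loop above:
# Dump the entire validated result as pretty-printed JSON.
print(animals_result.model_dump_json(indent=2))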
Example code 2 — Text summarisation
This is useful if you have a bunch of different texts you want to summarise but want the summaries to have the same structure. In this example, we’ll process the Wikipedia entries for some famous scientists and retrieve certain key facts about them in a highly organized way.
In our summary, we want to output the following structure for each scientist:
The name of the Scientist
When and where they were born
Their main claim to fame
The year they won the Nobel Prize
When and where they died
Here is the code.
from pydantic import BaseModel
import requests
from bs4 import BeautifulSoup
from ollama import chat
from typing import List

# List of Wikipedia URLs
urls = [
    "https://en.wikipedia.org/wiki/Albert_Einstein",
    "https://en.wikipedia.org/wiki/Richard_Feynman",
    "https://en.wikipedia.org/wiki/James_Clerk_Maxwell",
    "https://en.wikipedia.org/wiki/Alan_Guth"
]

# Scientist names extracted from URLs for validation
specified_scientists = ["Albert Einstein", "Richard Feynman", "James Clerk Maxwell", "Alan Guth"]

# Function to scrape Wikipedia content
def get_article_content(url):
    try:
        print(f"Scraping URL: {url}")  # Debug print
        response = requests.get(url)
        soup = BeautifulSoup(response.content, "html.parser")
        article = soup.find("div", class_="mw-body-content")
        if article:
            content = "\n".join(p.text for p in article.find_all("p"))
            print(f"Successfully scraped content from: {url}")  # Debug print
            return content
        else:
            print(f"No content found in: {url}")  # Debug print
            return ""
    except requests.exceptions.RequestException as e:
        print(f"Error scraping {url}: {e}")
        return ""

# Fetch content from each URL
print("Fetching content from all URLs...")  # Debug print
contents = [get_article_content(url) for url in urls]
print("Finished fetching content from all URLs.")  # Debug print

# Prompt for the summarization task
summarization_prompt = '''
You will be provided with content from an article about a famous scientist.
Your goal will be to summarize the article following the schema provided.
Focus only on the specified scientist in the article.
Here is a description of the parameters:
- name: The name of the Scientist
- born: When and where the scientist was born
- fame: A summary of what their main claim to fame is
- prize: The year they won the Nobel Prize
- death: When and where they died
'''

# Pydantic model classes
class ArticleSummary(BaseModel):
    name: str
    born: str
    fame: str
    prize: int
    death: str

class ArticleSummaryList(BaseModel):
    articles: List[ArticleSummary]

# Function to summarize an article
def get_article_summary(text: str):
    try:
        print("Sending content to chat model for summarization...")  # Debug print
        completion = chat(
            model='llama3.3',
            messages=[
                {"role": "system", "content": summarization_prompt},
                {"role": "user", "content": text}
            ],
            format=ArticleSummaryList.model_json_schema(),
        )
        print("Chat model returned a response.")  # Debug print
        # Parse and validate the JSON response
        articles = ArticleSummaryList.model_validate_json(completion.message.content)
        print("Successfully validated and parsed articles.")  # Debug print
        return articles
    except Exception as e:
        print(f"Error during summarization: {e}")
        return None

# Function to format and filter summaries
def format_summary(summary: ArticleSummaryList):
    formatted = []
    for article in summary.articles:  # Access the 'articles' attribute directly
        # Filter out scientists not in the specified list
        if article.name in specified_scientists:
            formatted.append(
                f"The name of the Scientist: {article.name}\n"
                f"When and where they were born: {article.born}\n"
                f"Their main claim to fame: {article.fame}\n"
                f"The year they won the Nobel Prize: {article.prize}\n"
                f"When and where they died: {article.death}\n"
            )
    print("Finished formatting summary.")  # Debug print
    return "\n".join(formatted)

# Main function to process all articles
def main():
    summaries = []
    for i, content in enumerate(contents):
        print(f"Processing content {i+1}/{len(contents)}...")  # Debug print
        if content.strip():  # Skip empty articles
            summary = get_article_summary(content)
            if summary:
                formatted_summary = format_summary(summary)
                if formatted_summary:  # Only add if not empty after filtering
                    summaries.append(formatted_summary)

    # Print all formatted summaries
    print("Final Summaries:")
    print("\n\n".join(summaries))

if __name__ == '__main__':
    main()
Here is the final output. It took around five minutes to run fully, and my system is quite high-spec, so be warned. Also, the quality of the response is highly dependent on the quality of the LLM you use. I tried it with Llama 3.2, and the output was significantly worse than with version 3.3.
(ollama_test) C:\Users\thoma\ollama-test>python tomtest.py
Fetching content from all URLs...
Scraping URL: https://en.wikipedia.org/wiki/Albert_Einstein
Successfully scraped content from: https://en.wikipedia.org/wiki/Albert_Einstein
Scraping URL: https://en.wikipedia.org/wiki/Richard_Feynman
Successfully scraped content from: https://en.wikipedia.org/wiki/Richard_Feynman
Scraping URL: https://en.wikipedia.org/wiki/James_Clerk_Maxwell
Successfully scraped content from: https://en.wikipedia.org/wiki/James_Clerk_Maxwell
Scraping URL: https://en.wikipedia.org/wiki/Alan_Guth
Successfully scraped content from: https://en.wikipedia.org/wiki/Alan_Guth
Finished fetching content from all URLs.
Processing content 1/4...
Sending content to chat model for summarization...
Chat model returned a response.
Successfully validated and parsed articles.
Finished formatting summary.
Processing content 2/4...
Sending content to chat model for summarization...
Chat model returned a response.
Successfully validated and parsed articles.
Finished formatting summary.
Processing content 3/4...
Sending content to chat model for summarization...
Chat model returned a response.
Successfully validated and parsed articles.
Finished formatting summary.
Processing content 4/4...
Sending content to chat model for summarization...
Chat model returned a response.
Successfully validated and parsed articles.
Finished formatting summary.
Final Summaries:
The name of the Scientist: Albert Einstein
When and where they were born: 14 March 1879
Their main claim to fame: Einstein became one of the most famous scientific celebrities after the confirmation of his general theory of relativity in 1919.
The year they won the Nobel Prize: 1921
When and where they died: 18 April 1955
The name of the Scientist: Richard Feynman
When and where they were born: May 11, 1918
Their main claim to fame: Physicist and mathematician
The year they won the Nobel Prize: 1965
When and where they died: February 15, 1988
The name of the Scientist: James Clerk Maxwell
When and where they were born: 13 June 1831
Their main claim to fame: Scottish physicist and mathematician
The year they won the Nobel Prize: 0
When and where they died: 5 November 1879
The name of the Scientist: Alan Guth
When and where they were born:
Their main claim to fame: theoretical physics
The year they won the Nobel Prize: 2014
When and where they died:
Note that Alan Guth is still alive; hence, the “when and where they died” entry for him is blank. James Clerk Maxwell did not receive a Nobel Prize, as the prizes were not awarded during his lifetime (and Alan Guth has not won one either; the 2014 date appears to be the model conflating it with another award). Also, note that the model could not extract the place of death for any of the scientists, even though that information was contained in the Wikipedia extracts.
Summary
In this article, I’ve provided code and demonstrated two key capabilities of structured outputs using Ollama. The first example showed the use of structured output in image processing, while the second focused on text summarization.
Specifying structured output from LLMs is a big step for Ollama and has many applications. By organizing information in a predictable JSON format, structured outputs improve clarity and make LLMs’ responses more consistent, reducing ambiguities. This structured approach enables seamless integration into downstream applications like APIs, databases, or visualization tools without extensive preprocessing while simplifying data parsing and automation.
Validation against predefined rules becomes easier, minimizing errors and ensuring compliance with expected standards. Ultimately, structured output transforms LLMs into highly practical tools for diverse real-world use cases.
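As a quick, hedged illustration of that last point (reusing the AnimalList model from the first example), Pydantic raises a ValidationError when a response doesn’t match the schema, so malformed output can be caught explicitly rather than discovered downstream:
from pydantic import ValidationError

try:
    animals = AnimalList.model_validate_json(response.message.content)
except ValidationError as e:
    # The response did not conform to the schema; handle or retry here.
    print(f"Response failed validation: {e}")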
That’s all from me for now. I hope you found this article useful. If you did, please check out my profile page at this link. From there, you can see my other published stories and subscribe to get notified when I post new content.
I know times are tough and wallets constrained, but if you got real value from this article, please consider buying me a wee dram.
If you liked this content, I think you’ll also find these articles interesting.