Join us in building a fintech company that provides fast and easy access to credit for small and medium-sized businesses — like a bank, but without the white collars. You’ll work on software wiring millions of euros to our customers, every day.
We’re looking for both junior and experienced software developers.
The usage of agents within workflows is advertised all over the internet. You identify an opportunity within the business: risk managers look at macro-economic developments online to determine the risk of an active customer. The idea: do the internet search for the risk manager and come back with a conclusion about the current development of a specific sector, for example by providing average statistics. You start to work with an LLM, give it a search tool and let it generate a summary of average revenue and profit margin for the logistics sector. The first run results in the following answer:
“The average revenue is €10.000 and the profit margin is 3% for the logistics sector. Those companies are generally very healthy.”
You write a regex to extract the average revenue and the profit margin from the result. Amazing: problem solved in a day.
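A minimal sketch of that extraction (the exact patterns are hypothetical, written against this one answer):

```python
import re

# The first LLM answer the regex was written against.
answer = (
    "The average revenue is €10.000 and the profit margin is 3% "
    "for the logistics sector. Those companies are generally very healthy."
)

# Brittle extraction: it only matches the exact phrasing of this one answer.
revenue = re.search(r"revenue is €([\d.,]+)", answer)
margin = re.search(r"profit margin is ([\d.]+)%", answer)

print(revenue.group(1), margin.group(1))  # → 10.000 3
```

It works, but only as long as the model keeps phrasing its answer exactly this way.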
A month later, the economy has changed a bit and a new customer in the logistics sector comes in. A new assessment is required, but this time the LLM decides to formulate a more poetic answer.
“The freedom of driving around to an unknown place is hard to describe. It makes up for the low profit margins that the sector is currently facing. The revenue can be around ten thousand euros per month on average.”
The regex does not work anymore, so you add another component to parse spelled-out numbers from the result. You can already see that this solution is a pain to maintain: as a developer, you are managing an unpredictable, chatty system.
At Floryn, we only build on reliable outputs. We need a strict schema that is combined with the power of LLM capabilities. That is where Pydantic AI is changing the game for us. Pydantic AI helps us move our LLM experiments into actual production systems.
Two components are required: an LLM provider and an agent framework. There are tons of options to choose from. For this blog, we will focus on Ollama as our LLM provider and Pydantic AI as our agent framework.
Ollama is an open source model serving platform. Models can be downloaded from its repository, after which they can be run locally on your machine. The benefits are improved security, since no data is sent to an external server, and the freedom to experiment with lots of different (small) models.
There are lots of other frameworks available for agentic workflows, such as Langchain and CrewAI. The agent framework of choice is Pydantic AI. Pydantic AI supports a lot of different LLM providers, one of which is Ollama. The library provides some tools out of the box that come in very handy. But the most important feature is the integration with Pydantic validation models.
To show the capabilities of the Pydantic AI framework, DuckDuckGo will be used as a tool. DuckDuckGo allows the agent to search the internet. The advantage of using DuckDuckGo is that the LLM itself does not need built-in search capabilities, so you can try different models without changing the code. Note that there is also the option to use internal search capabilities if the LLM provider offers them, for example when using Gemini or Claude. For the sake of open source usability, this blog will stick to using DuckDuckGo.
Ollama is easy to install and ready to use with minimal setup. Go to the Ollama website, download the installer for your platform, then perform the following steps in your terminal.
ollama serve
ollama pull qwen3:8b
The service will run on http://localhost:11434/v1 by default.
For the Python environment, you need to add the following dependencies with uv.
uv add pydantic-ai
uv add "pydantic-ai-slim[duckduckgo]"
Alright, the basic setup is done, onto the fun stuff!
Creating your first agent is fairly simple. The first step is to define the model that will be used for your agent. Ollama does require a bit more setup compared to providers such as OpenAI, Claude or Gemini, where you can just provide the model name and an API key. The following code snippet shows the very first iteration of our agent.
from pydantic_ai import Agent
from pydantic_ai.models import openai
from pydantic_ai.providers import ollama
model = openai.OpenAIChatModel(
    model_name="qwen3:8b",
    provider=ollama.OllamaProvider(base_url="http://localhost:11434/v1"),
)

analyst_agent = Agent(
    model,
    system_prompt=(
        "You are a rigorous economic analyst. "
        "Provide a concise risk assessment on the requested sector."
    ),
)

topic = "Logistic sector in the Netherlands"

result = analyst_agent.run_sync(
    f"Analyze the current situation in 100 words regarding {topic}"
    " and conclude with a risk level of either low, medium, high, critical"
)
print(result.output)
>>> """The Dutch logistics sector remains a global hub, leveraging its
strategic location, port infrastructure (e.g., Rotterdam), and digitalization
advancements. Post-pandemic recovery has been buoyed by e-commerce growth and
green transition initiatives, such as hydrogen-powered terminals. However,
challenges persist: labor shortages, energy costs, and geopolitical tensions
(e.g., Russia-Ukraine war) strain supply chains. Infrastructure projects face
delays, and regulatory pressures for sustainability add costs. While the
sector's resilience and innovation mitigate some risks, macroeconomic volatility
and dependency on global trade expose it to medium risk.
**Risk Level: Medium**."""
The LLM will respond with information about the sector that it was trained on; however, new developments are not taken into account. It would therefore be great if the agent could search the internet and find more up-to-date information.
An agent really starts to shine when it gets access to tools. Tools allow the agent to do more than just work with its internal knowledge: it can perform actions and use the information they return to generate a proper result. In this case, we give it the ability to search the internet for up-to-date information. In other use cases, tools might send an email or cancel an order.
from pydantic_ai import Agent
from pydantic_ai.models import openai
from pydantic_ai.providers import ollama
from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool
model = openai.OpenAIChatModel(
    model_name="qwen3:8b",
    provider=ollama.OllamaProvider(base_url="http://localhost:11434/v1"),
)

analyst_agent = Agent(
    model,
    tools=[duckduckgo_search_tool(max_results=10)],
    system_prompt=(
        "You are a rigorous economic analyst. "
        "Use the search tool to find the latest information on the requested sector. "
        "Provide a concise risk assessment on the requested sector."
    ),
)

topic = "Logistic sector in the Netherlands"

result = analyst_agent.run_sync(
    f"Analyze the current situation in 100 words regarding {topic}"
    " and conclude with a risk level of either low, medium, high, critical"
)
print(result.output)
>>> """The Netherlands' logistics sector shows robust growth, with a 5.5%
CAGR projected to $126.49B by 2032, driven by e-commerce, globalization,
and green initiatives. However, challenges like labor shortages, inflation
(3.6% in 2024), and rising energy costs pressure margins. Government
investments in infrastructure aim to enhance efficiency and sustainability,
while confidence indices reflect cautious optimism. Despite recovery signs
(€3.25B in 2024 investments), sector-specific risks from economic uncertainty
and reduced online retail expansion could hinder growth.
**Risk Level: Medium** (balanced growth drivers vs. operational and
macroeconomic headwinds)."""
The result is a lot more detailed, with projections about the future, while reaching the same conclusion. The risk level is again stated in a fairly structured way, but it is no longer exactly at the end of the output.
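Getting the risk level out of that free text means matching on the model's current formatting habits. A hypothetical extraction sketch:

```python
import re

# Abbreviated free-text output from the agent; the risk level happens to be
# bold-formatted right now, but nothing guarantees it stays that way.
output = (
    "Despite recovery signs, sector-specific risks could hinder growth. "
    "**Risk Level: Medium** (balanced growth drivers vs. headwinds)."
)

match = re.search(r"\*\*Risk Level:\s*(\w+)\*\*", output)
risk_level = match.group(1).lower() if match else None
print(risk_level)  # → medium
```

One formatting change in the model's answer and this parsing silently returns None.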
The previous results show that the agent is able to identify a risk level for the logistics sector. The output is structured quite well, but to use the risk level in your application, string manipulation is still required. The solution that Pydantic AI provides is the use of Pydantic models to structure the output of the agent. The following example shows what the code looks like for our use case. Note the retries defined on the agent: models might get the output schema wrong the first time, so they should be allowed to refine their answer on a retry.
from pydantic import BaseModel
from enum import Enum
from pydantic_ai import Agent
from pydantic_ai.models import openai
from pydantic_ai.providers import ollama
from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool
class RiskLevel(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class MarketRisk(BaseModel):
    sector: str
    risk_level: RiskLevel
    reasoning: str

model = openai.OpenAIChatModel(
    model_name="qwen3:8b",
    provider=ollama.OllamaProvider(base_url="http://localhost:11434/v1"),
)

analyst_agent = Agent(
    model,
    tools=[duckduckgo_search_tool(max_results=10)],
    retries=5,
    system_prompt=(
        "You are a rigorous economic analyst. "
        "Use the search tool to find the latest information on the requested sector. "
        "Provide a concise risk assessment on the requested sector."
    ),
    output_type=MarketRisk,
)

topic = "Logistic sector in the Netherlands"

result = analyst_agent.run_sync(
    f"Analyze the current situation in 100 words regarding {topic}"
    " and conclude with a risk level of either low, medium, high, critical"
)
print(result.output)
>>> MarketRisk(
    sector='Logistics',
    risk_level=<RiskLevel.MEDIUM: 'medium'>,
    reasoning="""The Netherlands' logistics sector remains a key European hub,
    driven by strategic ports (e.g., Rotterdam) and infrastructure. 2024 saw
    a 37% YoY investment surge, outperforming Western Europe. Recovery from
    2023 losses is fueled by e-commerce growth. However, rising energy costs,
    global trade dependencies, and rent pressures pose moderate risks.
    Diversified activities (warehousing, distribution) and infrastructure
    investments mitigate some vulnerabilities, but geopolitical tensions and
    supply chain volatility keep risks elevated above low."""
)
The model now returns a structured output with each piece of information in its respective property. It is now a lot easier to access the risk level compared to the string processing that had to be performed before. This makes the agent far more suitable for production environments, as we can lean on the structured output.
By enforcing a strict Pydantic schema, we shift the responsibility of output consistency from fragile parsing logic to the LLM itself.
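That enforcement is easy to see in isolation. With the RiskLevel and MarketRisk definitions from the snippet above, Pydantic parses a valid payload into a typed object and rejects anything outside the schema before it reaches downstream code (the rejection is what triggers the agent's retries):

```python
from enum import Enum
from pydantic import BaseModel, ValidationError

class RiskLevel(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class MarketRisk(BaseModel):
    sector: str
    risk_level: RiskLevel
    reasoning: str

# A valid payload parses into a typed object: no string matching needed.
risk = MarketRisk(sector="Logistics", risk_level="medium", reasoning="Stable demand.")
assert risk.risk_level is RiskLevel.MEDIUM

# An out-of-schema value is rejected with a ValidationError.
try:
    MarketRisk(sector="Logistics", risk_level="unclear", reasoning="?")
except ValidationError:
    print("rejected")  # → rejected
```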
The entire process of moving agents and LLMs into production is an enlightening experience. You encounter many new variables that are not part of the traditional machine learning workflow.
The model selection already poses a few questions that you need to answer. Does it need to support tools? Does it need to support reasoning? These capabilities are not baked into every model. If you are going to use the DuckDuckGo tool, then you need a model that provides tool support.
The second aspect is the size of the local model. A small model is faster, but less powerful. You can ask a general-purpose LLM for recommendations given the hardware you are using. The model should be powerful enough to work with Pydantic validation models, though.
The instructions written in the prompt are what need to be “feature engineered”. There are different concepts for evaluating the performance of your agent. One of them is LLM-as-a-judge, which evaluates the output of the model against self-defined metrics, such as a formal tone in the output. Similar to traditional machine learning, you change the instructions (the feature) and evaluate the performance.
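The same structured-output idea applies to evaluation: a judge agent can be given its own Pydantic schema (via output_type) instead of returning free-text feedback. The rubric below is a hypothetical example, not part of the Pydantic AI API:

```python
from pydantic import BaseModel, Field

# Hypothetical rubric a judge agent could be asked to fill in when
# scoring the analyst agent's answers. The metrics are self-defined.
class JudgeVerdict(BaseModel):
    formal_tone: bool = Field(description="Is the assessment written in a formal tone?")
    states_risk_level: bool = Field(description="Does it commit to exactly one risk level?")
    score: int = Field(ge=1, le=5, description="Overall quality, 1 (poor) to 5 (excellent)")

# The judge's output arrives as a typed, range-checked object,
# so evaluation metrics can be aggregated without any parsing.
verdict = JudgeVerdict(formal_tone=True, states_risk_level=True, score=4)
print(verdict.score)  # → 4
```

Because the score is range-checked (ge=1, le=5), a judge that hallucinates a score of 6 fails validation instead of silently skewing your metrics.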
With Pydantic AI, agents are moving from chatty systems to structured outputs which are required for our use cases. By enforcing types, we can trust that the output can be processed reliably in all our systems. It helped us move our agentic experiments into production in a reliable and precise manner. Pydantic AI is the foundation of building reliable financial systems.
Floryn is a fast-growing Dutch fintech: we provide loans to companies, completely online, with the best customer experience and service. We use our own bespoke credit models built on banking data, supported by AI and machine learning.