Local models using tools in Pydantic AI

I am having problems getting my model to consistently use the provided tools.
I run everything locally, using a Ryzen 7 7700 and an Nvidia 4070 for a local Ollama installation.
Sometimes the model uses tools and sometimes it doesn't. I am playing around with different models to see if that is the bottleneck.
For my agent framework I use Pydantic AI, trying to be specific in the system prompt. I have tried both the decorator and the kwarg approach with the same result. I also tried the `prepare` option for the tool, as seen below, but I am not sure if I implemented it right. Does anyone have a good idea of what I am doing wrong?
This code builds on a recent YouTube tutorial. I started off with the exact example, and when I did not get consistent tool usage I started changing the code as shown below.
Code:

```python
import os
import sys
import logging
from colorama import Fore, init
from typing import Union
from dotenv import load_dotenv
from pydantic_ai import Agent, RunContext, Tool
from pydantic_ai.models.ollama import OllamaModel
from pydantic import BaseModel, Field
from pydantic_ai.tools import ToolDefinition

# Initialize colorama for cross-platform colored output
init(autoreset=True)

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s: %(message)s')

class NoArgs(BaseModel):
    pass

class Capital(BaseModel):
    """Comprehensive model for capital city information."""
    name: str
    year_founded: int
    short_history: str
    comparison: str = Field(description="Detailed comparison of the capital city to another city")
    comparison_city: str = Field(description="The city being compared to")

async def execute_get_comparison_city(
    ctx: RunContext[int], tool_def: ToolDefinition
) -> Union[ToolDefinition, None]:
    if 1 == 1:
        return tool_def

def create_agent():
    """Create and configure the agent for capital city information."""
    ollama_model = OllamaModel(
        model_name='llama3.1:8b',
        # model_name="falcon3:10b",
        base_url=os.getenv('RYZEN_OLLAMA_BASE_URL')
    )

    return Agent(
        model=ollama_model,
        result_type=Capital,
        retries=3,
        system_prompt=(
            "You are an experienced historian providing detailed capital city information. "
            "Include the city's name, founding year, short history, and a comparative analysis."
        )
    )

@create_agent().tool(retries=2, prepare=execute_get_comparison_city)
def get_comparison_city(ctx: RunContext[str], input: NoArgs) -> dict:
    """
    Retrieve and format comparison city information.

    Returns a structured dictionary for comparison details.
    """
    return {
        "comparison_city": ctx.deps,
        "comparison_text": f"Compared to {ctx.deps}, this capital city offers unique historical insights."
    }

def main(country: str, comparison_city: str):
    """
    Main execution function for retrieving capital city information.

    Args:
        country (str): The country whose capital is being researched
        comparison_city (str): A city to compare against
    """
    load_dotenv()  # Ensure environment variables are loaded

    try:
        agent = create_agent()
        result = agent.run_sync(f"Capital of {country}", deps=comparison_city)

        # Structured logging with colorful output
        logging.info(f"{Fore.RED}Capital: {result.data.name}")
        logging.info(f"{Fore.GREEN}Founded: {result.data.year_founded}")
        logging.info(f"{Fore.CYAN}History: {result.data.short_history}")
        logging.info(f"{Fore.YELLOW}Comparison: {result.data.comparison}")
        logging.info(f"{Fore.YELLOW}Comparison: {result.data.comparison_city}")

    except Exception as e:
        logging.error(f"Error retrieving capital city information: {e}")
        sys.exit(1)

if __name__ == "__main__":
    main("United States", "Bogotá")
```


Your code looks great and I don’t think there is anything wrong with your approach! Saw you are using Llama 3.1 8b and I’m thinking the size of the model is really the issue here.

In my experience, smaller LLMs (<14b parameters) just struggle to understand and use tools properly. Inconsistent results that are just good enough to give you hope you can optimize them into working is what I've run into myself :joy:

Since you have a 4070 GPU, you should be able to run 14b parameter models. Could you try with Qwen 2.5 14b instruct and see if you get better results?
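For anyone following along, the model swap itself is a one-line change in the script above; the sketch below just captures the VRAM reasoning. The `qwen2.5:14b-instruct` tag is my guess at the Ollama library tag for this model, so verify it with `ollama list` after pulling:

```python
# Ollama model tags; "qwen2.5:14b-instruct" is an assumed tag for the
# suggested model (worth double-checking against the Ollama library).
SMALL_MODEL = "llama3.1:8b"
LARGER_MODEL = "qwen2.5:14b-instruct"

def pick_model(vram_gb: float) -> str:
    """Choose the larger model only when the GPU can hold it.

    A 14b model at q4 quantization needs roughly 9-10 GB of VRAM,
    so a 12 GB card like the 4070 has room for it.
    """
    return LARGER_MODEL if vram_gb >= 12 else SMALL_MODEL
```

The chosen tag can then be dropped into the `model_name` argument of `OllamaModel` from the original script.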

Thank you Cole! It's not always easy to know if one is heading in the right direction. I'll download the model and try it, then I'll get back with how it went. If it works, hopefully it could help others who might run into similar issues.


So I tried the model you suggested. Although I still don't get consistent tool usage from the example code at the beginning of this thread, I got it to work perfectly on two other examples.
The code above probably needs some work on the system prompt and/or the Pydantic models.
Anyways thank you for leading me to a model I probably wouldn’t have found easily myself.
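In case it helps others hitting the same wall, here's a sketch of a more tool-forcing system prompt. The wording is my own, not from the Pydantic AI docs; it just names the thread's `get_comparison_city` tool explicitly, which tends to help small local models actually call it:

```python
# Small local models often skip tools unless the prompt names them outright.
# This rewords the system prompt from the code above to demand the tool call.
SYSTEM_PROMPT = (
    "You are an experienced historian providing detailed capital city information. "
    "Before answering, you MUST call the get_comparison_city tool to obtain the "
    "comparison city. Then include the city's name, founding year, a short history, "
    "and a comparison to the city returned by the tool."
)
```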

Looking forward to more in-depth video tutorials on Pydantic AI. There aren’t many out there beyond ‘Hello World’ or entry-level ones.


Glad it’s working better - you bet! Yes, next step would be working on the system prompt.

Glad you’re looking forward to more Pydantic AI tutorials! I’ve certainly got more coming down the pipeline that aren’t just hello world examples :slight_smile: