I have tried to extend the Pydantic AI Agent class so that the extended agent can generate and register a required tool dynamically. LLMs can generate code, and Python can compile generated code and register it in the running process, so I thought extending the default Agent class would be a good experiment. Here is the code:
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext
from playwright.sync_api import sync_playwright
import ast
import os
import asyncio
from pydantic_ai import Tool
from app.agents.tools import TOOLS
import logfire
import json
import logging
logfire.configure()
# Example: `function_arguments` = "def add_tool(ctx: RunContext[str], a: int, b: int) -> int", `function_code` = "return a + b", `function_docstring` = "This tool adds two numbers a and b. Args: a: int, b: int. Returns: int".
class Genie(Agent):
    """Agent that can generate and register new tools at runtime."""

    def __init__(self, model, name, system_prompt, deps_type, result_type, tools):
        # Keep any caller-supplied tools alongside the two built-in ones.
        tools = [self.generate_tool, self.load_page, *tools]
        system_prompt = """
        You are a powerful assistant. You have access to some tools.
        If you encounter a task you cannot perform with your existing tools, you MUST use the 'generate_tool' tool to create a new tool.
        After generating the tool, you MUST try to use the newly generated tool to perform the task.
        If the user asks you to create a tool, you MUST use the 'generate_tool' tool. Please call the tool with the correct number of arguments.
        """ + system_prompt
        super().__init__(model=model, name=name, system_prompt=system_prompt,
                         deps_type=deps_type, result_type=result_type, tools=tools)

    async def run(self, query):
        response = await super().run(query)
        return response

    def generate_tool(self, ctx: RunContext[str], function_arguments: str, function_code: str, function_docstring: str) -> str:
        """
        This tool dynamically generates Python code for a tool from the provided `function_arguments`, `function_code` and `function_docstring`.
        Sometimes you may want to generate a tool dynamically based on the user's input. This tool does exactly that.
        Every argument is a string, and the return value should be a string as well.
        Please add "ctx: RunContext[str]" as the first argument in the `function_arguments` to access the context in which the tool operates.

        Examples:
        If a tool is required to add two numbers, the params will be like,
        `function_arguments` = "def add_tool(ctx: RunContext[str], a: int, b: int) -> int", `function_code` = "return a + b", `function_docstring` = "This tool adds two numbers a and b. Args: a: int, b: int. Returns: int".
        If a tool is required to subtract b from a, the params will be like,
        `function_arguments` = "def subtract_tool(ctx: RunContext[str], a: int, b: int) -> int", `function_code` = "return a - b", `function_docstring` = "This tool subtracts b from a. Args: a: int, b: int. Returns: int".
        Note: These are just examples, you can create any tool you want.

        Args:
            ctx: RunContext[str], The context in which the tool operates.
            function_arguments: str, The signature of the tool.
            function_code: str, The code of the tool.
            function_docstring: str, The detailed docstring of the tool that describes the tool, arguments and return type.

        Returns:
            str: A message saying whether the tool was generated and registered.
        """
        # Tolerate a trailing colon in the signature, and indent every line of
        # the body so multi-line code stays inside the function.
        signature = function_arguments.strip().rstrip(":")
        body = "\n".join("    " + line for line in function_code.splitlines())
        generated_code = f"""
{signature}:
    \"\"\"
    {function_docstring}
    \"\"\"
{body}"""
        print(generated_code)

        # Report invalid code back to the model instead of crashing the run.
        try:
            tree = ast.parse(generated_code)
        except SyntaxError as exc:
            return f"Generated code is not valid Python: {exc}"
        function_def = next((node for node in ast.walk(tree)
                             if isinstance(node, ast.FunctionDef)), None)
        if function_def is None:
            return "No function definition found in the generated code."
        function_name = function_def.name

        # Execute the definition in its own namespace and pick up the function.
        exec_globals = {"RunContext": RunContext}
        exec(generated_code, exec_globals)
        generated_function = exec_globals.get(function_name)
        # Note: `_register_tool` is a private Agent method and may differ between versions.
        self._register_tool(Tool(generated_function))
        return f"Tool '{function_name}' successfully generated and registered."

    def load_page(self, ctx: RunContext[str], url: str) -> str:
        """
        Fetches the web page given by the `url` and returns the contents.
        """
        logfire.info(f"Loading page: {url}")
        content = ""
        with sync_playwright() as p:
            # Launch the browser in headless mode
            browser = p.chromium.launch(headless=True)
            page = browser.new_page()
            # Navigate to the URL
            page.goto(url)
            # Wait for the page to fully load
            page.wait_for_load_state("networkidle")
            # Get the rendered HTML
            content = page.content()
            browser.close()
        return content
model = os.getenv("LLM_MODEL")
agent = Genie(
    model,
    name="genie",
    deps_type=str,
    result_type=str,
    system_prompt="",
    tools=[]
)


async def main():
    query = "Can you add 2 and 3?"
    result = await agent.run(query)
    # Read the final answer from the public attribute rather than `_all_messages`.
    print(result.data)


if __name__ == "__main__":
    asyncio.run(main())
I am able to query this agent to add and/or subtract numbers. It generates the required tool and uses it on the fly. But there are still some problems to address.
First, it only works for one operation. Second, it sometimes raises an error that I cannot figure out how to debug. I have tried the Python debugger, but because the failing input comes from the LLM's response, the error is hard to reproduce.
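One thing that may help narrow down the intermittent error is to run the exec step on its own and capture the traceback as text instead of letting it propagate. This is only a sketch: `try_exec` is a hypothetical helper name, not part of Pydantic AI, and it assumes the failure happens while compiling or executing the generated source.

```python
import traceback
from typing import Optional, Tuple


def try_exec(source: str) -> Tuple[Optional[dict], Optional[str]]:
    """Execute `source` in a fresh namespace.

    Returns (namespace, None) on success or (None, traceback_text) on failure.
    `try_exec` is a hypothetical helper used only for illustration.
    """
    namespace: dict = {}
    try:
        # compile() raises SyntaxError for malformed source; exec() raises
        # whatever the top-level statements raise.
        exec(compile(source, "<generated>", "exec"), namespace)
    except Exception:
        return None, traceback.format_exc()
    return namespace, None


# A malformed signature, similar to what an LLM might occasionally produce.
bad_ns, bad_err = try_exec("def add_tool(a: int, b: int\n    return a + b")

# A well-formed definition.
good_ns, good_err = try_exec("def add_tool(a: int, b: int) -> int:\n    return a + b")

print(bad_err)
print(good_ns["add_tool"](2, 3))  # prints 5
```

Logging the returned traceback together with the exact source string would make a failing LLM response reproducible outside the agent.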
If anyone is interested in working along these lines, please try this code and fix it if you can, and share your solution with me as well.
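As a starting point, the code-generation step can be exercised without an LLM or Pydantic AI installed. Below is a minimal sketch using only the standard library. `build_function` is a hypothetical helper name, and it deliberately leaves out the `ctx` argument and the tool registration step, so it only shows how the signature, docstring and body strings become a callable.

```python
import ast
import textwrap


def build_function(signature: str, body: str, docstring: str = ""):
    """Assemble a function from source strings and return the callable.

    `build_function` is a hypothetical helper used only for illustration.
    """
    lines = [f"{signature.strip().rstrip(':')}:"]
    if docstring:
        # repr() yields a safely quoted, single-line string literal.
        lines.append("    " + repr(docstring))
    # Indent every line of the body, not just the first one.
    lines.append(textwrap.indent(textwrap.dedent(body), "    "))
    source = "\n".join(lines)

    tree = ast.parse(source)  # raises SyntaxError on malformed input
    func_def = next(
        (node for node in tree.body if isinstance(node, ast.FunctionDef)), None
    )
    if func_def is None:
        raise ValueError("source does not define a function")

    namespace: dict = {}
    exec(compile(tree, "<generated>", "exec"), namespace)
    return namespace[func_def.name]


add = build_function(
    "def add(a: int, b: int) -> int", "return a + b", "Add two numbers."
)
subtract = build_function(
    "def subtract(a: int, b: int) -> int:", "result = a - b\nreturn result"
)

print(add(2, 3))       # prints 5
print(subtract(5, 3))  # prints 2
```

Building two functions in a row this way is a quick check that the generation step itself handles more than one operation, which would point the "only one operation" problem at the agent or registration side instead.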