OpenAI Agent SDK and Tool Calling

In the last post we discussed the OpenAI Agent SDK and built some basic programs with it. We also showed how to use a simple tool to solve a problem. In this post we will extend the concept of tool calling and also use an agent as a tool.

Recap: What is a Tool?

A tool is a custom function that allows an agent to incorporate external systems and data into its workflow. This enables the agent to understand context better and respond more effectively. Let’s take some examples where tools may help automate agentic workflows.

  • Agents can check warehouse supplies using external APIs and replenish supplies
  • For truck path planning, trucks can be rerouted based on live traffic information
  • In Healthcare, automate patient data management and appointment scheduling

We can think of various other examples for using a tool. However, at this time we will shift our focus to OpenAI tool calling.

Tools in OpenAI SDK

OpenAI supports several different types of tool calling. Let’s see what the different options are.

Hosted Tools

OpenAI supports a few built-in tools that are hosted by OpenAI. These tools can be invoked directly from the API. As of writing, the following tools are supported:

  • WebSearchTool
    • This helps the agent to search the web and returns relevant results.
  • FileSearchTool
    • OpenAI provides a managed service called vector stores where you can keep files and documents. OpenAI automatically indexes the uploaded documents so that agents can retrieve them (think RAG). FileSearchTool helps in searching through this file store.
  • CodeInterpreterTool
    • It helps in running code in a sandboxed environment.
  • HostedMCPTool
    • This exposes a remote MCP server so that its tools can be used by the agent.
  • ImageGenerationTool
    • As the name suggests, it can generate images from a prompt.
  • ToolSearchTool
    • This tool supports loading other tools at runtime, so that tools are only loaded on demand. This keeps the model context small, because definitions for tools that are not needed never occupy it.

Since one of the primary goals of this series is to work with locally hosted models, I will skip the hosted tools for now. Each of these tools can be built and deployed locally, and we will cover that in later posts. For now we will discuss the other options, where we host and use local tools and models.

Function Tools

The OpenAI SDK supports running any Python function as a tool. We built one such tool in the previous blog.

@staticmethod
@function_tool
def get_chinese_personality_traits(birth_year: int):
  """
  Get the Chinese Zodiac personality traits based on the birth year.
  
  :param int birth_year: Birth Year
  """
  print("Fetching Chinese Zodiac personality traits for year:", birth_year)
  elements = ['Wood', 'Fire', 'Earth', 'Metal', 'Water']
  element = elements[(birth_year - 4) % 10 // 2]
  #... and so on...

We define this as a function tool by decorating it with @function_tool. The description of the tool is taken from the docstring, so writing a proper docstring is very important. The tool's parameters are extracted from the method signature using the inspect module.
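
To make the role of the docstring and the signature concrete, here is a rough standard-library sketch of how a tool description could be derived from a plain function. This is not the SDK's actual implementation. The names describe_tool and get_element are hypothetical and exist only for this illustration.

```python
import inspect


def get_element(birth_year: int) -> str:
    """Get the Chinese Zodiac element for a birth year."""
    elements = ['Wood', 'Fire', 'Earth', 'Metal', 'Water']
    # Years ending in 4 or 5 map to Wood, 6 or 7 to Fire, and so on.
    return elements[(birth_year - 4) % 10 // 2]


def describe_tool(func) -> dict:
    """Hypothetical helper: build a tool description from a function."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        # The docstring becomes the description the model sees.
        "description": inspect.getdoc(func) or "",
        # Parameter names and type names come from the signature.
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


spec = describe_tool(get_element)
print(spec)
```

A function with no docstring would yield an empty description here, which is why a missing docstring leaves the model guessing about when to call the tool.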

In addition to returning strings, function tools can also return the following as output:

  • ToolOutputImage
  • ToolOutputFileContent

We will see the full usage for these tools later in this blog when we provide an example usage.

Agent as Tool

We can also use an agent as a tool. This is helpful when a central agent orchestrates work across other agents. In this case we provide a description that states the purpose of the tool. The Agent.as_tool() call is a convenient way to do this. If more control is desired, we can instead run the agent inside a function_tool and add that to the workflow.

Let’s now build a very simple example to see how these work.

Example Implementation

I was recently working on processing resume samples that I had downloaded from Kaggle, so I will continue to use them. I will also reuse one of the services that I had built to return the resumes.

The resumes are structured in a directory that looks like this:

ACCOUNTANT
|_ 10554236.md
|_ 10674770.md
ADVOCATE
|_ 10186968.md
|_ 10344379.md
:::: and so on ::::
TEACHER
|_ 96547039.md
|_ 99244405.md

The original resumes were PDFs. They were converted to Markdown only for easier processing.

The service offers the following features:

  • Can return a list of all subjects (as in ACCOUNTANT, ADVOCATE etc.)
  • Given a subject, can return all the candidates belonging to that directory
  • Given a candidate, return the resume for the candidate

We will treat this service as already available and not go into any further details.
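
For readers who want a mental model of what such a service does, the three features map directly onto the directory layout shown above. The following is a minimal sketch and not the actual service. The function names are hypothetical and the root path is whatever directory holds the category folders.

```python
from pathlib import Path


def list_subjects(root: Path) -> list[str]:
    """Return the category directory names, e.g. ACCOUNTANT, ADVOCATE."""
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def list_candidates(root: Path, subject: str) -> list[str]:
    """Return candidate IDs (file names without .md) for one subject."""
    return sorted(p.stem for p in (root / subject).glob("*.md"))


def read_resume(root: Path, subject: str, candidate_id: str) -> str:
    """Return the Markdown text of a single resume."""
    return (root / subject / f"{candidate_id}.md").read_text(encoding="utf-8")
```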

To demonstrate function tools and agents as tools, the end result we want is this: get a list of all subjects using a function tool; then, given a business requirement, work out which subject in the list best fits it, get all candidates for that subject, and print 5 random candidates. Again, this is a toy example, and not how you would process resumes in a real environment.

We will start with the base models that we will use in this project.

Base Models

class CategoryList(BaseModel):
    categories: list[str]

class CategorySelectionInput(BaseModel):
    requirement: str
    categories: list[str]

class RequirementCategory(BaseModel):
    primary_category: str
    alternate_category: str | None

These all inherit from pydantic's BaseModel. We will use these structured types as the requests and responses in our workflow.

To cater to our requirements, we will need the following tools,

  • Get all subjects
  • Identify which subject/category best fits a given requirement
  • Get all candidates that submitted a resume under that category

Getting the subjects or the candidate list requires querying data in an internal system, so we will build Python-based function tools for those. On the other hand, given a requirement and the list of all available subjects, a model can reason about which subject is most appropriate for the requirement, so that step will be an agent. Let’s start building those next.

Function Tools

@staticmethod
@function_tool
def get_categories() -> CategoryList:
  """
  A function tool to get the list of available categories from the resume service. The output is a 
  CategoryList object containing the list of available categories.
  """
  # Get all available categories from the resume service
  all_categories = requests.get(ResumeProcessor.category_ep).json().get('subjects', [])
  print(f"Available categories: {all_categories}")
  return CategoryList(categories=all_categories)

@staticmethod
@function_tool
def get_candidates_for_category(category: str) -> list[str]:
  """
  A function tool to get the list of candidate IDs for the given category from the resume service. The input is the 
  category and the output is a list of candidate IDs for that category.
  
  :param category: Category to get a list of candidates
  """
  candidates = requests.get(ResumeProcessor.candidates_ep.format(subject=category)).json().get('candidates', [])
  print(f"Total candidates found for category {category}: {len(candidates)}")
  return candidates

Not much is going on in either of these functions. The only thing of interest is the @function_tool decorator. Each method has a docstring that describes its use.
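
The endpoint constants category_ep and candidates_ep are not shown above. Judging from the service log later in this post, the paths are /api/get-subjects and /api/get-candidates/<subject>. The sketch below shows how they might be defined. The host and port are assumptions, and candidates_url is a hypothetical helper that also escapes the category before placing it in the path.

```python
from urllib.parse import quote

# Assumed base URL; the post does not state the real host or port.
BASE_URL = "http://localhost:8000"

# Paths taken from the service log shown later in this post.
category_ep = f"{BASE_URL}/api/get-subjects"
candidates_ep = BASE_URL + "/api/get-candidates/{subject}"


def candidates_url(category: str) -> str:
    """Build the candidates URL, escaping characters unsafe in a path."""
    return candidates_ep.format(subject=quote(category, safe=""))


print(candidates_url("INFORMATION-TECHNOLOGY"))
```

Escaping matters because the category string comes from a model, so it may contain spaces or slashes that would otherwise change the request path.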

Agent as tool

Now let’s look at the agent that will be used as a tool. Defining the agent is pretty much the same as we had done before.

# Categorizer Agent
client = AsyncOpenAI(base_url=ResumeProcessor.OLLAMA_EP, api_key=ResumeProcessor.OLLAMA_KEY)
model = OpenAIChatCompletionsModel(model=ResumeProcessor.OLLAMA_MODEL, openai_client=client)
self.categorizer_agent = Agent(
  model=model,
  name="CategorizerAgent",
  instructions="You are a helping agent. Identify one appropriate category for the requirement."
)

The capitalized values are pre-defined constants. What they mean should be fairly evident. Again, since we want to run the model locally, we are relying on Ollama. We are using the OpenAI-compatible endpoint exposed by Ollama.
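
As a point of reference, the constants might look like the following for a default local Ollama install. The values are assumptions and not the ones used in this project. In particular, the model name is a placeholder for whichever tool-capable model you have pulled.

```python
# Ollama serves its OpenAI-compatible API under /v1 on its default port.
OLLAMA_EP = "http://localhost:11434/v1"
# The OpenAI client requires a key; Ollama does not check it, so any
# placeholder string works.
OLLAMA_KEY = "ollama"
# Hypothetical model name; substitute a model you have pulled locally.
OLLAMA_MODEL = "llama3.1"
```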

Having defined the agent, the next step is to write a method that converts the agent's output into one of our pydantic models. This is needed so that we can pass the result on to the next step. For now let’s just create the method; we will talk about how it is used once we wire it in.

@staticmethod
async def extractor(cat_result: RunResult) -> RequirementCategory:
  """
  Extract the primary category and alternate category from the result returned by the categorizer agent. 
  The input is the result returned by the categorizer agent and the output is a RequirementCategory object containing the 
  primary category and alternate category.
  """
  print(f"Result from categorizer agent: {cat_result.final_output}")
  for item in reversed(cat_result.new_items):
    if isinstance(item, ToolCallOutputItem) and item.output.strip().startswith("{"):
      try:
        output_data = json.loads(item.output.strip())
        primary_category = output_data.get("primary_category", "")
        alternate_category = output_data.get("alternate_category", "")
        print(f"Extracted primary category: {primary_category}, alternate category: {alternate_category}")
        return RequirementCategory(primary_category=primary_category, alternate_category=alternate_category)
      except json.JSONDecodeError:
        continue
  # Fall back to an empty result when no tool output contained valid JSON
  return RequirementCategory(primary_category="", alternate_category="")

This method accepts a RunResult as input and walks its new_items in reverse order, looking for the most recent tool call output that looks like a JSON object. When it finds one that parses, it reads the “primary_category” and “alternate_category” values and returns them as a RequirementCategory. If nothing parses, it returns an empty RequirementCategory.
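
The scanning logic is easier to see without the SDK types. The following standalone sketch applies the same idea to a plain list of strings. The function last_json_output and the sample outputs are made up for illustration.

```python
import json
from typing import Optional


def last_json_output(outputs: list[str]) -> Optional[dict]:
    """Return the most recent output that parses as a JSON object."""
    for text in reversed(outputs):
        candidate = text.strip()
        if not candidate.startswith("{"):
            continue  # Skip plain-text outputs
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue  # Looks like JSON but is malformed; keep searching
    return None


outputs = [
    '{"primary_category": "ENGINEERING", "alternate_category": null}',
    "Available categories: ...",
    '{"primary_category": "INFORMATION-TECHNOLOGY"',  # truncated, invalid
]
print(last_json_output(outputs))
```

Here the newest entry is malformed and the middle one is plain text, so the search falls back to the oldest entry. Note that the fallback return sits outside the loop, so it only runs after every item has been tried.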

Final Agent

Now that we have the groundwork done, we will write an instruction for the model that articulates what is expected of it, what tools it has access to, what each tool does, and so on.

For my model, the instruction looked like this.

You are an experienced HR professional who is assigned the task of finding candidates for the
requirement provided below. You have access to the following tools:

  * get_categories: This tool returns the list of available categories from the resume service.
You can use this tool to get the list of categories. This category can be used to find candidates
for the requirement.

  * categorizer_agent: This tool should be run after running get_categories. Category selection 
returned by this tool should always be one of the categories returned by the get_categories tool. 
This is an agent that can be used to find the most appropriate category for 
the given requirement. You can use this agent to find the primary category and alternate category 
for the requirement. The primary category is the most appropriate category for the requirement and 
the alternate category is the second most appropriate category for the requirement. You can use the 
get_categories tool to get the list of available categories and then use the categorizer_agent to 
find the primary and alternate categories for the requirement.

  * get_candidates_for_category: This tool takes a category as input and returns the list of candidate IDs 
for that category from the resume service. You can use this tool to get the list of candidate IDs for the 
primary category. For now, you can ignore the alternate category and only focus on the primary category to 
find candidates for the requirement.

Use the above tools to find the candidates for the given requirement. Here are the steps you should follow to find 
the candidates for the requirement:

  1. You should first use the get_categories tool to get the list of available categories 
  2. Use the categorizer_agent to find the primary category for the requirement
  3. You should use the get_candidates_for_category tool to get the list of candidate IDs for the primary category 
  4. Return a list of random 5 candidate IDs obtained from the previous step.

Most of this was generated by a code assistant, so it may look canned. You can of course come up with your own set of instructions.

Now the only thing remaining is to create the agent that uses the instruction provided above.
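
One caveat on step 4: a language model is not a dependable source of randomness, so the "random" 5 it returns may not be uniformly sampled. This post leaves the selection to the model. If that matters in your setup, an alternative is to do the sampling in code, for example inside a function tool. A minimal sketch, with pick_random_candidates as a hypothetical helper:

```python
import random


def pick_random_candidates(candidates: list[str], k: int = 5) -> list[str]:
    """Return up to k distinct candidate IDs chosen at random."""
    # min() guards against categories with fewer than k candidates,
    # where random.sample would otherwise raise ValueError.
    return random.sample(candidates, min(k, len(candidates)))
```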

client = AsyncOpenAI(base_url=ResumeProcessor.OLLAMA_EP, api_key=ResumeProcessor.OLLAMA_KEY)
model = OpenAIChatCompletionsModel(model=ResumeProcessor.OLLAMA_MODEL, openai_client=client)
agent = Agent(
  name="Resume Processor Agent",
  instructions=<instructions>,
  model=model,
  tools=[
    ResumeProcessor.get_categories,
    self.categorizer_agent.as_tool(
      tool_name="categorizer_agent",
      tool_description=<categorizer_instructions>,
      parameters=CategorySelectionInput,
      include_input_schema=True,
      custom_output_extractor=ResumeProcessor.extractor,
      on_stream=self.handle_stream
    ),
    ResumeProcessor.get_candidates_for_category,
    ResumeProcessor.get_resume
  ]
)
response = await Runner.run(agent, requirement)
print(f"Final response from agent: {response.final_output}")
return True

There are a few things of interest here. We have defined a custom_output_extractor. It is invoked with the nested agent's final run result, and whatever it returns becomes the output of the tool. We have also passed an on_stream handler, which gives a view inside the executing tool: when the tool runs, what it outputs, and so on. The rest of the code should be self-explanatory by now.

The agent runs through the instructions and uses the tools in the correct order to arrive at the answer. A sample run is shown below. The requirement that I provided to the agent was as follows:
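
The handle_stream method is not shown in this post. Below is a sketch of what such a handler could look like, assuming the callback receives a mapping with an "agent" entry and an "event" entry. That is my reading of the SDK documentation, so verify it against your installed version. The payload here is a stand-in built with SimpleNamespace so that the sketch runs without the SDK, and the printed format is illustrative rather than the exact line seen in the sample output.

```python
import asyncio
from types import SimpleNamespace


def format_stream_event(payload: dict) -> str:
    """Format one streamed event as '[stream] <agent name> :: <event type>'."""
    return f"[stream] {payload['agent'].name} :: {payload['event'].type}"


async def handle_stream(payload: dict) -> None:
    """Print a one-line trace for each event streamed by the nested agent."""
    print(format_stream_event(payload))


# Stand-in payload so the sketch runs without the SDK installed.
fake_payload = {
    "agent": SimpleNamespace(name="CategorizerAgent"),
    "event": SimpleNamespace(type="agent_updated_stream_event"),
}
asyncio.run(handle_stream(fake_payload))
```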

Find me candidates for a junior developer role with at least 3 years of experience in Python and machine learning.

% uv run resumeprocessor.py
Available categories: ['SALES', 'ARTS', 'ENGINEERING', 'AUTOMOBILE', 'AVIATION', 'ACCOUNTANT', 'TEACHER', 'INFORMATION-TECHNOLOGY', 'DESIGNER', 'BUSINESS-DEVELOPMENT', 'PUBLIC-RELATIONS', 'ADVOCATE', 'CONSTRUCTION', 'AGRICULTURE', 'HEALTHCARE', 'HR', 'BANKING', 'FITNESS', 'CHEF', 'BPO', 'APPAREL', 'DIGITAL-MEDIA', 'FINANCE', 'CONSULTANT']

[stream] CategorizerAgent updated

Result from categorizer agent: INFORMATION-TECHNOLOGY

Total candidates found for category INFORMATION-TECHNOLOGY: 120

Final response from agent: Based on the requirement for a junior developer role with at least 3 years of experience in Python and machine learning, I've identified the most appropriate category as "INFORMATION-TECHNOLOGY".

Here are 5 random candidate IDs from that category:

```
['18301617', '26768723', '31111279', '12763627', '24913648']
```

We can also verify from the running API that appropriate calls were made to get data from our resume service. Let’s check the output from that.

INFO:     ::1:51635 - "GET /api/get-subjects HTTP/1.1" 200 OK
INFO:     ::1:51652 - "GET /api/get-candidates/INFORMATION-TECHNOLOGY HTTP/1.1" 200 OK

The first API call above returned the list of all categories. The second was made to get all candidate IDs for the identified category.

Conclusion

That’s it for today. In the next blog we will talk about MCP servers and how agents can talk to an MCP server. Ciao for now!
