Part 2 - Exploring OpenAI Agent SDK with Local Models (Ollama)
Exploring multi-agent handoffs

This is Part 2 of this exploration series, covering the nuances of multi-agent handoffs using the OpenAI Agent SDK. Please refer to Part 1 and Part 3 below.
Recap
Welcome back! If you don't get why I'm welcoming you back, you should go and read Part 1 of this series.
So, continuing our journey, it's now time to bring more agents into the mix and see how the SDK facilitates multi-agent coordination.
Multi Agents
Creating more than one agent is as simple as creating multiple instances of the Agent class. Let's see some examples.
1️⃣ As you may remember from Part 1, we learnt how to create a ModelProvider instance that returns an OpenAIChatCompletionsModel backed by Ollama. I simply moved this piece into a submodule so that I can reuse it for all of my further explorations.
2️⃣ We now create our first agent, named Accounts Agent.
3️⃣ We fill in the handoff_description, which will be used by some other higher-level agent (we will see this later) to decide whether or not to hand off control to this agent.
4️⃣ Then, we provide some system instructions via instructions.
5️⃣ Finally, we set the model to OllamaProviderAsync().get_model() to force the agent to use our local model. A sketch of this setup follows below.
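For reference, here is a minimal sketch of what this setup might look like. OllamaProviderAsync is the author's helper submodule from Part 1, so its exact contents may differ; the base URL, API key, model name (qwen3), and the agent's description/instructions below are my assumptions for a local Ollama install, not the author's exact code.

# Hypothetical reconstruction of the Part 1 helper; the real submodule may differ.
from agents import Agent, OpenAIChatCompletionsModel
from openai import AsyncOpenAI

class OllamaProviderAsync:
    """Wraps Ollama's OpenAI-compatible endpoint behind an async client."""

    def __init__(self, model_name: str = "qwen3"):  # assumed default model
        self._client = AsyncOpenAI(
            base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
            api_key="ollama",                      # any non-empty string works locally
        )
        self._model_name = model_name

    def get_model(self) -> OpenAIChatCompletionsModel:
        return OpenAIChatCompletionsModel(
            model=self._model_name, openai_client=self._client
        )

# 2️⃣–5️⃣ The first agent, backed entirely by the local model
accounts_agent = Agent(
    name="Accounts Agent",
    handoff_description="Handles queries about bank accounts and balances.",  # assumed wording
    instructions="You answer questions related to the user's bank account.",  # assumed wording
    model=OllamaProviderAsync().get_model(),
)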
Please note that even though the OpenAI Agent SDK does provide the sync version of the run() method, i.e. Runner.run_sync(), unfortunately it is not available for non-OpenAI models. This is because OpenAIChatCompletionsModel holds only the AsyncOpenAI client instead of the OpenAI client (which is the sync version), as evident here.
Great! Now let's create a few other agents and see how they can communicate with each other.
1️⃣ We create a credit_card_agent with its relevant handoff_description and instructions.
2️⃣ We then create a wire_transfer_agent with its relevant handoff_description and instructions.
3️⃣ For the wire_transfer_agent, we also provide access to the get_wire_transfer_status tool,
4️⃣ which is nothing but a function that returns a dummy status, decorated with the function_tool decorator. The function name and docstring are sufficient for the agent to select this tool when warranted.
5️⃣ Finally, we create the main_agent, a.k.a. the routing agent, called Operator Agent, and set its handoffs to the list of allowed sub-agents. A sketch of these agents follows below.
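Here is a minimal sketch of these agents, reusing the OllamaProviderAsync helper and accounts_agent from the earlier sketch; the descriptions, instructions, and dummy status string are illustrative assumptions rather than the author's exact wording.

from agents import Agent, function_tool

# 4️⃣ A dummy tool; its name and docstring are what the LLM uses to pick it.
@function_tool
def get_wire_transfer_status(transfer_id: str) -> str:
    """Return the status of the wire transfer with the given id."""
    return f"Wire transfer {transfer_id} is IN PROGRESS."  # dummy status

# 1️⃣ Credit card agent
credit_card_agent = Agent(
    name="Credit Card Agent",
    handoff_description="Handles queries about credit cards and billing.",
    instructions="You answer questions related to the user's credit card.",
    model=OllamaProviderAsync().get_model(),
)

# 2️⃣ / 3️⃣ Wire transfer agent with access to the status tool
wire_transfer_agent = Agent(
    name="Wire Transfer Agent",
    handoff_description="Handles queries about wire transfers.",
    instructions="You answer questions about wire transfers.",
    tools=[get_wire_transfer_status],
    model=OllamaProviderAsync().get_model(),
)

# 5️⃣ The routing agent, with the allowed sub-agents listed as handoffs
main_agent = Agent(
    name="Operator Agent",
    instructions="Route the user's banking query to the most appropriate agent.",
    handoffs=[accounts_agent, credit_card_agent, wire_transfer_agent],
    model=OllamaProviderAsync().get_model(),
)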
Here is a graphical view of the orchestration we made so far:
Substack's code highlighting is pathetic. If you like navigating the code in a more intuitive way, please consider visiting my website blog instead.
Then, we simply feed the main_agent with the user's input to see the routing happen. At least that's what the OpenAI documentation claims. But if you run the code below…
import asyncio
from agents import Runner, RunResult

async def main():
    result: RunResult = await Runner.run(main_agent, "What is the status of my wire transfer with id 1234")
    print(result.final_output)

asyncio.run(main())

…you will end up with…
{"type":"function\",
\"name\":\"transfer_to_wire_transfer_agent\",
\"parameters\":{}}Well, What happened ??
Well, OpenAI's documentation is not that clear. It took me ~30 minutes to figure out that handoffs doesn't automatically hand control off to the right agent. It simply sets the probable next agent in the result.last_agent property.
In other words, the OpenAI Agent SDK simply sets the next agent to be called according to the main_agent's LLM output, i.e. main_agent has determined which agent control should be handed off to; it has not actually been handed off yet. To do that, you need to call Runner.run() on that next agent yourself. So the modified code would be as follows (see the sketch after the annotations):
1️⃣ We simply inspect the value of last_agent from the result, and if it is one of the agents in the available handoffs list,
2️⃣ we call run() on that last_agent and pass the user_input again.
3️⃣ Finally, we print the final_output of sub_result.
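A minimal sketch of that manual routing, assuming the agents defined in the earlier sketches; the variable names user_input and sub_result follow the annotations above, and the membership check against main_agent.handoffs is one plausible way to validate the selected agent.

import asyncio
from agents import Runner

async def main():
    user_input = "What is the status of my wire transfer with id 1234"
    result = await Runner.run(main_agent, user_input)

    # 1️⃣ The operator only *selects* the next agent; it lives in result.last_agent
    if result.last_agent in main_agent.handoffs:
        # 2️⃣ Run the selected sub-agent ourselves with the same user input
        sub_result = await Runner.run(result.last_agent, user_input)
        # 3️⃣ The sub-agent's answer is the actual final output
        print(sub_result.final_output)
    else:
        print(result.final_output)

asyncio.run(main())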
Here is the final complete example with everything we have learnt so far.
Tool response as final response
Now, consider a case where you don't need the LLM to use the tool response (get_wire_transfer_status in the case of wire_transfer_agent) to prepare the final answer, but instead just want to return the tool response as the final answer. Well, the Agent SDK has made that very simple!
1️⃣ Simply set tool_use_behavior="stop_on_first_tool" on the respective agent, as in the sketch below.
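For example, the wire_transfer_agent from the earlier sketch could be redefined like this (again a sketch, reusing the assumed tool and model helper from above):

wire_transfer_agent = Agent(
    name="Wire Transfer Agent",
    handoff_description="Handles queries about wire transfers.",
    instructions="You answer questions about wire transfers.",
    tools=[get_wire_transfer_status],
    tool_use_behavior="stop_on_first_tool",  # the first tool result becomes final_output
    model=OllamaProviderAsync().get_model(),
)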
This would change the output as follows:
Routing back to main_agent
Now, what if the query doesn't pertain to any sub-agent and you want the main agent to answer it? Well, in my testing this works only with thinking models. With the qwen3 model, for the example below:
For an input like Who is the president of USA?, I'm seeing:
But if I use regular (non-thinking) LLMs like llama3.2, I'm seeing:
As you can see, the agent is making up a tool to look up this fact, even though no such tool is listed in the main_agent.¹
Wrapping up
Great! So in this post, I covered how to use the OpenAI Agent SDK's handoffs to orchestrate multi-agent collaboration. By and large the SDK is very intuitive and works as advertised, with some teething issues that I covered. Given that OpenAI not only pioneered bringing ChatGPT to the masses but was also the first to standardize API inferencing, which pushed almost every other model provider to adopt the OpenAI API spec, I hope they will do everything in their power to make this SDK more robust.
Coming back to my exploration, I still consider my feet only wet, with deeper dives coming in upcoming posts. I would say stay tuned, but we are not in the analog world anymore, so I'm gonna say: stay subscribed, and catch you again in my Part 3 post, very soon!
¹ I have logged this issue in OpenAI's GitHub.












