Python: Fix approval routing for sub-agent tools created via .as_tool()#5300
Python: Fix approval routing for sub-agent tools created via .as_tool()#5300giles17 wants to merge 4 commits intomicrosoft:mainfrom
.as_tool()#5300Conversation
When a sub-agent tool's inner function requires approval, the approval request propagates correctly to the outer agent. However, the approval response was silently dropped because _auto_invoke_function looked up the inner tool name in the outer agent's tool_map (where it doesn't exist) and assumed it was a hosted tool. Fix: Tag propagated approval requests with parent tool metadata (_parent_tool_name, _parent_tool_args) in additional_properties. When processing an approval response for an unknown tool, check for this metadata and re-route the invocation through the parent sub-agent tool. The _agent_wrapper forwards the approval messages to the inner agent's run() with proper conversation context so the inner function invocation loop can execute the approved tool. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Python Test Coverage Report •
Python Unit Test Overview
|
|||||||||||||||||||||||||||||||||||
giles17
left a comment
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 85%
✗ Correctness
The core routing logic is sound and the approach fits the framework's patterns well. However, there is a correctness bug: at line 1374 of
_tools.py, the re-createdfunction_call_contentusesfunction_call_content.call_idwhich isNonefor approval response Content objects (becauseto_function_approval_response()/from_function_approval_response()don't setcall_id). The correct outer call ID is stored infunction_call_content.id. This means the function result produced by the sub-agent re-invocation will havecall_id=None, causing a mismatch with the outer assistant's original function_call. Most LLM providers (OpenAI, Azure OpenAI) require matching tool_call_ids and will reject the request. The test passes only becauseMockBaseChatClientdoesn't validate call_ids.
✓ Security Reliability
The change correctly implements sub-agent approval re-routing: internal metadata (_parent_tool_name, _parent_tool_args) is attached to propagated approval requests, and when the approval response returns, the system re-invokes the parent sub-agent tool with the approval context. The approach is architecturally sound and follows existing framework patterns. The main reliability concern is that the
_approval_messagesmetadata (containing a nested Content object reference) leaks into the function result'sadditional_propertiesvia the existing pass-through at lines 1444/1456, causing unnecessary internal routing data to persist in session messages. Additionally,_parent_tool_nameand_parent_tool_argsare stored in the approval request Content that is returned to the caller; sinceto_function_approval_responsepreservesadditional_properties, these survive and are trusted for routing on the return trip — but there is no validation that the returned_parent_tool_namematches the tool that originally propagated the request, leaving a theoretical tool-redirection path if the user tampers withadditional_properties.
✓ Test Coverage
The two new integration tests cover the core scenarios — approving a sub-agent's inner tool executes it (happy path) and rejecting it does not. The tests use meaningful assertions (boolean flag for tool execution, content checks on the approval request, text in the response). However, the rejection test discards the second run's response without any assertion, missing an opportunity to verify the outer agent handles rejection gracefully. Additionally, there are no tests for several edge cases introduced by the new re-routing logic: multiple simultaneous inner-tool approvals, the middleware pipeline code path (which also received changes), and the propagate_session=False scenario where session context would be lost on re-invocation.
✓ Design Approach
The PR correctly implements the sub-agent approval re-routing by: (1) tagging propagated approval requests with
_parent_tool_name/_parent_tool_argsin_try_execute_function_calls, (2) redirecting unresolvable approval responses through the parent tool in_auto_invoke_function, and (3) forwarding approval context to the inner agent's secondrun()call in_agent_wrapper. The core logic is sound for the basic case. Two issues warrant attention: the tests add ``@pytest.mark.asynciodecorators that the project explicitly avoids (pyproject.toml line 74: `asyncio_mode = "auto"`), and the re-invocation path is silently broken when `propagate_session=False` (the default, `_agents.py:487`) — the inner agent's second run receives only the approval context with no memory of the original user query or any multi-turn history from the first run.
Flagged Issues
- The re-created Content at _tools.py:1374 uses
function_call_content.call_idwhich isNoneforfunction_approval_responseContent (verified at _types.py:1186-1194 —from_function_approval_responsedoesn't passcall_id). The correct value is infunction_call_content.id(set viato_function_approval_responseat _types.py:1238). The resultingcall_id=Noneon the function result will cause LLM API errors in production because it won't match the outer assistant's function_call call_id.
Suggestions
- The test doesn't catch the call_id=None bug because MockBaseChatClient doesn't validate call_ids. Consider adding an assertion that the function result's call_id matches the outer function call's call_id (e.g., assert the tool message content has call_id='outer_call_1').
- The
_approval_messageslist (containing a Content object) is propagated into the function result viaadditional_propertiesat lines 1444 and 1456. This internal routing metadata persists in session messages unnecessarily. Consider stripping_approval_messagesfromadditional_propertiesbefore constructing the function result, or using a separate channel (e.g., a thread-local or context variable) to pass approval messages to the wrapper rather than embedding them in Content.additional_properties. - The
_parent_tool_namestored inadditional_propertiesis trusted without verification on the return trip (line 1364). Ifadditional_propertiesare round-tripped through an untrusted serialization boundary, a crafted approval response could redirect execution to a different registered tool. Consider validating that_parent_tool_namematches the tool that originally raised the UserInputRequiredException (e.g., by also storing a hash or comparing the inner function_call's call_id). - The rejection test (test_chat_agent_as_tool_inner_approval_rejected) discards the response from the second run() call. Capturing it and asserting on the response text (e.g., that it matches the mock's 'I could not get the details.' or is non-empty) would verify the end-to-end rejection flow completes correctly, not just that the inner tool wasn't called.
- Consider adding a test for the case where propagate_session=False (or propagate_session is not passed to as_tool) and the inner agent has a tool requiring approval. If the session is not propagated, the re-invoked inner agent starts fresh without the conversation context of the pending approval, which may produce incorrect behavior. A test documenting the expected behavior (or failure mode) would be valuable.
- The _auto_invoke_function changes and the FunctionInvocationContext metadata plumbing have two code paths: with and without middleware pipeline (lines 1430 and 1460 in _tools.py). Both received the metadata=approval_metadata change but only the no-middleware path is exercised by these tests. Consider adding a test that configures a middleware pipeline on the outer agent to cover the other branch.
- Consider adding a test where the inner agent has multiple tool calls requiring approval simultaneously. The code in invoke_with_termination_handling (line 1648) uses extra_user_input_contents.extend(propagated[1:]) to handle multiple propagated items, and each would need _parent_tool_name/_parent_tool_args metadata — verifying this works end-to-end would be valuable.
- Tests at lines 2656 and 2782 add ``@pytest
.mark.asynciodecorators. The project configures `asyncio_mode = 'auto'` (pyproject.toml:74), so all async test functions run automatically without this decorator. Existing tests in the file omit it; these new tests should too. - With
propagate_session=False(the default at_agents.py:487), the inner agent's secondrun()call receives only the artificially-constructedinput_messages(function call + approval response) and has no memory of the original user query or any prior turns from the first run. For inner agents that build up multi-turn context before hitting an approval gate, this context is silently lost. Consider either (a) documenting thatpropagate_session=Trueis required for this feature to work correctly with stateful inner agents, or (b) emitting a warning whenpropagate_session=Falseand an approval response is being routed through_agent_wrapper.
Automated review by giles17's agents
There was a problem hiding this comment.
Pull request overview
This PR fixes a bug in the Python agent tool invocation flow where function approval responses originating from a sub-agent (created via .as_tool()) were not correctly routed back through the parent sub-agent tool wrapper, causing the approved inner tool to never execute.
Changes:
- Re-route
function_approval_responsehandling to invoke the parent sub-agent tool wrapper when the inner tool name is not present in the outer agent’s tool map. - Thread sub-agent approval context through
FunctionInvocationContext.metadataso the.as_tool()wrapper can forward approval messages into the inner agent run. - Add regression tests covering approve/reject paths for inner tools requiring approval when used through
.as_tool().
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
python/packages/core/agent_framework/_tools.py |
Adds parent-tool tagging for propagated approval requests and re-routing logic during approved response execution. |
python/packages/core/agent_framework/_agents.py |
Updates .as_tool() wrapper to forward approval messages into the inner agent run via invocation metadata. |
python/packages/core/tests/core/test_agents.py |
Adds tests to ensure inner approved tools execute (and rejected tools do not) when invoked via a sub-agent tool. |
- Fix call_id=None bug: store _parent_tool_call_id during propagation and use it when constructing the synthetic function_call Content, falling back to content.id then content.call_id - Strip _approval_messages from additional_properties after extraction so internal routing metadata doesn't persist in session messages - Use setdefault for _parent_tool_name/_parent_tool_args/_parent_tool_call_id to preserve inner routing metadata for nested sub-agent boundaries - Remove unnecessary @pytest.mark.asyncio decorators (asyncio_mode='auto') - Assert rejection test captures response and verifies text output - Assert function_result with correct call_id exists in session history Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t#4963) Add type: ignore comments for optional orjson import in file_history_provider samples to suppress pyright errors in samples-check. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rove tool usage from sub-agents
giles17
left a comment
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 94% | Result: All clear
Reviewed: Correctness, Security Reliability, Test Coverage, Design Approach
Automated review by giles17's agents
eavanvalkenburg
left a comment
There was a problem hiding this comment.
so in general I think we need to solve this, but I wonder if this will solve it in a way that is universal enough, what if you have this two layers deep, will that work? and any time we use additional properties we should also consider that those might be overridden, and especially for this I suspect that at two levels deep they will and then this breaks, so maybe we need something like a parent_call_id that is None by default? and then that builds up nicely from the first one (which doesn;t have that) down to any number of levels...
|
@eavanvalkenburg I'll look into this and try it out. Thanks! |
Motivation and Context
When a sub-agent exposed via
.as_tool()contains an inner tool withapproval_mode="always_require", the approval request correctly propagates to the caller, but sending the approval response back fails silently — the inner tool is never executed and the agent returns without the tool result.Fixes #4963
Description
The root cause is that approval responses from the outer agent's caller reference the inner agent's tool name, which doesn't exist in the outer agent's tool map, so the framework treated it as a hosted tool and dropped it. The fix tags propagated approval requests with the parent sub-agent tool's name and original arguments (
_parent_tool_name,_parent_tool_args), then on re-invocation routes the approval response back through the parent tool wrapper instead of looking up the unknown inner tool directly. The approval context is threaded viaFunctionInvocationContext.metadatainto the_agent_wrapper, which forwards it as input messages to the inner agent so it can complete the approved function call.Contribution Checklist