Skip to content

Python: Fix approval routing for sub-agent tools created via .as_tool()#5300

Draft
giles17 wants to merge 4 commits intomicrosoft:mainfrom
giles17:agent/fix-4963-2
Draft

Python: Fix approval routing for sub-agent tools created via .as_tool()#5300
giles17 wants to merge 4 commits intomicrosoft:mainfrom
giles17:agent/fix-4963-2

Conversation

@giles17
Copy link
Copy Markdown
Contributor

@giles17 giles17 commented Apr 16, 2026

Motivation and Context

When a sub-agent exposed via .as_tool() contains an inner tool with approval_mode="always_require", the approval request correctly propagates to the caller, but sending the approval response back fails silently — the inner tool is never executed and the agent returns without the tool result.

Fixes #4963

Description

The root cause is that approval responses from the outer agent's caller reference the inner agent's tool name, which doesn't exist in the outer agent's tool map, so the framework treated it as a hosted tool and dropped it. The fix tags propagated approval requests with the parent sub-agent tool's name and original arguments (_parent_tool_name, _parent_tool_args), then on re-invocation routes the approval response back through the parent tool wrapper instead of looking up the unknown inner tool directly. The approval context is threaded via FunctionInvocationContext.metadata into the _agent_wrapper, which forwards it as input messages to the inner agent so it can complete the approved function call.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Note: PR autogenerated by giles17's agent

When a sub-agent tool's inner function requires approval, the approval
request propagates correctly to the outer agent. However, the approval
response was silently dropped because _auto_invoke_function looked up
the inner tool name in the outer agent's tool_map (where it doesn't
exist) and assumed it was a hosted tool.

Fix: Tag propagated approval requests with parent tool metadata
(_parent_tool_name, _parent_tool_args) in additional_properties.
When processing an approval response for an unknown tool, check for
this metadata and re-route the invocation through the parent sub-agent
tool. The _agent_wrapper forwards the approval messages to the inner
agent's run() with proper conversation context so the inner function
invocation loop can execute the approved tool.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 16, 2026 05:43
@giles17 giles17 self-assigned this Apr 16, 2026
@moonbox3
Copy link
Copy Markdown
Contributor

moonbox3 commented Apr 16, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   _agents.py4205287%461, 470, 525, 1038, 1083, 1156–1160, 1220, 1248, 1285, 1306, 1326–1327, 1332, 1379, 1421, 1443, 1445, 1458, 1464, 1509, 1511, 1520–1525, 1530, 1532, 1538–1539, 1546, 1548–1549, 1557–1558, 1561–1563, 1573–1578, 1582, 1587, 1589
   _tools.py9618691%190–191, 364, 366, 379, 404–406, 414, 432, 446, 453, 460, 483, 485, 492, 500, 539, 583, 587, 619–621, 623, 629, 674–676, 678, 701, 727, 731, 769–771, 775, 797, 909–915, 951, 963, 965, 967, 970–973, 994, 998, 1002, 1016–1018, 1359, 1501–1507, 1636, 1640, 1692, 1753–1754, 1869, 1889, 1891, 1947, 2010, 2182–2183, 2203, 2259–2260, 2398–2399, 2466, 2471, 2478
TOTAL27693319988% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
5599 20 💤 0 ❌ 0 🔥 1m 29s ⏱️

Copy link
Copy Markdown
Contributor Author

@giles17 giles17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 85%

✗ Correctness

The core routing logic is sound and the approach fits the framework's patterns well. However, there is a correctness bug: at line 1374 of _tools.py, the re-created function_call_content uses function_call_content.call_id which is None for approval response Content objects (because to_function_approval_response() / from_function_approval_response() don't set call_id). The correct outer call ID is stored in function_call_content.id. This means the function result produced by the sub-agent re-invocation will have call_id=None, causing a mismatch with the outer assistant's original function_call. Most LLM providers (OpenAI, Azure OpenAI) require matching tool_call_ids and will reject the request. The test passes only because MockBaseChatClient doesn't validate call_ids.

✓ Security Reliability

The change correctly implements sub-agent approval re-routing: internal metadata (_parent_tool_name, _parent_tool_args) is attached to propagated approval requests, and when the approval response returns, the system re-invokes the parent sub-agent tool with the approval context. The approach is architecturally sound and follows existing framework patterns. The main reliability concern is that the _approval_messages metadata (containing a nested Content object reference) leaks into the function result's additional_properties via the existing pass-through at lines 1444/1456, causing unnecessary internal routing data to persist in session messages. Additionally, _parent_tool_name and _parent_tool_args are stored in the approval request Content that is returned to the caller; since to_function_approval_response preserves additional_properties, these survive and are trusted for routing on the return trip — but there is no validation that the returned _parent_tool_name matches the tool that originally propagated the request, leaving a theoretical tool-redirection path if the user tampers with additional_properties.

✓ Test Coverage

The two new integration tests cover the core scenarios — approving a sub-agent's inner tool executes it (happy path) and rejecting it does not. The tests use meaningful assertions (boolean flag for tool execution, content checks on the approval request, text in the response). However, the rejection test discards the second run's response without any assertion, missing an opportunity to verify the outer agent handles rejection gracefully. Additionally, there are no tests for several edge cases introduced by the new re-routing logic: multiple simultaneous inner-tool approvals, the middleware pipeline code path (which also received changes), and the propagate_session=False scenario where session context would be lost on re-invocation.

✓ Design Approach

The PR correctly implements the sub-agent approval re-routing by: (1) tagging propagated approval requests with _parent_tool_name/_parent_tool_args in _try_execute_function_calls, (2) redirecting unresolvable approval responses through the parent tool in _auto_invoke_function, and (3) forwarding approval context to the inner agent's second run() call in _agent_wrapper. The core logic is sound for the basic case. Two issues warrant attention: the tests add ``@pytest.mark.asyncio decorators that the project explicitly avoids (pyproject.toml line 74: `asyncio_mode = "auto"`), and the re-invocation path is silently broken when `propagate_session=False` (the default, `_agents.py:487`) — the inner agent's second run receives only the approval context with no memory of the original user query or any multi-turn history from the first run.

Flagged Issues

  • The re-created Content at _tools.py:1374 uses function_call_content.call_id which is None for function_approval_response Content (verified at _types.py:1186-1194 — from_function_approval_response doesn't pass call_id). The correct value is in function_call_content.id (set via to_function_approval_response at _types.py:1238). The resulting call_id=None on the function result will cause LLM API errors in production because it won't match the outer assistant's function_call call_id.

Suggestions

  • The test doesn't catch the call_id=None bug because MockBaseChatClient doesn't validate call_ids. Consider adding an assertion that the function result's call_id matches the outer function call's call_id (e.g., assert the tool message content has call_id='outer_call_1').
  • The _approval_messages list (containing a Content object) is propagated into the function result via additional_properties at lines 1444 and 1456. This internal routing metadata persists in session messages unnecessarily. Consider stripping _approval_messages from additional_properties before constructing the function result, or using a separate channel (e.g., a thread-local or context variable) to pass approval messages to the wrapper rather than embedding them in Content.additional_properties.
  • The _parent_tool_name stored in additional_properties is trusted without verification on the return trip (line 1364). If additional_properties are round-tripped through an untrusted serialization boundary, a crafted approval response could redirect execution to a different registered tool. Consider validating that _parent_tool_name matches the tool that originally raised the UserInputRequiredException (e.g., by also storing a hash or comparing the inner function_call's call_id).
  • The rejection test (test_chat_agent_as_tool_inner_approval_rejected) discards the response from the second run() call. Capturing it and asserting on the response text (e.g., that it matches the mock's 'I could not get the details.' or is non-empty) would verify the end-to-end rejection flow completes correctly, not just that the inner tool wasn't called.
  • Consider adding a test for the case where propagate_session=False (or propagate_session is not passed to as_tool) and the inner agent has a tool requiring approval. If the session is not propagated, the re-invoked inner agent starts fresh without the conversation context of the pending approval, which may produce incorrect behavior. A test documenting the expected behavior (or failure mode) would be valuable.
  • The _auto_invoke_function changes and the FunctionInvocationContext metadata plumbing have two code paths: with and without middleware pipeline (lines 1430 and 1460 in _tools.py). Both received the metadata=approval_metadata change but only the no-middleware path is exercised by these tests. Consider adding a test that configures a middleware pipeline on the outer agent to cover the other branch.
  • Consider adding a test where the inner agent has multiple tool calls requiring approval simultaneously. The code in invoke_with_termination_handling (line 1648) uses extra_user_input_contents.extend(propagated[1:]) to handle multiple propagated items, and each would need _parent_tool_name/_parent_tool_args metadata — verifying this works end-to-end would be valuable.
  • Tests at lines 2656 and 2782 add ``@pytest.mark.asyncio decorators. The project configures `asyncio_mode = 'auto'` (pyproject.toml:74), so all async test functions run automatically without this decorator. Existing tests in the file omit it; these new tests should too.
  • With propagate_session=False (the default at _agents.py:487), the inner agent's second run() call receives only the artificially-constructed input_messages (function call + approval response) and has no memory of the original user query or any prior turns from the first run. For inner agents that build up multi-turn context before hitting an approval gate, this context is silently lost. Consider either (a) documenting that propagate_session=True is required for this feature to work correctly with stateful inner agents, or (b) emitting a warning when propagate_session=False and an approval response is being routed through _agent_wrapper.

Automated review by giles17's agents

Comment thread python/packages/core/agent_framework/_tools.py Outdated
Comment thread python/packages/core/agent_framework/_tools.py
Comment thread python/packages/core/tests/core/test_agents.py
Comment thread python/packages/core/tests/core/test_agents.py
Comment thread python/packages/core/tests/core/test_agents.py
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug in the Python agent tool invocation flow where function approval responses originating from a sub-agent (created via .as_tool()) were not correctly routed back through the parent sub-agent tool wrapper, causing the approved inner tool to never execute.

Changes:

  • Re-route function_approval_response handling to invoke the parent sub-agent tool wrapper when the inner tool name is not present in the outer agent’s tool map.
  • Thread sub-agent approval context through FunctionInvocationContext.metadata so the .as_tool() wrapper can forward approval messages into the inner agent run.
  • Add regression tests covering approve/reject paths for inner tools requiring approval when used through .as_tool().

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
python/packages/core/agent_framework/_tools.py Adds parent-tool tagging for propagated approval requests and re-routing logic during approved response execution.
python/packages/core/agent_framework/_agents.py Updates .as_tool() wrapper to forward approval messages into the inner agent run via invocation metadata.
python/packages/core/tests/core/test_agents.py Adds tests to ensure inner approved tools execute (and rejected tools do not) when invoked via a sub-agent tool.

Comment thread python/packages/core/agent_framework/_tools.py Outdated
Comment thread python/packages/core/agent_framework/_tools.py Outdated
Comment thread python/packages/core/tests/core/test_agents.py
Copilot and others added 3 commits April 16, 2026 05:56
- Fix call_id=None bug: store _parent_tool_call_id during propagation and
  use it when constructing the synthetic function_call Content, falling back
  to content.id then content.call_id
- Strip _approval_messages from additional_properties after extraction so
  internal routing metadata doesn't persist in session messages
- Use setdefault for _parent_tool_name/_parent_tool_args/_parent_tool_call_id
  to preserve inner routing metadata for nested sub-agent boundaries
- Remove unnecessary @pytest.mark.asyncio decorators (asyncio_mode='auto')
- Assert rejection test captures response and verifies text output
- Assert function_result with correct call_id exists in session history

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t#4963)

Add type: ignore comments for optional orjson import in file_history_provider
samples to suppress pyright errors in samples-check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

@giles17 giles17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 94% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Design Approach


Automated review by giles17's agents

@giles17 giles17 marked this pull request as draft April 16, 2026 06:19
@giles17 giles17 marked this pull request as ready for review April 16, 2026 18:16
Copy link
Copy Markdown
Member

@eavanvalkenburg eavanvalkenburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so in general I think we need to solve this, but I wonder if this will solve it in a way that is universal enough, what if you have this two layers deep, will that work? and any time we use additional properties we should also consider that those might be overridden, and especially for this I suspect that at two levels deep they will and then this breaks, so maybe we need something like a parent_call_id that is None by default? and then that builds up nicely from the first one (which doesn;t have that) down to any number of levels...

@giles17 giles17 marked this pull request as draft April 17, 2026 15:52
@giles17
Copy link
Copy Markdown
Contributor Author

giles17 commented Apr 17, 2026

@eavanvalkenburg I'll look into this and try it out. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: [Bug]: Cannot approve tool usage from sub-agents

4 participants