
Python: Support OpenAI allowed_tools tool choice#5322

Open
giles17 wants to merge 3 commits into microsoft:main from giles17:agent/fix-5309-1

Conversation

@giles17
Contributor

@giles17 giles17 commented Apr 17, 2026

Motivation and Context

OpenAI and Azure OpenAI support an allowed_tools tool choice type that lets callers restrict which tools the model may invoke without removing tools from the prompt, preserving prompt caching benefits. The Agent Framework had no way to express this constraint.

Fixes #5309

Description

The ToolMode TypedDict gains an optional allowed_tools: list[str] field, validated to only be used with mode="auto". The OpenAI chat client's _prepare_options translates this into the wire format ({"type": "allowed_tools", "mode": "auto", "tools": [...]}) expected by the OpenAI API. Additionally, finish_reason is now propagated through AgentResponse and AgentResponseUpdate so callers can inspect why the model stopped generating, and Pydantic-based tool models (used by providers like Gemini) are properly serialized in _tools_to_dict.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Note: PR autogenerated by giles17's agent

Copilot and others added 2 commits April 17, 2026 03:14
Add allowed_tools field to ToolMode TypedDict, enabling users to restrict
which tools the model may call via the OpenAI allowed_tools tool_choice
type. This preserves prompt caching by keeping all tools in the tools list
while limiting which ones the model can invoke.

- Add allowed_tools: list[str] to ToolMode TypedDict
- Add validation in validate_tool_mode() (only valid when mode == "auto")
- Convert to OpenAI API format in _prepare_options()
- Add tests for validation and API payload generation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 17, 2026 03:49
@giles17 giles17 self-assigned this Apr 17, 2026
@moonbox3
Contributor

moonbox3 commented Apr 17, 2026

Python Test Coverage

Python Test Coverage Report

File                           Stmts  Miss  Cover  Missing
packages/core/agent_framework
   _types.py                   1102   87    92%    58, 67–68, 122, 127, 146, 148, 152, 156, 158, 160, 162, 180, 184, 210, 232, 237, 242, 246, 276, 687–688, 847–848, 1233, 1305, 1340, 1360, 1370, 1422, 1554–1556, 1738, 1841–1846, 1871, 1965, 1973–1975, 1980, 2071, 2083, 2106, 2361, 2385, 2484, 2738, 2947, 3020, 3031, 3033–3037, 3039, 3042–3050, 3060, 3130, 3267, 3272, 3277, 3282, 3286, 3370–3372, 3401, 3489–3493
packages/openai/agent_framework_openai
   _chat_client.py             881    123   86%    522–525, 529–530, 536–537, 547–548, 555, 570–576, 597, 605, 628, 746, 845, 904, 906, 908, 910, 976, 990, 1070, 1080, 1085, 1128, 1250, 1431, 1436, 1440–1442, 1446–1447, 1513, 1542, 1548, 1558, 1564, 1569, 1575, 1580–1581, 1642, 1664–1665, 1680–1681, 1699–1700, 1743, 1906, 1944–1945, 1961, 1963, 2042–2050, 2080, 2187, 2222, 2237, 2257–2267, 2280, 2291–2295, 2309, 2323–2334, 2343, 2375–2378, 2386–2387, 2389–2391, 2405–2407, 2417–2418, 2424, 2439
   _chat_completion_client.py  358    28    92%    428, 524–525, 529, 672, 755–762, 764–767, 777, 855, 857, 874, 895, 923, 936, 960, 980, 1020, 1295
TOTAL                          28341  3299  88%

Python Unit Test Overview

Tests: 5656 | Skipped: 30 💤 | Failures: 0 ❌ | Errors: 0 🔥 | Time: 1m 32s ⏱️

Contributor Author

@giles17 giles17 left a comment


Automated Code Review

Reviewers: 4 | Confidence: 93%

✓ Correctness

This PR contains only cosmetic/formatting changes: a single blank line added after the ToolMode class in _types.py, and several multi-line expressions collapsed into single lines in test_hyperlight_codeact.py. There are no logic changes and no correctness issues. The allowed_tools feature referenced in the issue context is already fully implemented in the codebase (ToolMode TypedDict, validate_tool_mode, and OpenAI client conversion).

✓ Security Reliability

This PR contains only cosmetic changes: an extra blank line added in _types.py (line 3157) and reformatting of multi-line expressions into single lines in test_hyperlight_codeact.py. There are no functional, security, or reliability changes. The allowed_tools field referenced in context lines already existed prior to this diff.

✓ Test Coverage

The PR adds allowed_tools support to ToolMode with good test coverage for the core validation and OpenAI Responses API client conversion. Tests cover valid single/multiple tools, invalid mode combinations, and regression for plain auto mode. Two test coverage gaps are notable: (1) no test for an empty allowed_tools list ([]), which passes validation and produces a likely-invalid API payload {"type": "allowed_tools", "tools": []}, and (2) the Chat Completions client (_chat_completion_client.py line 665-666) silently drops allowed_tools by falling through to run_options["tool_choice"] = mode (i.e., just "auto"), but there is no test documenting this behavior or warning the user.

✗ Design Approach

The diff itself is trivial — a blank line added to _types.py and cosmetic test reformatting. No logic is changed. However, the allowed_tools field that this PR exposes in ToolMode is not fully wired up: _chat_completion_client.py (lines 655–666) never checks for allowed_tools and silently falls through to emitting plain tool_choice: "auto", making the feature a no-op for users of that client. The _chat_client.py (lines 1218–1224) handles it correctly, creating an inconsistency between the two clients.

Flagged Issues

  • _chat_completion_client.py _prepare_options (lines 655–666): the allowed_tools branch is missing. When mode == "auto" and allowed_tools is set, the code falls through to run_options["tool_choice"] = mode, silently discarding the list and emitting plain "auto". _chat_client.py lines 1218–1224 show the correct pattern to mirror. Without this fix the feature is non-functional for the Chat Completions client.

Suggestions

  • Add a test for validate_tool_mode({"mode": "auto", "allowed_tools": []}) — an empty list passes validation today but would produce {"type": "allowed_tools", "tools": []} at the API level. Consider whether validation should reject it, and add a test either way to document the expected behavior.
  • Add a test in test_openai_chat_completion_client.py covering tool_choice={"mode": "auto", "allowed_tools": ["fn"]} to lock in the expected API payload (or to document that allowed_tools is silently dropped), analogous to test_prepare_options_allowed_tools in test_openai_chat_client.py. If allowed_tools is intentionally unsupported in the Chat Completions client, consider raising a warning so users don't silently lose the restriction.
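The first suggestion above can be sketched as a test against a stand-in that mirrors the validation semantics described in this review; validate_tool_mode's real signature and import path are not shown in the diff, so everything below is an assumption documenting the described behavior.

```python
# Stand-in mirroring validate_tool_mode semantics as described in this review
# (not the real agent_framework function).
def validate_tool_mode(tool_choice: dict) -> dict:
    allowed = tool_choice.get("allowed_tools")
    if allowed is not None:
        if tool_choice.get("mode") != "auto":
            raise ValueError("allowed_tools is only valid with mode='auto'")
        if isinstance(allowed, str) or not all(isinstance(t, str) for t in allowed):
            raise ValueError("allowed_tools must be a sequence of strings")
    return tool_choice


def test_empty_allowed_tools_passes_validation():
    # Locks in today's behavior: an empty list is accepted, even though it
    # would yield {"type": "allowed_tools", "tools": []} on the wire.
    result = validate_tool_mode({"mode": "auto", "allowed_tools": []})
    assert result["allowed_tools"] == []
```

Whether validation should instead reject the empty list is a design call; either way, a test like this makes the chosen behavior explicit.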

Automated review by giles17's agents

Contributor

Copilot AI left a comment


Pull request overview

Adds Python SDK support for OpenAI/Azure OpenAI tool_choice.type="allowed_tools" so callers can restrict tool invocation without removing tools from the prompt/tool list.

Changes:

  • Extend core ToolMode to include optional allowed_tools (only valid with mode="auto") and update validation.
  • Update OpenAI chat client option preparation to translate allowed_tools into the OpenAI wire format, with accompanying unit tests.
  • Adjust samples to suppress pyright optional-dependency import errors for orjson.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
python/packages/core/agent_framework/_types.py Adds allowed_tools to ToolMode and extends validate_tool_mode constraints.
python/packages/core/tests/core/test_types.py Adds unit tests for ToolMode.allowed_tools and validation behavior.
python/packages/openai/agent_framework_openai/_chat_client.py Maps ToolMode.allowed_tools into OpenAI tool_choice “allowed_tools” payload.
python/packages/openai/tests/openai/test_openai_chat_client.py Adds tests ensuring _prepare_options emits correct OpenAI tool_choice format.
python/samples/02-agents/conversations/file_history_provider.py Adds pyright ignore for optional orjson import.
python/samples/02-agents/conversations/file_history_provider_conversation_persistence.py Adds pyright ignore for optional orjson import.
python/packages/hyperlight/tests/hyperlight/test_hyperlight_codeact.py Minor test formatting adjustments.

Comment thread python/packages/core/agent_framework/_types.py
…ions client support

- validate_tool_mode now checks allowed_tools is a non-string sequence of
  strings and normalizes to list[str], raising ContentError for invalid types
- Add missing allowed_tools branch in _chat_completion_client._prepare_options
  so allowed_tools is emitted as the OpenAI allowed_tools wire format instead
  of being silently dropped
- Add tests for invalid allowed_tools types (string, int, mixed), empty list,
  tuple normalization, and Chat Completions client payload generation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor Author

@giles17 giles17 left a comment


Automated Code Review

Reviewers: 4 | Confidence: 91%

✓ Correctness

The diff adds support for allowed_tools in ToolMode, following the same pattern as the existing required_function_name field. The validation logic in _types.py correctly checks type constraints (non-string sequence of strings), normalizes tuples to lists, and gates the field to mode == 'auto'. Both the Chat Completion client and the Responses API client correctly convert the validated allowed_tools into the OpenAI API format. The walrus operator chain in the client's if/elif branches is correct — mode is assigned even when the first condition short-circuits. Tests cover the key cases including invalid types, empty lists, tuple normalization, and single/multiple tool names. No correctness issues found.

✓ Security Reliability

The implementation is clean and follows the established patterns for ToolMode validation and client conversion. Input validation is thorough (type-checking allowed_tools as a non-string sequence of strings), and the conversion to OpenAI API format is correct. The validation function properly prevents conflicting fields (e.g., both required_function_name and allowed_tools). No security or reliability issues found.

✓ Test Coverage

The new allowed_tools feature has solid test coverage for validation logic (type checks, normalization, invalid mode combinations) and basic client payload generation (single and multiple tools). Two minor gaps: (1) no client-level test for an empty allowed_tools list, which the validation explicitly permits and would produce "tools": [] in the payload; (2) no regression test verifying that {"mode": "auto"} without allowed_tools still falls through to produce tool_choice = "auto" (though this is indirectly covered by the existing parametrized test at line 1627). Overall the coverage is good and assertions are meaningful—each test verifies specific structural properties of the output rather than just asserting no exception.

✓ Design Approach

The PR adds allowed_tools support to ToolMode following the same pattern as required_function_name: extend the TypedDict, validate centrally in validate_tool_mode, and convert to provider-specific API format in the client. The implementation is consistent with the existing framework design at every layer. No fundamental design problems found. One minor observation: when allowed_tools is already a list (the common case), validate_tool_mode returns the original dict object unchanged (final return tool_choice), while a tuple input returns a newly constructed dict; this asymmetry is harmless and matches the existing behavior for required_function_name, but worth being aware of. There are no missing cases in validation logic, and the is not None guard in the client correctly passes empty lists through to the API.

Suggestions

  • Add a client-level test for empty allowed_tools list (e.g., {"mode": "auto", "allowed_tools": []}) to verify _prepare_options produces {"type": "allowed_tools", "mode": "auto", "tools": []} rather than falling through to run_options["tool_choice"] = mode. Validation tests confirm the empty list is accepted, but no client test exercises the resulting payload shape.
  • Consider adding a regression test verifying that {"mode": "auto"} (without allowed_tools) still produces tool_choice = "auto" through _prepare_options, since the new elif branch could theoretically interfere if the walrus operator condition were wrong. The existing parametrized test at line 1627 covers "auto" as a string but not {"mode": "auto"} as a dict without allowed_tools.

Automated review by giles17's agents

@cecheta
Member

cecheta commented Apr 18, 2026

Thanks for looking into this, just to add that allowed_tools also supports mode: required in addition to auto.

https://github.com/openai/openai-python/blob/main/src/openai/types/responses/tool_choice_allowed.py
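Per the linked openai-python type, both modes accept the same payload shape; a hedged sketch follows (the tool-entry shape is an assumption, and this PR's validation currently gates allowed_tools to "auto" only):

```python
# Sketch of the two modes the linked ToolChoiceAllowed type permits.
def allowed_tools_choice(mode: str, names: list[str]) -> dict:
    assert mode in ("auto", "required")
    return {
        "type": "allowed_tools",
        "mode": mode,  # "auto": may call an allowed tool; "required": must call one
        # Assumed tool-entry shape for function tools.
        "tools": [{"type": "function", "name": n} for n in names],
    }
```

Supporting mode="required" would mean relaxing the validate_tool_mode gate described in the PR, not changing the wire format.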

@giles17 giles17 changed the title from "Python: Support OpenAI allowed_tools tool choice in Python SDK" to "Python: Support OpenAI allowed_tools tool choice" Apr 20, 2026


Development

Successfully merging this pull request may close these issues.

Python: [Feature]: Support OpenAI allowed_tools

4 participants