docs: add untaint controller documentation to CNI setup page #17190
nagendrareddy10 wants to merge 18 commits into istio:master
Adds a new 'Untaint controller' section to the CNI node agent installation guide, documenting the feature added in istio/istio#48818. The untaint controller addresses a race condition where pods can be scheduled on new nodes before istio-cni is ready, by placing a NoExecute taint (cni.istio.io/not-ready) on new nodes and removing it once the CNI agent is ready.
Documents:
- What the untaint controller does and when to use it
- How to enable via IstioOperator and Helm
- Full configuration reference (taint.enabled, PILOT_ENABLE_NODE_UNTAINT_CONTROLLERS, taint.namespace)
- Relationship to the existing repair mechanism
Fixes: istio#15003
Hi @nagendrareddy10. Thanks for your PR. I'm waiting for an istio member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
dhawton
left a comment
A few breaking changes in here, such as the removal of spaces in the YAMLs.
…controller PR
Per @dhawton's CHANGES_REQUESTED review: the PR inadvertently reformatted several existing sections, removing indentation from YAML code blocks and changing list formatting.
Fixes:
- Restore frontmatter 'aliases:' to 4-space indent (was accidentally changed to 2-space)
- Restore prerequisite list markers from '-' back to '*' (original format)
- Restore 'spec:' child indentation in 3 existing IstioOperator YAML blocks:
  * cni_agent_operator_install (components/cni/namespace/enabled)
  * Handling init container injection for revisions (revision/values/pilot/cni)
  * Canary upgrade IstioOperator (profile/components/cni/values), including fixing the broken 'excludeNamespaces: - istio-system' back to proper YAML list
- Fix indentation in new untaint controller IstioOperator YAML example
The new 'Untaint controller' subsection content is unchanged.
Two build failures addressed:
1. lint_istio.io (MD004): The 'Additional configuration' bullet list used '-' markers but the surrounding document uses '*'. Changed back to '*' to maintain consistent unordered list style per MD004 rule.
2. gencheck_istio.io: The new IstioOperator YAML code block in the untaint controller section was missing 'snip_id=none', causing the snip generator to create a new snip in snips.sh that was not committed. Added 'snip_id=none' to exclude it from automatic snip generation, as this YAML block is illustrative (not a test snippet).
…x error
Hugo shortcodes cannot mix positional and named parameters.
'{{< text yaml snip_id=none >}}' mixes positional ('yaml') and
named ('snip_id=none'), causing a Hugo build failure:
'got named parameter snip_id. Cannot mix named and positional parameters'
Fix by using the named form: '{{< text syntax=yaml snip_id=none >}}'
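The rule in this commit message can be approximated with a small check. The sketch below is not Hugo's parser: it is a simplified illustration that treats any parameter containing '=' as named and everything else as positional, and the function names are hypothetical.

```python
import shlex


def classify_shortcode_params(shortcode: str):
    """Split a shortcode opening tag into (name, positional, named).

    Simplification (assumption): a token containing '=' counts as a named
    parameter. Hugo's real parsing rules are more involved than this.
    """
    text = shortcode.strip()
    if not (text.startswith("{{<") and text.endswith(">}}")):
        raise ValueError("expected a shortcode of the form '{{< name ... >}}'")
    tokens = shlex.split(text[3:-3])
    if not tokens:
        raise ValueError("shortcode has no name")
    name, params = tokens[0], tokens[1:]
    named = [p for p in params if "=" in p]
    positional = [p for p in params if "=" not in p]
    return name, positional, named


def mixes_params(shortcode: str) -> bool:
    """Return True when positional and named parameters appear together."""
    _, positional, named = classify_shortcode_params(shortcode)
    return bool(positional) and bool(named)


broken = "{{< text yaml snip_id=none >}}"
fixed = "{{< text syntax=yaml snip_id=none >}}"
print(mixes_params(broken), mixes_params(fixed))
```

With these two inputs, the first call flags the mixed form from the failing build and the second accepts the all-named form used in the fix.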
mdspell does not recognise 'untaint' as a valid word, causing 13 spelling errors in the new untaint controller documentation section. Added 'untaint' to the .spelling whitelist in sorted order.
Two .spelling issues:
1. 'untaint' was placed after 'untar' but alphabetically 'untai' < 'untar'
('i' < 'r'), so 'untaint' must come before 'untar'. The gencheck CI
auto-sorts the file and detected the wrong order.
2. 'Karpenter' (the node autoscaler referenced in the untaint controller
docs) was missing from the dictionary, causing 2 spelling errors in
lint. Added in correct sorted position (after Karma's, before katacoda).
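The ordering argument above can be checked with a short sketch. It assumes the .spelling file is sorted case-insensitively, which is what the placement "after Karma's, before katacoda" suggests; the exact collation used by the gencheck CI is not stated here, and the helper name is hypothetical.

```python
def spelling_sorted(words):
    """Sort dictionary entries case-insensitively (assumed collation)."""
    return sorted(words, key=str.casefold)


entries = ["untar", "katacoda", "untaint", "Karma's", "Karpenter"]
ordered = spelling_sorted(entries)
print(ordered)
```

Under that assumption 'untaint' sorts before 'untar' because the first differing characters are 'i' and 'r', and 'Karpenter' lands between "Karma's" and 'katacoda'.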
sridhargaddam
left a comment
The commit message still mentions that untaint controller will add the taint. Please update the commit message as well.
sridhargaddam
left a comment
LGTM, thank you for addressing the review comments.
Do not set the auto-merge label. It doesn't work because it's only for PRs created by our automation, but do not set the auto-merge label. Your PR will not merge until approved by a Docs maintainer.
Co-authored-by: Daniel Hawton <daniel.hawton@solo.io>
@dhawton Addressed comments. Please review the PR.
Description
Adds documentation for the Untaint Controller feature (added in istio/istio#48818) to the CNI node agent installation guide.
The untaint controller prevents a race condition where pods can be scheduled on new nodes (e.g., in autoscaler environments like Karpenter) before istio-cni is ready. It works by having the infrastructure provider place a NoSchedule taint (cni.istio.io/not-ready) on new nodes, and the untaint controller automatically removes it once the CNI agent reports ready.
This addresses active user confusion: the feature exists but has no documentation, leading to issues for users with Job pods and node autoscalers (see issue comments like this one).
Changes
Added a new "Untaint controller" subsection under "Race condition & mitigation" in content/en/docs/setup/additional-setup/cni/index.md, documenting: … (values.pilot.taint.enabled and PILOT_ENABLE_NODE_UNTAINT_CONTROLLERS)
Reviewers
Fixes: #15003
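As a rough illustration of the mechanism the description outlines, the toy model below tracks the cni.istio.io/not-ready taint on in-memory node objects. It is not Istio's implementation: the Node class, the cni_ready flag, and both functions are hypothetical names, and scheduling is simplified to "a node carrying the taint accepts no pods", ignoring tolerations.

```python
from dataclasses import dataclass, field

TAINT_KEY = "cni.istio.io/not-ready"


@dataclass
class Node:
    name: str
    taints: set = field(default_factory=set)
    cni_ready: bool = False  # hypothetical flag: CNI agent reports ready


def schedulable(node: Node) -> bool:
    """Simplified scheduling rule: the taint blocks all new pods."""
    return TAINT_KEY not in node.taints


def untaint_ready_nodes(nodes):
    """Remove the not-ready taint from nodes whose CNI agent is ready.

    Returns the names of the nodes that were untainted in this pass.
    """
    untainted = []
    for node in nodes:
        if node.cni_ready and TAINT_KEY in node.taints:
            node.taints.discard(TAINT_KEY)
            untainted.append(node.name)
    return untainted


# The infrastructure provider (not the controller) adds the taint at node creation.
new_node = Node(name="node-a", taints={TAINT_KEY})
print(schedulable(new_node))            # node is blocked while CNI is not ready
new_node.cni_ready = True
print(untaint_ready_nodes([new_node]))  # controller pass removes the taint
print(schedulable(new_node))
```

The point of the split is that the taint is applied before any pod can land on the node, and only the readiness signal lifts it, which is what closes the race window.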