feat(provision): prompt to cancel Azure deployment on Ctrl+C (Bicep)#7795
When a user presses Ctrl+C during 'azd provision' or 'azd up' while a Bicep deployment is in flight on Azure, azd now pauses and asks whether to leave the Azure deployment running (default) or to cancel it via the ARM Cancel API and wait for a terminal state.

- pkg/input: register-able interrupt handler stack with re-entrant Ctrl+C suppression while a handler is running.
- pkg/azapi + pkg/infra: Cancel methods on DeploymentService / Deployment for both subscription- and resource-group-scoped deployments. Deployment Stacks return 'not supported' (no Cancel API surface today).
- pkg/infra/provisioning: typed sentinel errors for the 4 outcomes (leave running / canceled / cancel timed out / cancel too late) plus telemetry attribute provision.cancellation.
- pkg/infra/provisioning/bicep: interactive prompt + cancel-and-poll flow with 30s cancel-request timeout and 2-min terminal-state wait.
- cmd/middleware + internal/cmd: bypass agent troubleshooting and map sentinels to telemetry codes.
- docs/provision-cancellation.md: user-facing behavior, outcomes, provider scope, telemetry, and non-interactive fallback.

Terraform and Deployment Stacks are out of scope and unchanged.

Closes #2810

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds an interactive Ctrl+C flow for in-flight Bicep (ARM) deployments so users can explicitly choose to leave the deployment running or request Azure-side cancellation, with typed outcomes and telemetry.
Changes:
- Introduces a stack-based SIGINT handler mechanism with re-entrant suppression for interactive prompts.
- Adds ARM deployment cancel support for subscription- and resource-group-scoped deployments (with stack deployments explicitly unsupported).
- Wires Bicep provision/deploy to prompt on Ctrl+C, emit typed sentinel outcomes, and record `provision.cancellation` telemetry; documents the behavior.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| cli/azd/pkg/input/interrupt.go | New interrupt handler stack + re-entrant suppression primitives. |
| cli/azd/pkg/input/interrupt_test.go | Unit tests for handler stack and re-entrancy behavior. |
| cli/azd/pkg/input/console.go | Updates SIGINT watcher to consult registered handlers and ignore re-entrant interrupts. |
| cli/azd/pkg/infra/scope.go | Adds Cancel(ctx) to infra.Deployment and implements for RG/sub scopes. |
| cli/azd/pkg/infra/scope_test.go | Tests cancel endpoint calls for both scopes and error propagation. |
| cli/azd/pkg/infra/provisioning/cancel.go | Adds provisioning cancellation sentinel errors surfaced to middleware. |
| cli/azd/pkg/infra/provisioning/bicep/interrupt.go | Implements the interactive prompt + cancel/poll flow and outcome mapping. |
| cli/azd/pkg/infra/provisioning/bicep/interrupt_test.go | Tests terminal-state detection and interrupt outcome application. |
| cli/azd/pkg/infra/provisioning/bicep/bicep_provider.go | Wires interrupt handler around the in-flight ARM deploy call and sets telemetry. |
| cli/azd/pkg/azapi/deployments.go | Extends DeploymentService with cancel methods. |
| cli/azd/pkg/azapi/standard_deployments.go | Implements ARM cancel for subscription + resource group deployments. |
| cli/azd/pkg/azapi/stack_deployments.go | Implements cancel methods as unsupported for deployment stacks. |
| cli/azd/internal/tracing/fields/fields.go | Adds provision.cancellation tracing attribute key. |
| cli/azd/internal/cmd/errors.go | Maps new provisioning sentinel errors to classifications. |
| cli/azd/cmd/middleware/error.go | Ensures new sentinels bypass agent troubleshooting. |
| cli/azd/docs/provision-cancellation.md | Documents the Ctrl+C cancellation UX and outcomes. |
- pkg/input: LIFO test now invokes handlers and asserts distinct call counts to prove ordering.
- pkg/infra/provisioning: add ErrDeploymentCancelFailed sentinel so the cancel-request-failure path no longer misclassifies as a timeout; wire it through error middleware skip-list and telemetry mapping.
- pkg/infra: switch new TestScopeCancel subtests to t.Context().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- pkg/azapi: add typed ErrCancelNotSupported sentinel; stack CancelSubscriptionDeployment / CancelResourceGroupDeployment now return it instead of an opaque string.
- pkg/infra/provisioning/bicep: interrupt handler treats ErrCancelNotSupported as the safer 'leave running' outcome (matches documented stacks behavior + telemetry). Cancel-request error path routes through terminalToOutcome when the deployment is already in a terminal state, so the portal URL and consistent messaging are surfaced. Canceled terminal branch now prints the portal URL too.
- pkg/infra/provisioning: ErrDeploymentCancelFailed doc comment now references errors.Is/errors.As (matches the multi-%w joined-error wrapping pattern used here).
- pkg/infra/provisioning/bicep/bicep_provider: tear down the interrupt handler immediately after deployModule returns (sync.OnceFunc) to avoid a small window where a late Ctrl+C could surface the prompt over post-processing output.
- internal/cmd/errors: map ErrCancelNotSupported in classifySentinel.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
If Ctrl+C arrives but the ARM deployment happens to finish naturally before the user picks an option in the prompt, the previous design could take the success path and silently drop the interrupt.

- installDeploymentInterruptHandler now exposes a 'started' channel that is closed the instant Ctrl+C is received, before the prompt is shown. deployCtx is also cancelled immediately so PollUntilDone unblocks ASAP.
- BicepProvider.Deploy block-receives the outcome whenever 'started' is closed (instead of a non-blocking drain), so the user's choice is always honored regardless of who wins the race.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- pkg/input/console: watchTerminalInterrupt now reserves the running slot before consulting the handler stack so re-entrant Ctrl+C is suppressed even if the stack is briefly empty (e.g. handler popped but still executing the prompt).
- pkg/infra/provisioning/bicep/bicep_provider: defer cleanup until after the interrupt outcome is received so a second Ctrl+C during the prompt is still suppressed; the no-interrupt path tears down immediately as before.
- pkg/infra/provisioning/cancel: doc reads 'sentinel errors' instead of 'typed errors' to match the implementation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
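Reserving the running slot before looking at the handler stack might be sketched like this. All names here (`handlerRunning`, `tryStart`, `finish`, `onSignal`) are hypothetical stand-ins, and the atomic flag is an assumption about how such a guard could be built.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// handlerRunning is true while an interrupt handler is executing.
var handlerRunning atomic.Bool

// tryStart reserves the running slot; false means a handler is already active.
func tryStart() bool { return handlerRunning.CompareAndSwap(false, true) }

// finish releases the running slot.
func finish() { handlerRunning.Store(false) }

// onSignal reports what happened to one simulated Ctrl+C.
// The slot is reserved first, so a re-entrant signal is suppressed
// even when no handler is currently registered.
func onSignal(handler func()) string {
	if !tryStart() {
		return "suppressed"
	}
	defer finish()
	if handler == nil {
		return "default" // no handler registered: fall back to default behavior
	}
	handler()
	return "handled"
}

func main() {
	var nested string
	result := onSignal(func() {
		// A second Ctrl+C arrives while the first handler is still running.
		nested = onSignal(nil)
	})
	fmt.Println(result, nested)
	fmt.Println(onSignal(nil))
}
```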
- pkg/input/interrupt: enforce strict LIFO when popping handlers (only pop when this handler is still top-of-stack), so out-of-order pops never accidentally remove unrelated newer handlers.
- pkg/infra/provisioning/bicep/interrupt: defensive default in terminalToOutcome now stops the spinner and emits a warning with the observed state and portal URL, leaving the UI clean if an unexpected terminal state is ever observed.
- pkg/infra/provisioning/bicep/interrupt: treat DeploymentProvisioningStateDeleted as terminal in the cancel poll so we don't keep polling until the deadline if the deployment is deleted out from under us. Test updated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- pkg/infra/provisioning/bicep/interrupt: wrap the interrupt handler closure with sync.OnceValue so close(started), cancelDeploy() and the outcome channel send all run at most once. Combined with the in-flight guard from tryStartInterruptHandler and the strict LIFO pop, additional Ctrl+C signals after the prompt completes can no longer panic or block on the buffered channel.
- pkg/infra/provisioning/bicep/interrupt: print the portal URL on the prompt-failure leave-running path so the user always has a link to follow up when the URL is available.
- docs/provision-cancellation: clarify that the portal URL is printed when available (not 'in every case').

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- pkg/input/interrupt: nil out the popped slot before truncating the interrupt stack so the GC can reclaim the popped handler and any state it captured, even before the underlying array is reallocated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
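The strict-LIFO pop and the nil-before-truncate step from the two commits above can be combined in one sketch. This is an assumption-laden illustration: `push`, `depth`, and the `entry` wrapper are hypothetical, and here an out-of-order pop is simply ignored, which may differ from what the PR does.

```go
package main

import (
	"fmt"
	"sync"
)

// entry wraps a handler so stack slots can be compared by pointer identity.
type entry struct{ fn func() bool }

var (
	mu    sync.Mutex
	stack []*entry
)

// push registers fn and returns a function that pops it again.
func push(fn func() bool) (pop func()) {
	e := &entry{fn: fn}
	mu.Lock()
	stack = append(stack, e)
	mu.Unlock()

	return func() {
		mu.Lock()
		defer mu.Unlock()
		n := len(stack)
		if n == 0 || stack[n-1] != e {
			return // not top-of-stack: leave newer handlers alone
		}
		stack[n-1] = nil // drop the reference so the GC can reclaim it
		stack = stack[:n-1]
	}
}

// depth reports how many handlers are currently registered.
func depth() int {
	mu.Lock()
	defer mu.Unlock()
	return len(stack)
}

func main() {
	pop1 := push(func() bool { return true })
	pop2 := push(func() bool { return true })

	pop1() // out of order: ignored in this sketch
	fmt.Println(depth())

	pop2()
	pop1()
	fmt.Println(depth())
}
```

Without the `stack[n-1] = nil` line, the truncated slice would still hold a pointer in its backing array, keeping the handler and anything its closure captured reachable.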
- pkg/input/console: run the registered interrupt handler inline on the signal goroutine instead of in a nested goroutine. This removes the scheduling window where SIGINT was received but the handler had not yet run, which could let a deploy goroutine complete naturally and silently drop the Ctrl+C. Re-entrant signals remain suppressed via tryStartInterruptHandler.
- pkg/infra/provisioning/bicep/interrupt: switch the cancel poll loop to a time.Ticker and move the wait before each Get, so a slow Get cannot produce back-to-back ARM polls (preventing throttling).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
    // If the deployment is already in a terminal state, route through
    // the same terminal-outcome reporter so the user sees consistent
    // messaging (including the portal URL).
    if state, getErr := deployment.Get(context.WithoutCancel(ctx)); getErr == nil &&
In the cancel-request error path, the follow-up deployment.Get(context.WithoutCancel(ctx)) call is unbounded (no timeout). If ARM is slow/unreachable, the interrupt handler can hang indefinitely after the user selected “Cancel”, preventing azd from exiting or surfacing the portal URL/outcome. Consider wrapping this Get in a short timeout context (similar to cancelRequestTimeout) so the interrupt flow always makes progress even when ARM is degraded.
Suggested change:

    - if state, getErr := deployment.Get(context.WithoutCancel(ctx)); getErr == nil &&
    + getCtx, getDone := context.WithTimeout(
    +     context.WithoutCancel(ctx), cancelRequestTimeout)
    + defer getDone()
    + if state, getErr := deployment.Get(getCtx); getErr == nil &&
        firstCalls++
        return true
    }
    pop1 := PushInterruptHandler(first)
This test pushes a global interrupt handler but doesn’t register cleanup until the end of the test. If any require.* before pop1() fails, the handler can remain on the global stack and pollute later pkg/input tests. Consider calling t.Cleanup(pop1) (or defer pop1()) immediately after this PushInterruptHandler call.
        secondCalls++
        return true
    }
    pop2 := PushInterruptHandler(second)
Same as above: if an assertion fails before pop2() runs, the global interrupt stack can be left in a dirty state. Register pop2 with t.Cleanup(pop2) (or defer pop2()) immediately after pushing it.
    switch state {
    case azapi.DeploymentProvisioningStateCanceled,
        azapi.DeploymentProvisioningStateFailed,
        azapi.DeploymentProvisioningStateSucceeded,
        azapi.DeploymentProvisioningStateDeleted:
isTerminalProvisioningState includes DeploymentProvisioningStateDeleted as terminal, but terminalToOutcome does not handle Deleted explicitly and will treat it as an “unexpected terminal state” (and map it to cancel_too_late). Either handle Deleted explicitly in terminalToOutcome (with appropriate messaging/telemetry) or remove it from the terminal set if it shouldn’t be surfaced here.
Closes #2810
Summary
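When a user presses Ctrl+C during `azd provision` or `azd up` while a Bicep deployment is in flight on Azure, azd now pauses and asks whether to leave the Azure deployment running (the default) or cancel it via the ARM Cancel API and wait for a terminal state.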
Previously, Ctrl+C exited azd immediately while the deployment kept running on Azure with no easy follow-up.
Behavior
Provider scope
Implementation
Tests
Validation
Docs