Conductor is a CLI tool for defining and running multi-agent workflows with the GitHub Copilot SDK. Workflows are defined in YAML and support parallel execution, conditional routing, loop-back patterns, and human-in-the-loop gates.
# Install dependencies
make install # or: uv sync
make dev # install with dev dependencies
# Run tests
make test # all tests
uv run pytest tests/test_engine/test_workflow.py # single file
uv run pytest -k "test_parallel" # pattern match
# Run tests with coverage
make test-cov
# Lint and format
make lint # check only
make format # auto-fix and format
# Type check
make typecheck
# Run all checks (lint + typecheck)
make check
# Run a workflow
uv run conductor run workflow.yaml --input question="What is Python?"
# Run with web dashboard
uv run conductor run workflow.yaml --web --input question="What is Python?"
# Run in background (prints dashboard URL and exits)
uv run conductor run workflow.yaml --web-bg --input question="What is Python?"
# Stop a background workflow
uv run conductor stop # auto-stop if one running, list if multiple
uv run conductor stop --port 8080 # stop specific port
uv run conductor stop --all # stop all background workflows
# Update conductor
uv run conductor update # check for and install latest version
# Resume a failed workflow from checkpoint
uv run conductor resume workflow.yaml # resume from latest checkpoint
uv run conductor checkpoints # list available checkpoints
# Validate a workflow
uv run conductor validate examples/simple-qa.yaml
make validate-examples # validate all examples
- cli/: Typer-based CLI with commands run, validate, init, templates, stop, update, resume, checkpoints
  - app.py - Main entry point, defines the Typer application
  - run.py - Workflow execution command with verbose logging helpers
  - bg_runner.py - Background process forking for --web-bg mode
  - pid.py - PID file utilities for tracking/stopping background processes
  - update.py - Update check, version comparison, and self-upgrade via uv tool install
- config/: YAML loading and Pydantic schema validation
  - schema.py - Pydantic models for all workflow YAML structures (WorkflowConfig, AgentDef, ParallelGroup, ForEachDef, etc.)
  - loader.py - YAML parsing with environment variable resolution (${VAR:-default}) and !file tag support
  - validator.py - Cross-reference validation (agent names, routes, parallel groups)
- engine/: Workflow execution orchestration
  - workflow.py - Main WorkflowEngine class that orchestrates agent execution, parallel groups, for-each groups, and routing
  - context.py - WorkflowContext manages accumulated agent outputs with three modes: accumulate, last_only, explicit
  - router.py - Route evaluation with Jinja2 templates and simpleeval expressions
  - limits.py - Safety enforcement (max iterations, timeout)
  - checkpoint.py - Automatic checkpoint saving on failure and resume support
- executor/: Agent execution
  - agent.py - AgentExecutor handles prompt rendering, tool resolution, and output validation for single agents
  - script.py - ScriptExecutor runs shell commands as workflow steps, capturing stdout/stderr/exit_code
  - template.py - Jinja2 template rendering
  - output.py - JSON output parsing and schema validation
- providers/: SDK provider abstraction
  - base.py - AgentProvider ABC defining execute(), validate_connection(), close()
  - copilot.py - GitHub Copilot SDK implementation
  - claude.py - Anthropic Claude API implementation
  - factory.py - Provider instantiation
- gates/: Human-in-the-loop support
  - human.py - Rich terminal UI for human gate interactions
- interrupt/: Interactive workflow interruption (Esc/Ctrl+G to pause)
  - listener.py - Keyboard listener daemon thread for Esc/Ctrl+G detection
- web/: Real-time web dashboard for workflow visualization
  - server.py - FastAPI + uvicorn server with WebSocket broadcasting, late-joiner state replay, and POST /api/stop endpoint
  - static/index.html - Single-file Cytoscape.js frontend with DAG graph, agent detail panel, and streaming activity
- events.py: Pub/sub event system decoupling workflow execution from rendering (console, web dashboard)
- exceptions.py: Custom exception hierarchy (ConductorError, ValidationError, ExecutionError, etc.)
- CLI parses YAML via config/loader.py → WorkflowConfig
- WorkflowEngine initializes with config and provider
- Engine loops: find agent/parallel/for-each/script → execute → evaluate routes → next
- Parallel groups execute agents concurrently with context isolation (deep copy snapshot)
- For-each groups resolve source arrays at runtime, inject loop variables ({{ item }}, {{ _index }}, {{ _key }})
- Script steps run shell commands via asyncio subprocess, expose stdout/stderr/exit_code to context
- Routes evaluated via Router using Jinja2 or simpleeval expressions
- Final output built from templates in output: section
- Context modes: accumulate (all prior outputs), last_only (previous only), explicit (only declared inputs)
- Failure modes for parallel/for-each: fail_fast, continue_on_error, all_or_nothing
- Route evaluation: First matching when condition wins; no when = always matches
- Tool resolution: null = all workflow tools, [] = none, [list] = subset
Tests mirror source structure in tests/:
- test_cli/ - CLI command tests, e2e tests
- test_config/ - Schema validation, loader tests
- test_engine/ - Workflow, router, context, limits tests
- test_executor/ - Agent, template, output tests
- test_providers/ - Provider implementation tests
- test_integration/ - Full workflow execution tests
- test_gates/ - Human gate tests
Use pytest.mark.performance for performance tests (exclude with -m "not performance").
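A performance test carrying this marker might look like the sketch below. The test name, workload, and one-second budget are hypothetical placeholders rather than an existing test in the suite.

```python
import time

import pytest


@pytest.mark.performance
def test_context_merge_stays_fast() -> None:
    """Merging many agent outputs should stay well within a generous budget."""
    outputs = {f"agent_{i}": {"answer": i} for i in range(10_000)}

    start = time.perf_counter()
    merged: dict[str, dict[str, int]] = {}
    for name, output in outputs.items():
        merged[name] = output
    elapsed = time.perf_counter() - start

    assert len(merged) == 10_000
    assert elapsed < 1.0  # hypothetical budget, chosen loosely to avoid flakiness
```

Running uv run pytest -m "not performance" then skips it, while uv run pytest -m performance runs only the marked tests.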
- Python 3.12+
- Ruff for linting/formatting (line length 100)
- Google-style docstrings
- Type hints required, checked with ty (Red Knot)
- Pydantic v2 for data validation
- async/await for all provider operations
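Taken together, these conventions produce code shaped roughly like the following. The function is a made-up example used only to show the style (full type hints, a Google-style docstring, and an async signature); it is not part of the codebase.

```python
import asyncio


async def truncate_words(text: str, max_words: int = 20) -> str:
    """Return at most the first max_words words of a text.

    Args:
        text: Input text to shorten.
        max_words: Maximum number of whitespace-separated words to keep.

    Returns:
        The shortened text, with words joined by single spaces.

    Raises:
        ValueError: If max_words is not positive.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    await asyncio.sleep(0)  # stand-in for an awaited provider call
    return " ".join(text.split()[:max_words])


print(asyncio.run(truncate_words("one two three four", max_words=2)))  # one two
```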
All providers (copilot.py, claude.py) must maintain feature parity. Any change to one provider's behavior, contract, or capabilities must be applied to all providers. This includes:
- Event callbacks: Same event types emitted at the same semantic points
  - agent_turn_start with {"turn": "awaiting_model"} - immediately before each API call
  - agent_turn_start with {"turn": N} - at the start of each agentic loop iteration
  - agent_message - for text content in responses
  - agent_reasoning - for reasoning/thinking content
  - agent_tool_start / agent_tool_complete - around tool executions
- Retry and error handling: Same retry semantics, error classification (retryable vs. fatal), and timeout behavior
- Output contract: Same AgentOutput structure with consistent field population (model, tokens, input_tokens, output_tokens, content)
- Tool execution: Same MCP tool calling interface and result handling
- Session management: Same lifecycle (validate_connection(), execute(), close())
When modifying any provider, check all other providers for the same change. The dashboard, JSONL logger, console subscriber, and workflow engine all depend on consistent behavior across providers.
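One way to guard parity is a test that drives each provider through the same lifecycle and compares the recorded event sequences. The sketch below uses a stand-in FakeProvider instead of the real copilot.py and claude.py classes. The execute() signature, the callback shape, and the AgentOutput field types are assumptions, since only the method and field names are documented here.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Callable

EventCallback = Callable[[str, dict[str, Any]], None]  # hypothetical callback shape


@dataclass
class AgentOutput:
    content: str
    model: str
    tokens: int
    input_tokens: int
    output_tokens: int


class AgentProvider(ABC):
    @abstractmethod
    async def validate_connection(self) -> bool: ...

    @abstractmethod
    async def execute(self, prompt: str, on_event: EventCallback) -> AgentOutput: ...

    @abstractmethod
    async def close(self) -> None: ...


class FakeProvider(AgentProvider):
    """Stand-in provider that emits the documented events for a single turn."""

    def __init__(self, model: str) -> None:
        self.model = model

    async def validate_connection(self) -> bool:
        return True

    async def execute(self, prompt: str, on_event: EventCallback) -> AgentOutput:
        on_event("agent_turn_start", {"turn": 1})                  # loop iteration start
        on_event("agent_turn_start", {"turn": "awaiting_model"})   # just before the API call
        reply = prompt.upper()                                     # placeholder for a model call
        on_event("agent_message", {"content": reply})
        return AgentOutput(reply, self.model, tokens=3, input_tokens=1, output_tokens=2)

    async def close(self) -> None:
        return None


async def event_sequence(provider: AgentProvider, prompt: str) -> list[tuple[str, Any]]:
    """Run the full lifecycle and record (event_type, turn) pairs in order."""
    events: list[tuple[str, Any]] = []

    def record(event_type: str, payload: dict[str, Any]) -> None:
        events.append((event_type, payload.get("turn")))

    await provider.validate_connection()
    try:
        await provider.execute(prompt, record)
    finally:
        await provider.close()
    return events


seq_a = asyncio.run(event_sequence(FakeProvider("model-a"), "hi"))
seq_b = asyncio.run(event_sequence(FakeProvider("model-b"), "hi"))
assert seq_a == seq_b, "providers emit different event sequences"
```

Comparing only event types and turn markers, not message content, keeps such a check stable when the underlying models return different text.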