Changelog
Chronological log of features and fixes merged into main. For full detail see
each PR on GitHub.
2026-03-03
TaskPaths field rename and YAML/JSON nesting — PR #46
Renames the three TaskPaths fields to shorter names and aligns the YAML task
syntax and task_context.json with the Python dataclass structure.
TaskPathsfields renamed:src_paths→src,test_paths→test,spec_path→spec.- YAML
paths:nesting: task entries now use a nestedpaths:block instead of flat keys at the task level:paths: src: - src/agentrelaydemos/palindrome.py spec: specs/palindrome.md task_context.jsonnesting:write_task_context()andwrite_merger_task_context()now emit{"paths": {"src": [...], "test": [...], "spec": ...}}matching the YAML shape.WorktreeTaskRunner: constructor params andself.*attributes renamed accordingly;from_config()reads from the nestedpathsdict.CLAUDE.mdgraph YAML schema updated to show the new nested syntax.- 13 YAML files in
agentrelaydemos/main/graphs/converted to the new format.
No behaviour changes — purely a naming and structure improvement. 349 tests, pixi run check clean.
Key files: agent_task.py, agent_task_graph.py, run_graph.py, task_launcher.py,
worktree_task_runner.py.
Group src_paths/test_paths/spec_path into TaskPaths dataclass — PR #45
Introduces a TaskPaths frozen dataclass in agent_task.py to group the three path
fields, then moves AgentTask.src_paths, .test_paths, and .spec_path under a single
paths: TaskPaths attribute.
TaskPaths(src_paths, test_paths, spec_path)(new frozen dataclass): groups the three path fields with the same types and defaults as before.AgentTask.paths: TaskPathsreplaces the three flat fields; all access sites updated totask.paths.src_paths,task.paths.test_paths,task.paths.spec_path.AgentTaskGraphBuilder.from_yaml()updated: still reads flat YAML keys (src_paths:,test_paths:,spec_path:) and wraps them inTaskPaths(...).write_task_context()andWorktreeTaskRunner.from_config()unchanged at the JSON level (flat keys preserved in this PR; nested in PR #46).
349 tests, pixi run check clean.
Key files: agent_task.py, agent_task_graph.py, run_graph.py, task_launcher.py,
worktree_task_runner.py.
Demo YAML: split_tdd_single.yaml updated to three-node structured-roles graph — PR #42
Replaces the two-node split-TDD demo in agentrelaydemos with a three-node pipeline
that exercises the full set of structured-role features added in PRs #36–#41.
graphs/split_tdd_single.yaml(inagentrelaydemos): three tasks —write_roman_spec(role: spec_writer),write_roman_tests(role: test_writer),impl_roman(role: implementer).- Features exercised:
role:key on plain tasks,src_paths/test_pathsas structured lists, spec-in-source (docstrings as authoritative spec, no separate.md),description:omitted ontest_writerandimplementertasks, dispatch-time path validation, MERGER agent reviews implementer PR for spec integrity,design_concernsmechanism available to implementer,completion_gatewithcoverage_threshold: 95, graph-levelverbosity: standard. tdd_groupremoved: the new pipeline uses plain tasks with explicit roles instead of thetdd_groups:expansion macro.
pixi run check clean (no agentrelay source changes in this PR).
Key file: /data/git/agentrelaydemos/main/graphs/split_tdd_single.yaml.
Verbosity / ADR mechanism — PR #41
When a task's effective verbosity is above "standard", the agent now writes an
ADR (Architecture Decision Record) to docs/decisions/{task_id}.md as part of its
normal commit. The final graph PR body lists all ADRs produced, and
docs/decisions/index.md is updated on the graph branch before the final PR is
created.
_adr_step(task, graph) -> str(new, inrun_graph.py): returns an empty string when effective verbosity is"standard"; otherwise returns a numbered step instructing the agent to write an ADR atdocs/decisions/{task.id}.mdwith YAML front matter (task_id,graph,role,date,verbosity) and three standard sections (## Context,## Decision,## Consequences). At"educational"verbosity two additional sections are added (## Key Concepts,## Alternatives Considered). Always ends withgit add docs/decisions/{task.id}.md._effective_verbosity(task, graph) -> str: already present (PR 37 scaffold); now actively used by_adr_step.- All role prompt builders updated (
_build_spec_writer_prompt,_build_test_writer_prompt,_build_test_reviewer_prompt,_build_implementer_prompt,_build_generic_instructions): each accepts a new optionalgraph: AgentTaskGraph | None = Noneparameter and injects_adr_stepbetween the "do the work" step and the "stage, commit, push" step. Step numbers shift accordingly when the ADR step is active. _build_task_instructions()updated: acceptsgraph: AgentTaskGraph | None = Noneand passes it to each role builder._run_task()updated: passesgraphto_build_task_instructions()._run_graph_loop()updated: callswrite_adr_index_to_graph_branch()beforecreate_final_pr()when all tasks are done.scan_adr_section(graph_name, target_repo_root) -> str(new, intask_launcher.py): usesgit ls-treeto listdocs/decisions/*.mdon the graph branch andgit showto read their YAML front matter; returns a formatted## ADRs produced in this runsection for the final PR body (empty string if no ADRs found).write_adr_index_to_graph_branch(graph_name, target_repo_root, worktrees_root)(new, intask_launcher.py): creates a temporary detached worktree on the graph branch, writes/updatesdocs/decisions/index.md(a markdown table of all ADR files with task_id, role, date columns), commits, and pushes. Silent no-op when no ADR files are present. Stale worktrees at the index path are cleaned up before and after the operation._extract_front_matter_field(content, field) -> str | None(new private helper, intask_launcher.py): parses a YAML front-matter value from a markdown file string; used byscan_adr_sectionandwrite_adr_index_to_graph_branch.create_final_pr()updated: callsscan_adr_section()and appends the ADR listing to the PR body alongside the existing design-concerns section.
6 new tests (375 total, up from 369). pixi run check clean.
Key files: run_graph.py, task_launcher.py, test/test_run_graph.py.
2026-03-02
MERGER role and merge_history — PR #40
Before the orchestrator calls gh pr merge, it now launches a MERGER agent to review
the PR in a tmux pane with CWD = target_repo_root. The agent checks spec integrity
for IMPLEMENTER PRs, logs findings to a graph-level merge_history.md, and returns
a verdict. The orchestrator merges only on approval.
_build_merger_prompt(reviewed_task, pr_url, history_path) -> str(new, inrun_graph.py): builds instructions for the MERGER agent. For IMPLEMENTER tasks withsrc_paths, includes a docstring integrity check (verifies docstrings were not materially altered — additive changes OK, changed signatures/Args/Returns/Raises NOT OK). Always instructs the agent to append a## Review: {task_id} — {timestamp}entry tomerge_history.mdand callmark_done("approved")ormark_done("rejected: {reason}")(nevermark_failed)._launch_merger(reviewed_task, pr_url, graph, tmux_session) -> str(new, inrun_graph.py): creates the merger signal dir at.workflow/{graph}/merge_reviews/{task_id}/, writestask_context.jsonandinstructions.md, launches claude in a new tmux pane with CWD =target_repo_root, returns the pane_id._run_task()updated: aftersave_pr_summaryand beforemerge_pr, launches the MERGER, polls for its completion sentinel, reads the verdict, and closes the pane. If the verdict does not start with"approved", sets status toFAILEDand returns without merging.merge_history_path(graph_name, target_repo_root) -> Path(new, intask_launcher.py): returns.workflow/{graph_name}/merge_history.md.poll_for_completion_at(signal_dir, poll_interval) -> str(new async, intask_launcher.py): polls an arbitrarysignal_dirfor.doneor.failed; variant ofpoll_for_completionthat accepts aPathdirectly.launch_agent_in_dir(cwd, task_id, tmux_session, signal_dir, model) -> str(new, intask_launcher.py): launches a claude agent with a given CWD without requiring a worktree (used by MERGER which runs intarget_repo_root).read_done_note_at(signal_dir) -> str(new, intask_launcher.py): reads line 2 of.donefrom an arbitrarysignal_dir; complement toread_done_note.close_pane_by_id(pane_id) -> str(new, intask_launcher.py): kills a tmux window by pane_id; used to close the MERGER pane after polling completes.write_merger_task_context(...)(new, intask_launcher.py): writes a minimaltask_context.jsonfor the MERGER agent (role=merger, task_id, signal_dir, graph_name, graph_branch, src_paths).
7 new tests (367 total, up from 360). pixi run check clean.
Key files: run_graph.py, task_launcher.py, test/test_run_graph.py.
Gate failure recording in merge_history.md
When the orchestrator's completion gate fails, the failure is now appended to the
graph-level .workflow/{graph}/merge_history.md alongside MERGER review entries,
making the full PR lifecycle auditable from a single file.
record_gate_failure(task_id, pr_url, gate_cmd, graph_name, target_repo_root)(new, intask_launcher.py): appends a## Gate failure: {task_id} — {ts}section with the PR URL, verdict, and gate command._run_task()updated: callsrecord_gate_failure()in the gate-failure branch, betweensave_pr_summaryandtask.state.status = TaskStatus.FAILED.
2 new tests (369 total, up from 367). pixi run check clean.
Key files: task_launcher.py, run_graph.py, test/test_task_launcher.py.
design_concerns mechanism — PR #39
Agents (primarily IMPLEMENTER) can now record concerns about the spec, tests, or
architecture during implementation. Concerns accumulate in a per-task
design_concerns.md file and are surfaced in the final graph PR body.
WorktreeTaskRunner.record_concern(concern: str) -> None(new): appends a timestamped## Concern recorded at {ISO timestamp}entry to$AGENTRELAY_SIGNAL_DIR/design_concerns.md. Creates the signal dir and file if absent. Each call appends — multiple concerns are accumulated in one file.read_design_concerns(signal_dir: Path) -> str | None(new, intask_launcher.py): returns the stripped file contents ifdesign_concerns.mdexists and is non-empty, elseNone.create_final_pr()updated: scans.workflow/{graph}/signals/*/design_concerns.mdbefore creating the final PR. If any tasks recorded concerns, a## Design concerns raised during implementationsection is appended to the PR body with a### {task_id}subheading per task._build_implementer_prompt()updated: step 4 now instructs the agent to callrecord_concern()for any concerns encountered during implementation (before the commit/push step). Steps 4→5 and 5→6 renumbered accordingly._run_task()updated: readsdesign_concerns.mdafter gate passes; if concerns exist, callsappend_concerns_to_pr()to add them to the task PR body before savingsummary.mdand merging. On gate failure,summary.mdis saved immediately for debugging before the early return.append_concerns_to_pr(pr_url, concerns)(new, intask_launcher.py): fetches the current task PR body, appends a## Design concerns raised during implementationsection, and updates the PR via the GitHub REST API. This ensures concerns appear in the task PR (merged into the graph integration branch) as well as in the final graph PR.save_pr_summary()call reordered: now called after concerns are appended (when gate passes) sosummary.mdalways reflects the final PR body state.
9 new tests (360 total, up from 351). pixi run check clean.
Key files: worktree_task_runner.py, task_launcher.py, run_graph.py,
test/test_worktree_task_runner.py, test/test_task_launcher.py.
Dispatch-time path validation — PR #38
Adds validate_task_paths(task, worktree_path) to run_graph.py. After a worktree is
created and before the agent is launched, the orchestrator checks that required files and
directories are present. This surfaces missing prerequisites (e.g. SPEC_WRITER stubs not
yet merged) with a clear error message rather than letting an agent fail mid-task.
validate_task_paths(task, worktree_path) -> None(new): role-aware validation:SPEC_WRITER: parent directory of eachsrc_pathsentry must exist; ifspec_pathis set, its parent directory must also exist.TEST_WRITER: eachsrc_pathsfile must exist (stubs from SPEC_WRITER must be merged in); parent directories oftest_pathsentries must exist.IMPLEMENTER: eachsrc_pathsfile must exist; eachtest_pathsfile must exist.MERGER,GENERIC: no validation (pass unconditionally).- Raises
ValueErrorwith a message containing the task id and missing path. _run_task()updated: callsvalidate_task_pathsaftercreate_worktree()and beforewrite_task_context(). OnValueError, prints the error, sets status toFAILED, and returns (worktree cleanup still runs via thefinallyblock).test/test_path_validation.py(new): 14 unit tests covering all roles and both the happy path and error cases.
SPEC_WRITER prompt, _spec_reading_step, and updated role prompts — PR #37
Implements the prompt-building layer for the structured-roles design. All new prompt
functions and helpers land in run_graph.py; no behaviour changes to the orchestrator
loop or data model.
_effective_verbosity(task, graph) -> str: returnstask.verbosity or graph.verbosity or "standard". Scaffolded here for use by PR 6's ADR mechanism._spec_reading_step(task) -> str: returns a "Before starting, read…" preamble whentask.src_pathsortask.spec_pathis set; returns""otherwise. Injected into all non-SPEC_WRITER role prompts so agents always read the authoritative spec before acting._build_spec_writer_prompt(task, graph_branch) -> str(new): dispatched forAgentRole.SPEC_WRITER. Instructs the agent to create stub files with full Google-style docstrings andraise NotImplementedErrorbodies, optionally write a supplementaryspec_path.md index, verify importability, commit, push, and create a PR. Ends with "Do NOT implement any function or method bodies."_build_task_instructions(): SPEC_WRITER dispatch added at the top (before TEST_WRITER)._build_test_writer_prompt()updated: stub-creation step removed (SPEC_WRITER already produced stubs); spec-reading preamble injected; referencestask.src_paths("already exist — do NOT create or overwrite them") andtask.test_pathswhen set._build_test_reviewer_prompt()updated: spec-reading preamble injected._build_implementer_prompt()updated: spec-reading preamble injected; referencestask.src_pathsandtask.test_pathswhen set; adds explicit docstring-preservation instruction ("MUST preserve all existing docstrings … do NOT alter Args/Returns/Raises … docstrings are the specification contract")._build_generic_instructions()updated: spec-reading preamble injected immediately before "1. Do the work described in your task."
14 new tests (337 total, up from 323). 2 obsolete TEST_WRITER tests removed
(old stub-creation assertions no longer apply). pixi run check clean.
Key file: run_graph.py.
2026-03-01
Structured roles, src_paths/test_paths/spec_path/verbosity, SPEC_WRITER/MERGER roles — PR #36
Core data model and YAML parser additions for the structured-roles design. No behavior changes to existing agents; purely a foundation for subsequent PRs.
AgentRole.SPEC_WRITERandAgentRole.MERGER: two new enum members support the spec-in-source and PR-review workflows described in the design plan.AgentTask.descriptionoptional (default""): non-SPEC_WRITER roles can omit the description entirely; role + structured path fields carry the instruction content.src_paths: tuple[str, ...]: list of source files the task operates on. For SPEC_WRITER: files to create as stubs+docstrings. For IMPLEMENTER: files to fill in.test_paths: tuple[str, ...]: list of test files. TEST_WRITER creates them; IMPLEMENTER reads them as the test contract.spec_path: str | None: optional secondary spec.mdfor cases where the spec can't fit in source comments.verbosity: str | None: task-level verbosity override ("standard"/"detailed"/"educational");Nonemeans inherit from graph.AgentTaskGraph.verbosity: str = "standard": graph-level default verbosity parsed from YAMLverbosity:key.- YAML
role:key on plain tasks: parser now mapsrole: spec_writer→AgentRole.SPEC_WRITERetc.; invalid values raiseKeyErrorimmediately. task_context.json:write_task_context()serialises all new fields.WorktreeTaskRunner:from_config()readssrc_paths,test_paths,spec_path, andverbosityfromtask_context.json.- tdd_groups unchanged: all new capabilities apply to plain tasks only.
46 new tests (323 total, up from 277). pixi run check clean.
Key files: agent_task.py, agent_task_graph.py, task_launcher.py,
worktree_task_runner.py, test/test_agent_task.py, test/test_agent_task_graph.py,
test/test_task_launcher.py, test/test_worktree_task_runner.py.
Rename retries→attempts, task_params, review_on_attempt — PR #35
Three design improvements building on PR #34's gate machinery:
- Rename "retries" → "attempts" throughout:
max_gate_retries→max_gate_attempts,DEFAULT_GATE_RETRIES→DEFAULT_GATE_ATTEMPTS,gate_retries.log→gate_attempts.log,merge_pr(retries=6)→merge_pr(attempts=6). "Attempt 1" is unambiguous; "retry #1" is not (does it mean the first overall try or the second?). task_params: dict[str, Any]replaces the typedcoverage_thresholdfield. A generic{key}→str(val)substitution dict avoids schema churn for future gate-command parameters. A new_resolve_gate(task)helper inrun_graph.pyperforms the substitution before execution.coverage_thresholdmoves intotask_paramsin both the YAML andtask_context.json. YAML:task_params: {coverage_threshold: 90}(int preserved; coerced to str at substitution time).review_on_attempt: int = 1delays the self-review subagent until a specific gate attempt. Default1keeps the existing behavior (review before first attempt). Settingreview_on_attempt: 2causes the review instruction to appear inside the gate loop instead of before it, worded as "Attempt 2+: before running the gate on attempt 2 or later, spawn a self-review subagent."
277 tests (up from 270). Demo graphs in agentrelaydemos updated to use
max_gate_attempts and task_params: blocks.
Key files: agent_task.py, agent_task_graph.py, run_graph.py,
task_launcher.py, worktree_task_runner.py.
coverage_threshold, review_model, max_gate_retries, gate instruction hardening — PR #34
Three new optional AgentTask fields extend the completion_gate machinery:
coverage_threshold: int | None: substituted into{coverage_threshold}in the gate command string;run_completion_gate()performs the substitution before exec.review_model: str | None: model ID for a self-review subagent that the agent spawns after completing its main work; included ininstructions.mdwhen set.max_gate_retries: int | None: per-task override for gate retry count; falls back to the graph-widemax_gate_retriesYAML key (defaultDEFAULT_GATE_RETRIES = 5).
Gate instruction improvements to prevent agents from ignoring non-zero gate exits:
- "Exit code is the only accepted truth" paragraph explicitly overrides prior observations.
- tee + PIPESTATUS[0] pattern captures gate output to gate_last_output.txt while
preserving the gate command's exit code through the pipe.
- Steps use "gate_exit is 0" / "gate_exit is non-zero" language to remove ambiguity.
- mark_failed() instruction references gate_last_output.txt for post-mortem.
merge_pr() now retries up to 6× (5 s delay) on transient GitHub "not mergeable"
responses that occur after neutralize_pixi_lock_in_pr pushes a new commit.
WorktreeTaskRunner.record_gate_attempt(n, passed) appends a timestamped JSONL
entry to gate_retries.log in the signal directory so retry history is auditable.
270 tests (up from 258). Demo graphs in agentrelaydemos:
covg_and_retry_single.yaml, covg_and_retry_chained.yaml.
Key files: agent_task.py, agent_task_graph.py, run_graph.py,
task_launcher.py, worktree_task_runner.py.
AgentTask structure — completion_gate, agent_index, instructions.md channel — PR #33
Three coordinated improvements giving AgentTask more structure and reducing
reliance on large ephemeral f-string prompts:
completion_gate: optional shell command onAgentTask(YAMLcompletion_gate:key). The orchestrator runs it in the worktree after.doneis seen but before merging the PR. Non-zero exit marks the taskFAILEDwithout merging. The agent also self-validates via a step included ininstructions.md(defense-in-depth).agent_index: monotonically increasing integer assigned at dispatch time. Stored inTaskState, written totask_context.json, shown in log messages as[a{N}].AgentTaskGraphgains_agent_counter+next_agent_index().instructions.mdas the sole instruction channel: all role-specific steps move to a file written in the signal directory before the agent launches. The tmux prompt becomes a minimal bootstrap string.task_context.jsonis now a rich structured record (role,description,graph_branch,model,completion_gate,agent_index).WorktreeTaskRunnergains the new fields andget_instructions().BACKLOG.md: added 5 entries from a review of the originalagentrelayproject.
Scope: AgentTask and GENERIC role only. TDDTaskGroup intentionally untouched.
Key files: agent_task.py, agent_task_graph.py, run_graph.py,
task_launcher.py, worktree_task_runner.py.
2026-02-27
Move task_context.json and context.md to .workflow/ — PR #32
Both orchestration context files are now written under
.workflow/<graph>/signals/<task-id>/ instead of the worktree root.
task_context.json: was written to the worktree root (gitignored in target repos); now lives alongside the other signal files (.done,.merged,agent.log, etc.) and persists after worktree cleanup for easier debugging.context.md: was written to the worktree root with no gitignore entry, causing it to be picked up bygit add -Aand committed into PRs/main. Now also under.workflow/, so it is automatically gitignored and persists after teardown.AGENTRELAY_SIGNAL_DIRenv var: exported bylaunch_agent()to the tmux pane;WorktreeTaskRunner.from_config()reads it to locatetask_context.jsonwithout needing the worktree path. Theworktree_pathparameter offrom_config()is removed.WorktreeTaskRunner.get_context(): now reads fromself.signal_dir / "context.md"instead ofPath.cwd() / "context.md".
Key files: task_launcher.py, worktree_task_runner.py, run_graph.py.
reset_graph: skip git reset on out-of-order runs — PR #29
reset_graph.py now automatically skips the git reset --hard + force-push step
when the recorded start_head is not an ancestor of the current HEAD (i.e. the
reset would move main forward and re-introduce already-reset commits). All other
cleanup still runs: open PRs are closed, remote task branches are deleted, leftover
worktrees are removed, and the .workflow/ signal directory is cleared. The warning
message is updated to communicate that step 2 will be skipped rather than implying
the user must decide.
Correct reset order: most-recently-run graph first (reverse run order).
Key file: reset_graph.py.
Typed AgentTask.dependencies and TaskGroup ABC — PR #28
Replaced string-based dependency references with typed object references throughout the task model and graph builder.
TaskGroupABC added toagent_task.py: abstract frozen dataclass withid,description, and abstractdependency_idsproperty. Provides a common type for groups that expand to multiple tasks.AgentTask.dependencieschanged fromtuple[str, ...]totuple[AgentTask, ...]; adependency_idscomputed property reconstructs the string tuple when needed. No call sites that already heldAgentTaskobjects are broken.TDDTaskGroup(inagent_task_graph.py) extendsTaskGroupwith two typed dep fields:dependencies_single_task: tuple[AgentTask, ...]anddependencies_task_group: tuple[TaskGroup, ...]. Itsdependency_idsproperty concatenates both.- Topological sort (
_topo_sort, Kahn's algorithm) added toagent_task_graph.pyso the builder can construct frozen dataclasses in dependency-first order. _build_context_content()inrun_graph.pysimplified: no longer needs the graph as a parameter; iteratestask.dependenciesdirectly.- Tests updated across
test_agent_task.py,test_agent_task_graph.py, andtest_run_graph.py— 176 tests total.
Key files: agent_task.py, agent_task_graph.py, run_graph.py.
TDD workflow: AgentRole, TDDTaskGroup, role-specific prompts — PR #26
Added first-class TDD workflow support via a tdd_groups: YAML key that
auto-expands a single group entry into three sequential AgentTask objects —
test-writer, reviewer, and implementer — each dispatched as its own
worktree-PR-merge cycle.
AgentRoleenum (GENERIC,TEST_WRITER,TEST_REVIEWER,IMPLEMENTER) added toagent_task.py.AgentTaskgains two new optional fields:role(defaultGENERIC) andtdd_group_id(defaultNone) — fully backward-compatible with existing plaintasks:graphs.TDDTaskGroupdataclass added toagent_task_graph.py. It is a transient build-time concept:from_yaml()expands each group into{id}_tests,{id}_review, and{id}_implAgentTaskobjects, then discards the group.AgentTaskGraphstays a flatdict[str, AgentTask]. Thetasks:YAML key is now optional (a YAML with onlytdd_groups:is valid).- Cross-group dependencies: a dep string that matches another group's
idresolves to{dep}_implat build time; plain task IDs pass through unchanged. - Role-specific prompts:
_build_task_prompt()inrun_graph.pydispatches to_build_test_writer_prompt,_build_test_reviewer_prompt,_build_implementer_prompt, or_build_generic_promptbased ontask.role. Reviewer writes{task.id}.md(e.g.stats_module_review.md); implementer derives the review filename fromtask.id. - New test coverage: new
test/test_run_graph.py; additions totest_agent_task.pyandtest_agent_task_graph.py— 171 tests total.
Key files: agent_task.py, agent_task_graph.py, run_graph.py.
2026-02-26
BACKLOG.md for quick idea capture — PR #25
Added docs/BACKLOG.md — a lightweight place to park ideas immediately without
interrupting a current task. Three sections: Features, Improvements, Ideas/Maybe.
Items graduate to HISTORY.md when done. Added a pointer in CLAUDE.md so new
sessions see it.
Black + isort formatting pass — PR #24
Ran pixi run check (introduced in PR #23) across the whole codebase to apply
black line wrapping, blank-line normalization, and isort import ordering. No
functional changes — pure formatting.
CLAUDE.md, OPERATIONS.md, HISTORY.md, pixi run check — PR #23
Three usability improvements landed together:
CLAUDE.mdat repo root: concise project reference auto-loaded by Claude Code at session start — commands, module map, signal files, YAML keys, and links to detailed docs.pixi run check: one command for format + typecheck + test — the canonical pre-PR verification step.docs/OPERATIONS.md: day-to-day running guide (running a graph, reading results, resetting after a run, common failure modes).docs/HISTORY.md: per-PR development log covering all 23 PRs to that point.
Agent PR body capture + summary.md — PR #22
Agents are now prompted to write a real ## Summary / ## Files changed PR body
instead of the placeholder "Automated task.". Before merging, the orchestrator
fetches the body via gh pr view --json body and writes it to
.workflow/<graph>/signals/<task-id>/summary.md. The final status report shows
the summary path for each completed task.
Key functions: save_pr_summary() in task_launcher.py.
Configurable tmux session + keep_panes + graph reset — PR #21
Three related usability improvements:
tmux_sessiongraph YAML field +--tmux-sessionCLI flag: controls which tmux session agent windows open in (default:"agentrelay").keep_panesgraph YAML field +--keep-panesCLI flag: suppresses auto-close of agent tmux windows after task completion (useful for debugging).reset_graph.pymodule +record_run_start():python -m agentrelay.reset_graphreadsrun_info.json(written at graph start) and fully resets the target repo — closes open PRs,git reset --hard+ force-push, deletes remote branches, removes worktrees and signal dirs.
2026-02-25
pixi.lock neutralize-and-regenerate — PR #20
Eliminates pixi.lock merge conflicts when parallel agents both modify pixi.toml.
Before merging any PR, the orchestrator restores main's current pixi.lock into
the agent branch (neutralizing the agent's version), then after merging runs
pixi install and commits the freshly resolved lock file to main.
Key functions: neutralize_pixi_lock_in_pr(), commit_pixi_lock_to_main().
pixi.toml change detection — PR #19
After merging a PR, the orchestrator checks whether pixi.toml changed (via
gh pr view --json files). If so, it runs pixi install on the target repo to
keep the environment in sync.
Key functions: pixi_toml_changed_in_pr(), run_pixi_install().
Agent log capture + sibling repo support + target_repo_root rename — PR #18
- Agent log capture: after each task, the orchestrator captures the full tmux
pane scrollback and writes it to
.workflow/<graph>/signals/<task-id>/agent.log. - Sibling repo support:
AgentTaskGraphBuilder.from_yaml()accepts atarget_repo_rootoverride, enabling the orchestrator to drive tasks in a separate sibling repository (e.g.agentrelaydemos). - Rename:
repo_root→target_repo_rootthroughout for clarity.
Bypass --dangerously-skip-permissions dialog — PR #17
Fixed the send_prompt() timing sequence to correctly navigate the Claude Code
confirmation dialog (Down + Enter) before sending the task prompt.
2026-02-24
Multi-task async graph dispatch loop — PR #14
Core orchestrator loop (run_graph.py): loads a YAML graph, resolves task
dependencies, dispatches ready tasks concurrently via asyncio, polls for
completion signals, and merges PRs in order. Supports resuming interrupted runs
(signals persist across restarts).
pull_main after merge — PR #11
After merging a task PR, the orchestrator fast-forwards the local main branch
(git pull --ff-only) so subsequent tasks branch from the correct HEAD.
.workflow/ gitignored; graph YAML convention — PR #12
All runtime state (.workflow/) is gitignored. Graph definitions (graphs/)
are version-controlled. This keeps the working tree clean and makes graph
definitions reviewable and reusable.
PR workflow + timing fixes — PR #5
Agents create their own PRs and write the URL into the .done signal file.
Fixed timing delays in send_prompt() to ensure Claude finishes initializing
before the task prompt is sent.
PATH propagation in launch_agent — PR #7
Exports the orchestrator's PATH into each agent's tmux environment so tools
like gh, pixi, and claude are available in agent bash subshells.
2026-02-23 — 2026-02-24 (initial build)
Foundation — PRs #1–#3
- Project scaffolding:
pyproject.toml,pixi.toml, source layout - Documentation:
PROJECT_DESCRIPTION.md,WORKFLOW_DESCRIPTION.md,DESIGN_DECISIONS.md,REPO_SETUP.md - Core infrastructure:
AgentTask,AgentTaskGraph,AgentTaskGraphBuilder,WorktreeTaskRunner,task_launcherfunctions, demo script, initial tests