Top-level Workflows
Two slash commands in this plugin are orchestrator-only: they do not implement, review, or merge anything themselves — they sequence other skills and agents through their phases, holding the human gates between them.
- /ship: spec → plan → build → review → PR, end-to-end.
- /test-improve: seven-phase consolidated analyze-then-improve orchestrator for legacy or in-flight test suites.
Both follow the same orchestrator contract: delegate every phase to the owning skill or agent, honor human gates, surface ambiguous inputs in one batch up front, and report outcomes concisely.
/ship
File: skills/ship/SKILL.md
Role: orchestrator.
Use when: the user says "ship this", "take this feature end to end", or
wants the spec → plan → build → PR flow without re-assembling it each time.
Steps
| # | Step | Delegates to | Human gate after? |
|---|---|---|---|
| 1 | Approach contract: screen the request against knowledge/decision-defaults.md; resolve ambiguous high-reversal-cost axes in one batch. | (orchestrator only) | yes, if a blocker remains |
| 2 | Spec (skipped with --skip-spec): produce Intent, Architecture, Acceptance Criteria. | /specs | yes (operator approves the spec) |
| 3 | Plan: decompose into vertical slices with Gherkin scenarios; a tier-scaled set of plan-review personas (1–5, by plan complexity) runs in parallel before the gate. | /plan | yes (operator approves the plan) |
| 4 | Build: RED-GREEN-REFACTOR per slice, inline review checkpoints, verification evidence. Do not proceed until the suite is green. | /build | no |
| 5 | Review: run quality-review agents and let the auto-fix loop converge. Only judgment-call findings escalate to the operator. | /code-review | no |
| 6 | PR: pre-PR quality gate, open PR, arm auto-merge by default (--no-auto-merge to opt out). | /pr | yes (the PR is the final review artifact) |
| 7 | Report: PR URL, quality-gate result, whether auto-merge is armed. | (orchestrator only) | n/a |
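The sequencing in the table can be sketched in a few lines of Python. This is only an illustration of the "delegate, then hold at the gate" pattern, not the plugin's implementation: `Step`, `SHIP_STEPS`, `run_ship`, and the `run_skill` / `approve` callbacks are hypothetical names, and the orchestrator-only steps 1 and 7 are left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    delegate: str      # skill that owns the step, e.g. "/plan"
    gate_after: bool   # True if a human must approve before continuing


# Order and gates copied from the table; steps 1 and 7 (orchestrator only)
# are omitted from this sketch.
SHIP_STEPS = [
    Step("spec", "/specs", True),
    Step("plan", "/plan", True),
    Step("build", "/build", False),
    Step("review", "/code-review", False),
    Step("pr", "/pr", True),
]


def run_ship(
    run_skill: Callable[[str], bool],   # hypothetical: True if the skill succeeded
    approve: Callable[[str], bool],     # hypothetical: True if the operator approves
    skip_spec: bool = False,
) -> List[str]:
    """Run the steps in order; stop at the first failed step or declined gate."""
    completed: List[str] = []
    for step in SHIP_STEPS:
        if skip_spec and step.name == "spec":
            continue
        if not run_skill(step.delegate):
            break  # the owning skill stopped, so the orchestrator stops with it
        if step.gate_after and not approve(step.name):
            break  # human gate not passed
        completed.append(step.name)
    return completed
```

The orchestrator holds no logic of its own beyond ordering: every decision is made either by the delegated skill or by the operator at a gate.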
Agents involved (dispatched by the delegated skills)
- /build's inline review checkpoints dispatch the review agents listed in team-structure.md → Review Agent Dispatch.
- /plan dispatches a tier-scaled subset of the five plan-review personas (prompts/plan-review-*.md) plus the progress-guardian gate-keeper. The Acceptance Test Critic always runs; the rest are added as the plan's tier (trivial/standard/complex) warrants.
- /code-review re-runs the same review agents over the full changeset.
Arguments
/ship <feature-description> [--skip-spec] [--no-auto-merge]
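To make the argument shape concrete, here is a minimal sketch that parses the same signature with Python's standard `argparse`. How the plugin itself reads its arguments is not stated in this document; `parse_ship_args` is a hypothetical helper for illustration only.

```python
import argparse
import shlex


def parse_ship_args(raw: str) -> argparse.Namespace:
    """Parse a /ship argument string (illustrative, not the plugin's parser)."""
    parser = argparse.ArgumentParser(prog="/ship")
    parser.add_argument("feature_description", nargs="+")
    parser.add_argument("--skip-spec", action="store_true")
    parser.add_argument("--no-auto-merge", action="store_true")
    ns = parser.parse_args(shlex.split(raw))
    # Join the words back into one description string.
    ns.feature_description = " ".join(ns.feature_description)
    return ns


args = parse_ship_args('"add login page" --skip-spec')
print(args.feature_description, args.skip_spec, args.no_auto_merge)
```

Both flags default to off, which matches the table: the spec step runs and auto-merge is armed unless the operator opts out.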
Notes
- Sequencing only: every gate, fix loop, and evidence requirement comes from the underlying skills. If any phase stops at a gate, /ship stops with it.
- For a plan-only pass, use /plan; for build-only, use /build.
- Resume across sessions with /continue.
/test-improve
File: skills/test-improve/SKILL.md
Role: orchestrator.
Consolidated analyze-then-improve test orchestrator. Defaults to lightweight ceremony, prompts for heavier capabilities on demand, and always baselines coverage (and mutation, when enabled) before any test change.
Phases
Each phase writes a progress file to
memory/test-improve/<slug>/phase-<n>.md so /continue (and --from-phase)
can resume.
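One way such a resume could work is sketched below. It rests on assumptions not confirmed here: that the existence of `phase-<n>.md` means phase `n` completed, and that only integer phase numbers appear in file names (how sub-phases such as 2b are named is not specified). `next_phase` is a hypothetical helper.

```python
import re
from pathlib import Path
from typing import Optional

# Matches integer phase files only, e.g. "phase-3.md" (assumed naming).
PHASE_FILE = re.compile(r"^phase-(\d+)\.md$")


def next_phase(memory_dir: Path, from_phase: Optional[int] = None) -> int:
    """Return the phase to resume at.

    An explicit from_phase (as with --from-phase) wins; otherwise resume
    after the highest phase that already has a progress file.
    """
    if from_phase is not None:
        return from_phase
    done = [
        int(m.group(1))
        for p in memory_dir.glob("phase-*.md")
        if (m := PHASE_FILE.match(p.name))
    ]
    return max(done) + 1 if done else 0
```

Files such as `phase-4-review.json` are ignored by the pattern, so review evidence does not affect the resume point in this sketch.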
- Phase 0: Approach contract. Batched prompt (Enter accepts all defaults): mutation [off], BDD rubric [none], refactor [no-refactor], quality targets, sink (--parent <url> vs local files). Go stack shows the alpha go-mutesting advisory before the mutation prompt. Answers are immutable for the run.
- Phase 1: Analyze. Delegate to /test-health (sole worker). No separate calls to /cd-test-architecture, /test-design, /mutation-testing. Mutation section respects Phase-0 setting.
- Phase 2: Baseline (before any test edit). /coverage-baseline --workflow test-improve unconditionally; /mutation-testing --baseline --workflow test-improve when mutation is on. Go = advisory-only marker. Honest score = hard kills, timeouts separate.
- Phase 2b: Derive Gherkin (conditional). none skips entirely; xunit-with-annotations writes .feature files without a runner; bdd-runner wires the native parser.
- Phase 3: Triage. /issues-from-assessment --workflow test-improve partitions findings into NO_REFACTOR (Phase-4 Stories) / REFACTOR_REQUIRED (deferred to Phase 5) / LOW_VALUE (advisory-only).
- Phase 4: Improve without refactoring. Per Story: /build (no-refactor) → /coverage-delta --workflow test-improve --story <id> → mutation-kill agent (--file <story-file> --max-rounds 3, [c/r/w/q] on residuals). End-of-phase review loop runs /test-design --since and /code-review --since in parallel, /apply-fixes then re-run, cap 2 iterations, [r/w/q] escalation. Evidence in phase-4-review.json.
- Phase 4b: Refactor decision prompt. [y] enter Phase 5 / [b] backlog and skip to Phase 6 / [q] quit.
- Phase 5: Refactor-for-testability (conditional). Only when [y]. Seam-only production-code changes; existing tests are immutable. Same end-of-phase review loop; evidence in phase-5-review.json.
- Phase 6: Validate. /quality-targets-converge --workflow test-improve. Mutation off = skipped (not waived). Go = advisory-only. Coverage < 90% in no-refactor mode → [y/n] re-run-in-refactor-allowed prompt lists backlogged items.
- Phase 7: Executive-summary report. Interpolates the shipped templates/executive-summary.md from memory/test-improve/<slug>/ files to reports/test-improve/<repo-slug>-<date>.md. 10 numbered sections; empty sections render "Not applicable" (never omitted). Parent tracker (or plans/test-improve/FEATURE.md) is updated with a link to the report. Report is regeneratable from memory.
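The end-of-phase review loop used in Phases 4 and 5 is a bounded fix-and-recheck loop. The sketch below shows one reading of it, under the assumption that an "iteration" means one apply-fixes pass followed by a re-run of both reviews. `run_reviews`, `review_loop`, and the callback signatures are hypothetical; the real skills are invoked as slash commands, not Python functions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

MAX_FIX_ITERATIONS = 2  # the cap stated for the end-of-phase review loop

Reviewer = Callable[[], List[str]]  # hypothetical: returns open findings


def run_reviews(reviewers: Sequence[Reviewer]) -> List[str]:
    """Run all reviewers in parallel and merge their findings."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda review: review(), reviewers))
    return [finding for result in results for finding in result]


def review_loop(
    reviewers: Sequence[Reviewer],
    apply_fixes: Callable[[List[str]], None],
) -> List[str]:
    """Return findings still open after the capped loop (empty = converged)."""
    findings = run_reviews(reviewers)
    iterations = 0
    while findings and iterations < MAX_FIX_ITERATIONS:
        apply_fixes(findings)
        findings = run_reviews(reviewers)
        iterations += 1
    return findings  # anything left is what gets escalated to the operator
```

A non-empty return value corresponds to the [r/w/q] escalation: the loop has used its two iterations and the operator decides what happens next.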
Invocation
/test-improve <repo-path> [--parent <url>] [--analyze-only] [--from-phase <n>] [--stack <id>]
Flow diagram:
diagrams/test-improve-flow.svg.
Why these are documented together
/ship and /test-improve are the two multi-phase pipelines with
inter-phase human gates in the plugin. Every other slash
command is either a single-step worker (e.g. /coverage-baseline,
/triage) or a one-shot orchestrator that returns in a single pass (e.g.
/code-review, /test-design). Knowing the phase order, the owning skill
or agent for each step, and where the human gates fall is the difference
between operating these workflows confidently and re-reading every SKILL.md
each time.