Recommendation: mechanically enforce "prefer Python over bash"¶
Issue: #701 — exploration only, no mechanism implemented here. Follow-up implementation issue: #702 (see Follow-up for scope).
Problem¶
CLAUDE.md's Working Rules say "Prefer Python over bash, repo-wide, unless bash
is strictly required." Today that rule is prose, caught only if a human or
review agent happens to notice a new .sh/.bats file in a diff. It has
already regressed once: issue #700 documents
tests/skills/mutation_kill_slice_loop_refinements_tests.bats landing via
issues #667/#681 after epic #668 had ported every tests/skills/*.bats file
to pytest. Per this repo's own stance ("rules the agent should follow land as
hooks / ci-local.sh checks, not prose that can be ignored" —
feedback_prefer_hooks_over_prose_enforcement), this is a mechanical-gate gap.
What already exists (and isn't wired in)¶
scripts/check-python-only.py already implements the diff-based mechanism that
option 1 in the issue describes: it runs git diff --diff-filter=A --name-only
<base>...HEAD, filters for newly added .sh files, checks them against an
AUDIT_EXCLUSIONS set, and reports violations. It was added in PR #581
(commit f2fb9e9d) as the ADR 0014 enforcement script promised in that ADR's
"Enforcement" line. It has four material gaps:
- Not wired into anything. It appears in no ci-local.sh check, no .github/workflows/*.yml job, and no pre-push path; grep -rl check-python-only tests/ .github/ returns nothing. It has run zero times in CI since it was written: dead code, not a dead gate.
- Scoped only to plugins/dev-team/. #700's regression was under tests/skills/, outside this script's path filter, so even wired in it would have missed the actual regression that motivated #701.
- Only checks .sh, not .bats. #700's regression file was .bats.
- Advisory-only by design (--block is opt-in), because ADR 0014 gated blocking mode on "the epic's Phase 3 gate." ADR 0015 (2026-07-02) now records that phase as complete for plugins/dev-team/, so the gate this script was waiting on has landed.
The right move is to extend and wire in this existing script, not write a
new one. It already has the correct diff-based shape (--diff-filter=A, an
explicit exclusions set, --base/--block/--list flags) that issue #701's
option 1 asks for, and it already has ADR 0014 as its authority: no new ADR
is needed, just an update noting that the enforcement finally landed.
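To make the gaps concrete, here is a minimal sketch of the current behaviour as this section describes it. It is not the script's actual source: the function names, the SCOPE_PREFIX constant, and the contents of AUDIT_EXCLUSIONS are assumptions for illustration.

```python
import subprocess

# Assumed values for illustration; the real script's constants may differ.
SCOPE_PREFIX = "plugins/dev-team/"
AUDIT_EXCLUSIONS = {"plugins/dev-team/install.sh"}


def added_files(base: str = "origin/main") -> list[str]:
    """Return paths added between the merge-base of `base` and HEAD."""
    result = subprocess.run(
        ["git", "diff", "--diff-filter=A", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def current_violations(paths: list[str]) -> list[str]:
    """Apply today's narrow filter: .sh only, under plugins/dev-team/ only."""
    return [
        p for p in paths
        if p.startswith(SCOPE_PREFIX)
        and p.endswith(".sh")
        and p not in AUDIT_EXCLUSIONS
    ]


if __name__ == "__main__":
    # The #700 regression file is in the wrong directory and has the wrong
    # extension for this filter, so it is not reported.
    regression = "tests/skills/mutation_kill_slice_loop_refinements_tests.bats"
    print(current_violations([regression, "plugins/dev-team/new_tool.sh"]))
```

Run as a script, this prints only the hypothetical plugins/dev-team/new_tool.sh path. The .bats file under tests/skills/ passes through silently, which is the miss that the scope and extension gaps describe.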
Recommended mechanism: extend check-python-only.py + wire into both gates¶
Chosen from the issue's three options: option 1 (CI/local diff gate),
generalized to cover .bats and a repo-wide (allowlisted) scope, run in both
ci-local.sh (pre-push, local) and a CI workflow job (PR, remote) — the
repo's existing dual-gate pattern. Option 2 (PreToolUse hook at authoring
time) and option 3 (one-time baseline+drift) are addressed below as
considered-and-rejected-for-now, with a note on when option 2 becomes worth
revisiting.
1. Extend the script's scope and allowlist¶
Changes to scripts/check-python-only.py:
- Add .bats to the tracked extensions, not just .sh. #700's regression was .bats; a rule that only watches .sh half-covers the problem this issue exists to close.
- Broaden the path scope from plugins/dev-team/ to the whole repo, then carve out an explicit directory allowlist instead of a single path prefix. Rationale for each entry:

| Allowlisted path | Why it's exempt |
|---|---|
| plugins/dev-team/install.sh (exact file) | Existing ADR 0014 exception: the two-line shell trampoline that must run before Python is guaranteed on PATH. |
| plugins/security-assessment/** | A different plugin, shell-based by design (ADR 0014/0015 explicitly scope the Python rule to plugins/dev-team/ only). |
| tests/security-assessment/** | Test suite for the above; same rationale. |
| evals/** | Eval fixtures deliberately exercise shell-script scenarios (e.g. evals/codebase-recon/fixtures/polyglot/scripts/deploy.sh) as test data, not shipped tooling. A fixture that's supposed to look like an arbitrary repo's shell script needs to stay a shell script. |
| .claude/*.sh (exact files: cloud-setup.sh, install-dev-team.sh) | Same install-trampoline rationale as install.sh: these run in a SessionStart hook / cloud setup-script context before this repo's Python toolchain is guaranteed present. |
| tests/lib/hermetic_tests.bats (exact file) | Named as out-of-scope-here in #700 itself; owned by #677 (retire bats-core), not this gate. Existing-file edits aren't flagged anyway (the script only checks added files), so this entry only matters if the file is ever deleted and re-added. |

Everything else repo-wide — including tests/skills/, tests/repo/,
tests/agents/, tests/commands/, tests/docs/, tests/knowledge/, and
repo-root scripts/*.sh — is in scope for the gate.
- Repo-root scripts/*.sh: block new additions, don't just discourage them. CLAUDE.md already says existing ones are "convert opportunistically when touched"; that's a statement about the ~20 legacy files, not a license to keep adding more. A new repo-root .sh script has the exact same regression risk as a new plugin one (untested-until-CI, another shellcheck surface, another bats-vs-pytest fork). Blocking new ones (with the same allowlist escape hatch below) keeps the "convert opportunistically" carve-out honest: it shrinks the shell-script surface instead of quietly growing it around the edges.
- False-positive / escape-hatch handling: same mechanism as the existing AUDIT_EXCLUSIONS set, generalized to a directory-or-exact-path list, with a required one-line comment justifying each entry inline in the source (the table above is the pattern to follow). A genuinely-necessary new shell script is a one-line source diff to check-python-only.py in the same PR, reviewed like any other code change, not silently exempted by a separate untracked list. This makes "we needed bash here and here's why" an explicit, reviewable decision rather than a gate the author routes around.
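The generalized allowlist could be sketched roughly as follows. This is an illustration under stated assumptions, not the script's real layout: the names TRACKED_SUFFIXES, EXEMPT_FILES, EXEMPT_DIRS, is_tracked, is_exempt, and violations are hypothetical, and the entries simply transcribe the table above with their justifications as inline comments.

```python
from pathlib import PurePosixPath

# Extensions the gate watches (hypothetical constant name).
TRACKED_SUFFIXES = {".sh", ".bats"}

# Exact-file exemptions; each entry carries its one-line justification.
EXEMPT_FILES = {
    "plugins/dev-team/install.sh",    # ADR 0014 install trampoline
    ".claude/cloud-setup.sh",         # runs before Python is guaranteed
    ".claude/install-dev-team.sh",    # runs before Python is guaranteed
    "tests/lib/hermetic_tests.bats",  # owned by #677, not this gate
}

# Directory exemptions, matched as path prefixes.
EXEMPT_DIRS = (
    "plugins/security-assessment/",   # separate plugin, shell-based by design
    "tests/security-assessment/",     # test suite for that plugin
    "evals/",                         # fixtures are test data, not tooling
)


def is_tracked(path: str) -> bool:
    """True for file types the gate cares about."""
    return PurePosixPath(path).suffix in TRACKED_SUFFIXES


def is_exempt(path: str) -> bool:
    """True when the path is an allowlisted file or sits under an allowlisted dir."""
    return path in EXEMPT_FILES or path.startswith(EXEMPT_DIRS)


def violations(added_paths: list[str]) -> list[str]:
    """Newly added shell/bats files that no allowlist entry covers."""
    return [p for p in added_paths if is_tracked(p) and not is_exempt(p)]
```

With this shape, the #700 regression file under tests/skills/ and any new repo-root scripts/*.sh file are reported, while the eval fixtures and the security-assessment plugin pass. Adding an exemption is a one-line, commented diff to one of the two collections.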
2. Flip default mode to blocking¶
ADR 0014 gated --block on "the epic's Phase 3 gate" being reached. ADR 0015
records that gate as met for plugins/dev-team/ (2026-07-02). Recommend:
flip the script's default behavior to blocking (--block becomes the
default; keep an --advisory flag for anyone who wants the old behavior
locally), and update ADR 0014's "Enforcement" line to point at this doc +
ADR 0015 instead of "Advisory in Phase 0-2."
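One way the flag flip could look, sketched with argparse. Keeping --block accepted as a no-op is an assumption made here so that callers which still pass it keep working; the build_parser and exit_code names are hypothetical, and the script's existing --list flag is left out for brevity.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="check-python-only.py")
    parser.add_argument("--base", default="origin/main",
                        help="ref to diff against")
    mode = parser.add_mutually_exclusive_group()
    # Assumption: --block stays accepted as a no-op so existing callers
    # that pass it do not break once blocking is the default.
    mode.add_argument("--block", action="store_true",
                      help="exit non-zero on violations (now the default)")
    mode.add_argument("--advisory", action="store_true",
                      help="report violations but always exit 0")
    return parser


def exit_code(violations: list[str], advisory: bool) -> int:
    """Blocking unless --advisory was requested."""
    return 1 if violations and not advisory else 0
```

Under this sketch, a bare invocation and an invocation with --block behave identically, and only --advisory restores the old exit-0 behaviour.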
3. Wire into scripts/ci-local.sh (local, pre-push)¶
Add a check function alongside the existing ones:
```bash
chk_python_only() {
  if [ -n "$BASE" ]; then
    python3 scripts/check-python-only.py --base "$BASE" --block
  else
    python3 scripts/check-python-only.py --block  # defaults to origin/main
  fi
}
```
Add "prefer-Python-over-bash audit (check-python-only.py)::chk_python_only"
to the CHECKS array (near chk_rules_vs_prompts, since both are
prose-to-mechanical boundary sensors over repo conventions). This makes it
part of the default full run and therefore part of what the pre-push hook
runs before every push — the same local gate chk_rules_vs_prompts gets
today. It follows the file's existing BASE/HEAD plumbing (used today only
by chk_eval_semver), so no new argument-parsing plumbing is needed.
4. Wire into CI (.github/workflows/plugin-tests.yml)¶
The workflow already dispatches ci-local.sh --only=<comma-list> per job.
Add chk_python_only to the same --only= group as
chk_shellcheck_helpers,chk_shellcheck_tests,chk_sa_shell_suite (line 42):
it is conceptually the same "shell hygiene" job. One prerequisite needs
checking. check-python-only.py runs git diff against the PR's actual base
ref, so the checkout in that job must be deep enough to include the
merge-base. The step's current fetch-depth is expected to be sufficient,
since shellcheck's own diffing relies on it, but verify this; if it falls
short, add fetch-depth: 0 or reuse the shallow-fetch pattern the shellcheck
step already uses. Folding the check into an existing job keeps the
required-status-check job count unchanged: no new job, just one more check.
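The fetch-depth caveat matters because a three-dot range (<base>...HEAD) diffs from the merge-base, and a shallow clone may not contain it. A small pre-flight check along these lines could fail with a clear message instead of an opaque git error. This is a sketch only: has_merge_base is a hypothetical helper, not something the script has today, and the injectable run parameter exists purely to make it testable.

```python
import subprocess


def has_merge_base(base: str = "origin/main", run=subprocess.run) -> bool:
    """True when git can find a common ancestor of `base` and HEAD."""
    result = run(
        ["git", "merge-base", base, "HEAD"],
        capture_output=True, text=True,
    )
    # git exits non-zero when the ref is missing or no common ancestor
    # is reachable from the fetched history.
    return result.returncode == 0 and bool(result.stdout.strip())
```

If the check returns False in CI, that points to the checkout depth rather than to the gate's own logic.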
Dual-gate coverage answered directly¶
Yes — wire both, per this repo's existing dual-gate pattern (chk_rules_vs_prompts,
chk_shellcheck_*, etc. all run in both places today via the shared
ci-local.sh --only= dispatch). Local pre-push catches the regression before
it's pushed; CI catches it if someone pushes with --no-verify or the local
hook is skipped/misconfigured. Since both call the same ci-local.sh
function calling the same script, there's no duplicated logic to keep in
sync — just two invocation sites.
Options considered and not recommended (for now)¶
Option 2 — PreToolUse hook blocking Write/Edit of new .sh/.bats
paths at authoring time. Rejected as the primary mechanism because:
- It only catches files created via Claude's own Write/Edit tools inside a Claude Code session, not files added via git apply, a human editor, mv, or a script. The diff-based gate is authoring-tool-agnostic and catches every path a new file can enter the tree.
- It would duplicate the allowlist logic in two places (a Python hook script reading the same exemption list as check-python-only.py) for a benefit that's purely about when the author is told, not whether the rule is enforced.
- It's a legitimate future enhancement, not a replacement: once check-python-only.py's allowlist logic is extracted into a small shared module (scripts/lib/python_only_allowlist.py or similar), a thin PreToolUse hook could import that module and warn at write-time, shortening the feedback loop from "next push" to "next keystroke". That is worth revisiting once the diff-gate is proven in CI and the allowlist has stabilized. Filed as a "nice to have, not blocking" note in the follow-up issue rather than in scope for the first cut.
Option 3 — one-time baseline snapshot + drift check. Rejected as
redundant with the diff-based approach: check-python-only.py's
--diff-filter=A semantics already give an equivalent "does the tracked set
grow" answer without needing a separately-maintained baseline manifest file
that itself needs updating on every legitimate allowlist change. A baseline
file is one more artifact that can drift out of sync with the allowlist in
check-python-only.py; the diff-based check has a single source of truth
(the script's own exclusion table).
Summary of the change surface (for the follow-up issue)¶
- scripts/check-python-only.py: extend extension filter to .sh + .bats, broaden scope repo-wide with the allowlist table above, flip default to --block (keep --advisory opt-out), update its docstring + AUDIT_EXCLUSIONS naming/shape to reflect the directory-allowlist generalization. Add/extend scripts/tests/ or plugins/dev-team/tests/scripts/ pytest coverage (whichever mirrors where this file's existing tests, if any, currently live; none exist today, so this is new coverage) for:
  - new .bats under an allowlisted dir → pass
  - new .sh under plugins/security-assessment/ → pass
  - new .sh under scripts/ (not allowlisted) → fail
  - new .sh under plugins/dev-team/ → fail
  - edited (not added) existing .bats → pass (diff-filter=A semantics unchanged)
- scripts/ci-local.sh: add chk_python_only function + CHECKS entry.
- .github/workflows/plugin-tests.yml: add chk_python_only to the shellcheck/shell-suite job's --only= list (line 42).
- docs/adr/0014-python-for-cross-os-scripts.md: update the "Enforcement" line to reflect blocking-by-default and link this doc.
Follow-up implementation issue¶
Filed: #702 —
"Extend and wire in check-python-only.py to block new .sh/.bats files
outside allowlist." Covers the concrete script extension, the ci-local.sh
and CI wiring, and the new pytest coverage described above.