Docs API drift watch
Merged PRs in, one rolling editor-review PR out — path and symbol drift under docs/, flagged never rewritten.
Six steps, once per day
One scheduled job: resolve the merge window, diff API surfaces, publish one editor PR.
Step-by-step detail
Resolve the scan window
The lower bound is the created_at of this workflow's previous
successful run. On a cold start it falls back to the last 24 hours; a
manual dispatch can override it with a since input. If an
editor-review PR is already open, the bound is widened back to the window
that PR's own report recorded — so the merges that first flagged its drift
stay in scope.
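The window rules above can be sketched as a small pure function. This is a minimal illustration, not the scanner's actual code: the function name `resolve_lower_bound` and its parameters are hypothetical, and it assumes that widening for an open PR also applies when a `since` override is given, which the text does not state.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_WINDOW_HOURS = 24  # cold-start fallback described in the text


def resolve_lower_bound(
    now: datetime,
    previous_success: Optional[datetime] = None,
    since_override: Optional[datetime] = None,
    open_pr_window_start: Optional[datetime] = None,
) -> datetime:
    """Return the lower bound of the scan window (hypothetical helper)."""
    if since_override is not None:
        # Manual dispatch supplied an explicit `since` input.
        bound = since_override
    elif previous_success is not None:
        # created_at of the previous successful run.
        bound = previous_success
    else:
        # Cold start: no successful run on record yet.
        bound = now - timedelta(hours=DEFAULT_WINDOW_HOURS)
    # An open editor-review PR can only move the bound earlier, never later.
    if open_pr_window_start is not None:
        bound = min(bound, open_pr_window_start)
    return bound
```

Because the widening step is a `min`, a PR that was opened two days ago keeps its original merges in scope no matter how recent the last successful run was.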
List every PR merged in the window
The complete merged-PR record for the resolved window — nothing sampled, nothing skipped.
Identify the API-touching changes
A change counts as API surface when it lands under scripts/,
portal/, or apps/ in a code file
(.py, .sh, .js, .mjs,
.cjs, .ts, .tsx). Test trees and
build noise are excluded.
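As a rough sketch, the surface test reduces to three checks on a path. The constant names come from the Tunables table; the prefix and suffix values are the ones listed above, while the exact ignore substrings and the helper name `is_api_path` are assumptions made for illustration.

```python
API_PATH_PREFIXES = ("scripts/", "portal/", "apps/")
API_CODE_SUFFIXES = (".py", ".sh", ".js", ".mjs", ".cjs", ".ts", ".tsx")
# Assumed substrings for "test trees and build noise"; the real list may differ.
API_PATH_IGNORE = ("/tests/", "/__pycache__/", "/node_modules/")


def is_api_path(path: str) -> bool:
    """Return True when a changed path counts as API surface (hypothetical helper)."""
    if not path.startswith(API_PATH_PREFIXES):
        return False  # outside scripts/, portal/, apps/
    if not path.endswith(API_CODE_SUFFIXES):
        return False  # not a code file
    # Any ignore substring disqualifies the path.
    return not any(fragment in path for fragment in API_PATH_IGNORE)
```

For example, `is_api_path("scripts/deploy_command.py")` is true, while a path under `scripts/tests/` or any Markdown file is rejected.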
Extract two drift signals
For each API change the watch derives a path-drift signal and a symbol-drift signal — detailed in the next section.
Cross-reference the docs
Every signal is matched against every Markdown file under docs/,
excluding the generated and shared subtrees (_generated/,
_shared/).
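One way to picture the cross-reference pass is a walk over the docs tree that records the line numbers where a signal appears. The sketch below is an assumption-laden simplification: `find_references` is a hypothetical name, the constant values are inferred from the text, and it uses plain substring matching, whereas a real scanner would likely match symbols on word boundaries.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

DOC_SUFFIXES = (".md",)
DOC_DIR_SKIP = ("_generated", "_shared")


def find_references(
    docs_root: Path, signals: Iterable[str]
) -> Dict[str, List[Tuple[int, str]]]:
    """Map each doc (path relative to docs_root) to its (line number, signal) hits."""
    needles = list(signals)
    hits: Dict[str, List[Tuple[int, str]]] = {}
    for doc in sorted(docs_root.rglob("*")):
        if not doc.is_file() or doc.suffix not in DOC_SUFFIXES:
            continue
        rel = doc.relative_to(docs_root)
        # Skip generated and shared subtrees at any depth.
        if any(part in DOC_DIR_SKIP for part in rel.parts):
            continue
        text = doc.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for needle in needles:
                if needle in line:
                    hits.setdefault(rel.as_posix(), []).append((lineno, needle))
    return hits
```

Skipping `_generated/` matters here: the drift report itself names every flagged path, so scanning it would make the watch flag its own output.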
Publish the findings
The full result is published as a single rolling editor-review PR — a per-document worklist, ready for a human to action.
Two signals of drift
One is exact and silent. The other is fuzzy by nature, so it is held to a deliberately conservative bar.
Signal detail
Path drift
A file the docs name by path was modified, removed, or renamed. A rename is tracked by its old path — that is what the docs still point at.
Matching is exact, which makes this signal effectively false-positive-free.
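A minimal sketch of how path signals might be derived from a PR's file list follows. The `FileChange` record, its field names, and the status strings are illustrative assumptions rather than the scanner's real data model.

```python
from typing import Iterable, NamedTuple, Optional, Set


class FileChange(NamedTuple):
    """One changed file in a merged PR (hypothetical record)."""
    status: str                      # assumed values: "added", "modified", "removed", "renamed"
    path: str                        # current path (new path for a rename)
    previous_path: Optional[str] = None


def path_drift_signals(changes: Iterable[FileChange]) -> Set[str]:
    """Return the paths that docs may still name after these changes."""
    signals: Set[str] = set()
    for change in changes:
        if change.status == "renamed" and change.previous_path:
            # Docs still point at the old name, so that is what gets searched for.
            signals.add(change.previous_path)
        elif change.status in ("modified", "removed"):
            signals.add(change.path)
        # Newly added files cannot be referenced by stale docs; they yield no signal.
    return signals
```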
// exact · authoritative
Symbol drift
A def, class, or export definition was
removed or renamed. Re-added names are excluded, so a body edit that keeps the
signature never registers.
A removed symbol is only chased into the docs when it is distinctive:
at least MIN_SYMBOL_LEN characters, not a generic stopword, and
snake_cased, camelCased, or long. Generic identifiers — build,
parse, index — are dropped before they reach the docs.
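The noise floor can be sketched as a predicate plus a set difference. `MIN_SYMBOL_LEN` and `SYMBOL_STOPWORDS` are the tunables named in this document, but the values shown, the `LONG_SYMBOL_LEN` threshold for "long", and both function names are assumptions chosen for illustration.

```python
import re
from typing import Iterable, Set

MIN_SYMBOL_LEN = 6                                           # assumed value
SYMBOL_STOPWORDS = frozenset({"build", "parse", "index"})    # examples from the text
LONG_SYMBOL_LEN = 12                                         # assumed meaning of "long"

_CAMEL = re.compile(r"[a-z][A-Z]")  # a lowercase letter followed by an uppercase one


def is_distinctive(symbol: str) -> bool:
    """Return True when a symbol is specific enough to search the docs for."""
    if len(symbol) < MIN_SYMBOL_LEN:
        return False
    if symbol.lower() in SYMBOL_STOPWORDS:
        return False
    snake_cased = "_" in symbol.strip("_")   # interior underscore only
    camel_cased = _CAMEL.search(symbol) is not None
    return snake_cased or camel_cased or len(symbol) >= LONG_SYMBOL_LEN


def removed_symbols(removed: Iterable[str], added: Iterable[str]) -> Set[str]:
    """Names a diff removed and did not re-add, filtered by the noise floor."""
    return {name for name in set(removed) - set(added) if is_distinctive(name)}
```

Subtracting the added names first is what keeps a body edit silent: the definition line disappears and reappears with the same name, so the difference is empty.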
The surface boundary, drawn explicitly
Drawing the boundary in code, not folklore, is what keeps the watch from crying wolf on every README typo. A change has to land inside the boundary to ever reach an editor.
Inside vs outside boundary
| Boundary | Pattern | What it covers |
|---|---|---|
| Inside | scripts/**/*.{py,sh,js,mjs,cjs,ts,tsx} | Stdlib automation, shell entry-points, CLI surface, audit gates. |
| Inside | portal/**/*.{py,sh,js,mjs,cjs,ts,tsx} | Public/operator console modules and helpers. |
| Inside | apps/**/*.{py,sh,js,mjs,cjs,ts,tsx} | The command worker, routes, and any first-party support scripts. |
| Outside | **/tests/** | Test trees are not an API readers depend on. |
| Outside | **/__pycache__/** | Build artifacts; not source. |
| Outside | **/node_modules/** | Vendored dependencies; not first-party. |
| Outside | docs/**, references/**, sources/** | Prose surface; never an API the docs describe. |
One rolling editor-review PR
The watch maintains exactly one PR, on the branch
automation/docs-api-drift.
Deliverable detail
Always the current worklist
Each run merges main into the branch, commits the fresh
report, and pushes the updated branch without force-pushing. The branch
may accumulate the occasional sync commit,
but the report file at the top of the PR is always the current state.
The body is the worklist
The PR body is the generated report
(docs/_generated/api-drift-report.md): a per-document worklist
citing the changed path or symbol, the PR that changed it, and the exact line
numbers in the doc that reference it.
It closes itself
When a later complete scan finds no drift, the watch closes the PR and deletes the branch. Because the window is widened back to the open PR's recorded window, “no drift” means the flagged docs are genuinely consistent again — not merely that the flagging merge aged out.
Inconclusive runs stay safe
A run that could not complete the scan exits non-zero. It neither closes the PR nor refreshes its report, and is not recorded as the last successful run — so the next run rescans the same window instead of advancing past drift that was never published.
docs: stale documentation referencing recently changed APIs
// docs/deploy-runbook.md
- L41 path — references scripts/deploy_command.py // renamed in #4812
- L88 symbol — references resolve_target_env // removed in #4812
// docs/automation-pr-hygiene.md
- L17 path — references scripts/janitor_close_duplicates.py // modified in #4807
// docs/dependency-watch-overview.md
- L62 symbol — references fetch_advisory_index // removed in #4801
Anatomy of a finding
Every row in the worklist is the same five-piece composition. Once you can read one, you can read the whole report at a glance.
Specimen and legend
Line number — the exact line in the flagged doc that still references the moved/removed thing. Click it in the rendered PR body to jump straight to the prose that needs editing.
Signal kind — path or symbol. path is exact and authoritative; symbol is noise-gated and conservative.
The reference — the literal text the doc still uses: a file path (orange) or a callable name (teal).
The provenance — what happened to it (renamed, removed, modified) and the PR number that did it. That PR is your context: open it, decide if the prose needs an update or just a path correction.
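The legend above maps naturally onto a small record type. The sketch below is hypothetical: the `Finding` class, its field names, and the simplified row layout produced by `render_row` are assumptions for illustration and do not reproduce the report's exact formatting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One worklist row (hypothetical model of the pieces described above)."""
    line: int         # line number in the flagged doc
    kind: str         # "path" or "symbol"
    reference: str    # literal text the doc still uses
    action: str       # "renamed", "removed", or "modified"
    pr_number: int    # PR that made the change

    def __post_init__(self) -> None:
        if self.kind not in ("path", "symbol"):
            raise ValueError(f"unknown signal kind: {self.kind!r}")


def render_row(finding: Finding) -> str:
    """Render a finding as one line of a simplified, assumed report layout."""
    return (
        f"- L{finding.line} {finding.kind}: `{finding.reference}` "
        f"({finding.action} in #{finding.pr_number})"
    )
```

Rejecting unknown kinds at construction time keeps a typo in the scanner from reaching the published report as a third, undocumented signal.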
One morning, end-to-end
The 06:17 UTC schedule fires. Forty-three seconds later there is a worklist on an editor's desk. Here is what those forty-three seconds look like.
Timeline walkthrough
Schedule fires
GitHub Actions starts the run inside the docs-api-drift concurrency group. If a manual dispatch is in flight, the two runs are queued one behind the other; neither is ever cancelled.
Lower bound resolved
Last successful run was yesterday at 06:17:09Z. An editor-review PR is already open from two days ago — its recorded window starts at 06:17:11Z two days prior, which becomes the bound.
Merged PRs listed
Fourteen merged PRs land inside the window. The watch asks gh pr list for one item beyond the configured cap; if the extra row appears, the window is marked incomplete and the run exits non-zero.
API surface isolated
Four of the fourteen touched code under scripts/, portal/, or apps/. From those, the scanner extracts six removed-or-renamed paths and nine removed symbols that clear the noise floor.
Three docs flagged
The path + symbol set is grepped against every docs/**/*.md file outside _generated/ and _shared/. deploy-runbook.md, automation-pr-hygiene.md, and dependency-watch-overview.md all still name something that moved.
Rolling PR refreshed
The branch automation/docs-api-drift merges main when it already exists, commits the new report, and pushes the updated branch without force-pushing. The open editor-review PR's body is rewritten from the report. The editor wakes up to one tab, three sections, six lines.
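The "one item beyond the cap" check in the merged-PR listing step can be sketched independently of the GitHub CLI. Here the fetcher is injected as a callable so the logic is testable. In the real watch it would presumably wrap a `gh pr list` call with `--limit` set to the requested number, but that wiring, and the name `fetch_capped`, are assumptions.

```python
from typing import Callable, Dict, List, Tuple


def fetch_capped(
    fetch: Callable[[int], List[Dict]], cap: int
) -> Tuple[List[Dict], bool]:
    """Fetch up to `cap` rows and report whether the listing was complete.

    `fetch(limit)` must return at most `limit` rows (hypothetical interface).
    """
    rows = fetch(cap + 1)            # ask for one sentinel row past the cap
    complete = len(rows) <= cap      # the sentinel appearing means rows were cut off
    return rows[:cap], complete
```

When `complete` is false the caller would mark the window incomplete and exit non-zero instead of publishing a report built from a truncated list.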
Failure modes & self-healing
The watch is designed so that its failure modes are safe: an inconclusive run never closes a still-stale PR, and a manual dispatch never races a scheduled run, because the two are queued.
Failure mode cards
drift=true
Complete scan, drift found
The rolling PR is opened (or its body is rewritten with the fresh report). This run's created_at becomes the next run's lower bound.
Self-heal — the next morning's run will reopen the same picture until an editor lands a fix on a separate branch.
drift=false
Complete scan, no drift
If an editor-review PR is open, it is closed with an automated comment and the branch is deleted. If none is open, the run is a silent no-op.
Self-heal — a future drift-introducing merge reopens a fresh PR. There is never a stale "resolved" PR left to consult.
scan_complete=false
Inconclusive scan
A dropped merged-PR page, a gh error on a PR's file list, or a failed lookup of the open editor-review PR. Nothing is published, nothing is closed.
Self-heal — this run is not recorded as the last successful run, so tomorrow's run rescans the same window from scratch. Drift that was never published cannot be silently aged out.
unhandled
Unhandled exception
A crash inside the scanner. It exits with a code other than 3, the code used for an intentional incomplete scan, so operators can tell an inconclusive run from a bug.
Self-heal — same as inconclusive: nothing published, nothing closed, the same window is rescanned tomorrow. Failures surface in the Actions log; an operator fixes the bug and the next run carries on.
race
Manual dispatch during a scheduled run
The static concurrency group docs-api-drift queues the second run rather than cancelling either. The dispatch's explicit since is still respected when its turn comes.
Self-heal — every published report is the output of a complete, uninterrupted scan. There is no partial-report half-state.
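The first three cards amount to a small decision table, sketched below. The `Action` enum and `decide` function are hypothetical names. Exit code 3 for an incomplete scan is taken from the card above, and exit code 0 for the complete cases is an assumption.

```python
from enum import Enum
from typing import Tuple


class Action(Enum):
    PUBLISH = "publish"   # open the rolling PR or rewrite its body
    CLOSE = "close"       # close the PR and delete the branch
    NOOP = "noop"         # nothing to do
    ABORT = "abort"       # publish nothing, close nothing

EXIT_OK = 0               # assumed success code
EXIT_INCOMPLETE = 3       # intentional incomplete-scan code named in the cards


def decide(scan_complete: bool, drift: bool, pr_open: bool) -> Tuple[Action, int]:
    """Map a run's outcome to an action and a process exit code."""
    if not scan_complete:
        # Non-zero exit keeps this run from becoming the next lower bound.
        return Action.ABORT, EXIT_INCOMPLETE
    if drift:
        return Action.PUBLISH, EXIT_OK
    if pr_open:
        return Action.CLOSE, EXIT_OK
    return Action.NOOP, EXIT_OK
```

Checking `scan_complete` first is the safety property: an incomplete scan that happened to find no drift can never reach the branch that closes the PR.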
The watch flags. The editor writes.
The PR is intentionally a flag, not an auto-rewrite — substantive prose stays a human decision.
Do:
- Read each flagged line and decide whether the doc is actually stale.
- Make the prose fix, or correct the stale path or symbol, on a separate branch.
- Open your own PR for that fix.
- Close the editor-review PR once your fix merges, or if every flag was a false positive.
Don't:
- Commit onto automation/docs-api-drift. The watch refreshes the report on it every run; manual commits linger in bot-owned history and can be replaced when they touch the generated report.
- Hand-edit docs/_generated/api-drift-report.md. It is machine output, rewritten each run.
- Expect the watch to rewrite prose. It only points; you decide.
Tunables
All tunables live as constants at the top of
scripts/docs_api_drift.py.
| Constant | Purpose |
|---|---|
| API_PATH_PREFIXES | Directories treated as API surface. |
| API_PATH_IGNORE | Path substrings that disqualify a match (tests, build noise). |
| API_CODE_SUFFIXES | Extensions that carry a callable or importable surface. |
| DOC_SUFFIXES / DOC_DIR_SKIP | Which docs are scanned, and which subtrees are skipped. |
| MIN_SYMBOL_LEN / SYMBOL_STOPWORDS | The symbol-noise floor. |
| DEFAULT_WINDOW_HOURS | Cold-start fallback window. |
Tuning symbol noise. Path matching is exact and effectively
false-positive-free; symbol matching is the noisier signal. If editors still see
noise, raise MIN_SYMBOL_LEN or extend SYMBOL_STOPWORDS.
Running it by hand
The script is stdlib-only and shells out to the gh CLI for
all GitHub access. gh must be authenticated — GH_TOKEN in CI.
    # Default: window = previous successful run, report to docs/_generated/
    python3 scripts/docs_api_drift.py

    # Explicit window, custom report path, no Actions output
    python3 scripts/docs_api_drift.py --since 2026-05-12T00:00:00Z \
        --report-path /tmp/drift.md
How it is wired
| Component | Value |
|---|---|
| Workflow | .github/workflows/docs-api-drift.yml |
| Scanner | scripts/docs_api_drift.py · tested by tests/test_docs_api_drift.py |
| Trigger | schedule (06:17 UTC daily) + workflow_dispatch |
| Token | secrets.STONEWALL_AUTO_REBASE_PAT when opening or closing the editor-review PR (draft create is blocked for GITHUB_TOKEN on this repository); falls back to GITHUB_TOKEN for scan-only paths |
| Permissions | actions: read, contents: write, pull-requests: write |
| Concurrency | static group docs-api-drift, queued — not cancelled |
Relationship to other automation
docs/_generated/api-drift-report.md is machine output. It is
rewritten on every run — never hand-edit it.
The rolling PR title matches no topic slug defined in
docs/automation-pr-hygiene.md, so the duplicate-PR janitor leaves
it alone. The watch also keeps exactly one PR, so there is never a sibling to
consolidate against.
Behavior changes here follow docs/automation-versioning.md: tag
the commit with the appropriate automation(...) intent.