The routine, in six steps
Window in. Worklist out.
Every step is bounded: the scan window closes on the previous successful run, the API scope is a fixed prefix list, and the deliverable is a single rolling PR. Nothing escapes the envelope.
Resolve the window deterministic
Lower bound is the created_at of this workflow's previous successful run. Cold start falls back to the last 24h; manual dispatch can override via since. When the rolling PR is open, the bound widens to the window that PR recorded — so the originating merges stay in scope until drift clears.
# lower bound resolution
since = dispatch_since
or open_pr.recorded_window.start
or last_successful_run.created_at
or now() - 24h
List merged PRs paginated
Every PR with merged_at > since via gh. Pagination is bounded — a run that exhausts the cap exits inconclusive, so the next run rescans the same window instead of advancing past unpublished drift.
# gh search, paginated
gh search prs --merged \
--repo $REPO \
--merged-at ">=$since" \
--json number,mergedAt,files
Classify the API surface prefix-gated
A change counts as API only when it lands under scripts/, portal/, or apps/ in a code file (.py .sh .js .mjs .cjs .ts .tsx). Test trees and build noise are filtered out — your editor's morning is not spent on __pycache__.
# API_PATH_PREFIXES
("scripts/", "portal/", "apps/")
# API_PATH_IGNORE
("/tests/", "/test_",
"/__pycache__/", "/node_modules/")
Extract two drift signals low-noise
Path drift — a file the docs name was modified, removed, or renamed (renames tracked by the old path, since that is what the docs still point at). Symbol drift — a def / class / export was removed or renamed. Re-added names are excluded, so a body edit that keeps the signature never registers.
# diff → signals
path_drift = touched_paths & api_surface
symbol_drift = (removed_defs - added_defs)
>= MIN_SYMBOL_LEN
& not_in(SYMBOL_STOPWORDS)
Cross-reference the docs exact-match
Every signal is matched against every Markdown file under docs/ (excluding _generated/ and _shared/). Path match is exact. Symbol match requires the identifier as a whole token. Every hit carries the doc path, line number, originating PR, and the changed path or symbol.
# every hit is fully cited
{
"doc": "docs/deploy-runbook.md",
"line": 117,
"signal": "path",
"ref": "scripts/verify_all.py",
"pr": "#1102"
}
Publish the rolling PR draft
One PR on branch automation/docs-api-drift. New on first drift, refreshed on subsequent runs, closed automatically when a complete scan clears the worklist. Opened as draft — the merge button stays off until an editor explicitly clicks "Ready for review," the affirmative gate for "I actioned every flag."
# single rolling target, never duplicated
branch: automation/docs-api-drift
state: draft
body: docs/_generated/api-drift-report.md
push: fast-forward only