Ten days of letting the check do the work

On April 23rd I published a blog post about how I'd nearly shipped a fix that referenced three database fields that didn't exist. Author-reviewer collapse: I'd reviewed my own spec and missed three column-name typos in a single round. The post argued that a rule in documentation isn't a rule in practice; you have to wire the check to a trigger that doesn't care if you're tired.
The trigger I added was simple. Run /sprint-review at the moment a spec gets declared ready, not just at the moment of implementation. Catching a defect at publish costs 30 seconds. Catching it at ship costs a four-day round trip: the time between declaring the spec done and discovering it is broken.
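For concreteness, the wiring can be as small as this. A minimal sketch, not my actual setup: the specs/ directory, the Status: Ready marker, and the sprint-review command name are all stand-ins for whatever marks a spec as ready in your workflow.

```python
#!/usr/bin/env python3
"""Publish-time trigger: run the review the moment a spec is declared ready,
not when someone remembers to. Minimal sketch only -- the specs/ layout, the
'Status: Ready' marker, and the review command name are assumptions, not the
actual setup."""

import pathlib
import re
import subprocess
import sys

SPEC_DIR = pathlib.Path("specs")                       # assumed spec location
READY = re.compile(r"^Status:\s*Ready\b", re.IGNORECASE | re.MULTILINE)

def ready_specs():
    """Yield every spec file that has been declared ready."""
    for spec in SPEC_DIR.glob("**/*.md"):
        if READY.search(spec.read_text()):
            yield spec

def main() -> int:
    failures = []
    for spec in ready_specs():
        # Stand-in for whatever /sprint-review actually does: an external
        # command that exits non-zero when it finds a defect in the spec.
        if subprocess.run(["./sprint-review", str(spec)]).returncode != 0:
            failures.append(str(spec))
    if failures:
        print("sprint-review failed for:", *failures, sep="\n  ")
        return 1   # block the publish; 30 seconds now beats a 4-day round trip
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

The mechanism matters less than the attachment point: the check is tied to the act of declaring readiness, so it fires whether or not I remember it.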
That was ten days ago. The trigger has fired on five consecutive specs since, each at publish time, and each one caught a real bug.
| Spec | Date | Bug caught |
|---|---|---|
| Sprint 137 PR-C | Apr 25 | Spec snippet had return None; actual function returns a (None, (token, reason)) 2-tuple |
| Sprint 137 PR-E | Apr 26 | Test code called client.call(...); actual method is client.chat(...) |
| Sprint 137 PR-F | Apr 27 | import re missing from spec's required changes |
| Sprint 139 PR-A | Apr 30 | Test snippet used MagicMock; target file imports only Mock |
| Sprint 138 PR-MFE-MAE-WIREUP | May 2 | Test called tracker._monitor_position(trade); actual method is _track_trade(trade) |
Each one would have produced a NameError or AttributeError on first run if shipped. Each was caught in 30 seconds at publish, instead of hours later in a fresh implementation session. Five for five.
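All five are the same species of defect: an identifier in a spec snippet that doesn't exist in the code it describes. A check for that slice doesn't need to be clever. Here is a minimal sketch, assuming specs embed Python snippets in fenced blocks and that you map snippet variable names to the real classes by hand; the names are illustrative, not the actual /sprint-review internals.

```python
"""One slice of the publish-time check: do the methods a spec's code snippets
call actually exist on the objects they claim to describe? A sketch under
assumptions -- specs embed Python in fenced blocks, and the mapping from
snippet variable names to real classes is supplied by hand."""

import ast
import re

FENCE = re.compile(r"`{3}python\n(.*?)`{3}", re.DOTALL)

def attribute_calls(snippet: str):
    """Yield (object_name, method_name) for every obj.method(...) call."""
    try:
        tree = ast.parse(snippet)
    except SyntaxError:
        return
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)):
            yield node.func.value.id, node.func.attr

def check_spec(spec_text: str, targets: dict) -> list[str]:
    """targets maps snippet names to real classes, e.g. {"client": LLMClient,
    "tracker": TradeTracker} -- illustrative names, not the real ones."""
    problems = []
    for snippet in FENCE.findall(spec_text):
        for obj, method in attribute_calls(snippet):
            cls = targets.get(obj)
            if cls is not None and not hasattr(cls, method):
                # Exactly the client.call-vs-client.chat class of mismatch.
                problems.append(f"{obj}.{method} not found on {cls.__name__}")
    return problems
```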
The April 23rd post said the trigger would justify itself if it caught 2-3 more specs. It has caught five.
Here is the part I didn't expect.
What the discipline made room for
The interesting thing isn't the catches. The catches were the obvious win. The interesting thing is what happens to your week when you stop spending it cleaning up things you should have caught earlier.
Two days ago I ran /hypothesis-test against the live trading database on what looked like an obvious fix. The system had been failing to open trades for seven days. The cluster cap was firing on every proposal. The naive read: loosen the cap. I drafted that as a sprint, then ran the data check first.
The data refuted the framing. The cap wasn't over-tight. The cap was correctly designed at the value it had. What was wrong was that another component, written before the cap existed, computed position sizes that mathematically could not satisfy it at typical stop distances. Two correctly-designed pieces colliding. Neither was wrong. Their interaction was wrong.
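The collision is easier to see with numbers. These are made up, but the shape is the point: each rule is reasonable on its own, and together they reject everything.

```python
"""Made-up numbers, illustrating the shape of the collision: a sizer that
risks a fixed fraction of equity and a cap on per-cluster notional are each
sensible alone, but at typical stop distances they cannot both be satisfied."""

account_equity = 10_000                      # assumed
risk_per_trade = 0.01 * account_equity       # sizer: risk 1% of equity per trade
cluster_cap = 0.15 * account_equity          # cap: max 15% of equity notional per cluster

for stop_distance in (0.005, 0.01, 0.02, 0.05):
    # Sizer: choose notional so that hitting the stop loses exactly risk_per_trade.
    notional = risk_per_trade / stop_distance
    verdict = "ok" if notional <= cluster_cap else "rejected"
    print(f"stop {stop_distance:.1%}: notional ${notional:,.0f} "
          f"vs cap ${cluster_cap:,.0f} -> {verdict}")

# With these numbers, any stop tighter than ~6.7% yields a notional above the
# cap, so every proposal is rejected -- neither rule is wrong on its own.
```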
Without the discipline to slow down and verify before changing, that sprint would have shipped as "loosen the cluster cap." It would have weakened a real protection without addressing the actual issue. The seven-day zero-trade window would have continued.
Today the same shape played out again. Pre-deploy logs showed RR_REJECT_POST_ATR firing on 30%+ deterioration of pre-ATR R:R. Obvious smoking gun: ATR multiplier too aggressive. Drafted the investigation. Pulled the data. The data killed the hypothesis. Of 113 proposals with ATR instrumentation, only 14 had ATR widening fire, and those 14 had slightly better paper outcomes than the 99 that didn't widen. The opposite of the hypothesis.
The real finding from the same query: 0 wins across all 113 paper-tracked proposals. Most expired without hitting TP. The actual problem isn't ATR widening. It is that the strategist is producing proposals that systematically don't reach target.
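The query that surfaced this is unremarkable; the discipline is running it before drafting the sprint. A sketch of the same comparison in pandas, with the database file, table, and column names as assumptions rather than the real schema.

```python
"""Sketch of the comparison that killed the hypothesis. The database file,
table, and column names (paper_proposals, atr_widened, paper_pnl, hit_tp)
are assumptions standing in for the real schema."""

import sqlite3
import pandas as pd

conn = sqlite3.connect("trading.db")   # assumed database file
df = pd.read_sql(
    "SELECT atr_widened, paper_pnl, hit_tp FROM paper_proposals", conn
)

# Split the instrumented proposals by whether ATR widening fired; the
# hypothesis predicts the widened group does worse on paper.
print(df.groupby("atr_widened")["paper_pnl"].agg(["count", "mean"]))

# The number that actually mattered: how many proposals reached take-profit.
print("wins:", int(df["hit_tp"].sum()), "of", len(df))
```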
I wouldn't have found that by looking at logs. I found it because the discipline of pulling data before changing the rule put me in front of a number that would have stayed invisible otherwise.
The trade I almost forgot to mention
Sprint 139's first trade closed today. XRP-PERP LONG, +$53.43, +1.56%. First profitable closed trade since the system was reset on April 23rd.
It didn't hit TP. It hit the 48-hour max-hold ceiling while in profit. The Sprint 139 work that enabled it wasn't a fix to the strategist or the gate stack. It was a structural floor on the cluster cap, the same cap I would have wrongly weakened if I'd skipped the hypothesis test.
The trade and the discipline are the same story. Structural fixes shipped because data drove the decision. The first profit since baseline came because the data check killed the wrong sprint.
What the ten days actually taught me
The April 23rd post was about a check. The check is real and has now fired five times. Worth every second.
The deeper thing is harder to write down. The headspace to actually investigate. Without the publish-time trigger I'd have spent the last ten days fighting fires from shipped-broken specs. With it, I shipped four specs that worked, refuted two wrong sprints with data, surfaced one structural design collision that had been invisible for months, fixed a dead-code instrumentation gap that had been silently NULL since Sprint 21, and saw the first profitable trade since baseline.
The discipline isn't the achievement. The discipline is what made it possible to see what was actually broken.
If you're building something where wrong moves are expensive, the question isn't "how do I avoid mistakes." It is "what fires when I haven't yet noticed I'm making one." Documentation doesn't fire. Templates don't fire. A trigger that runs without you remembering does.
Ten days of data say it works.