Ten days of letting the check do the work

On April 23rd I published a blog post about how I'd nearly shipped a fix that referenced three database fields that didn't exist. Author-reviewer collapse: I'd reviewed my own spec and missed three column-name typos in a single round. The post argued that a rule in documentation isn't a rule in practice; you have to wire the check to a trigger that doesn't care if you're tired.
The trigger I added was simple. Run /sprint-review at the moment a spec gets declared ready, not just at the moment of implementation. Catching a defect at publish costs 30 seconds. Catching it at ship costs a four-day round trip: the time between declaring the spec done and discovering it is broken.
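For concreteness, the wiring can be as small as this. A minimal sketch, not my actual setup: the specs/ directory, the Status: Ready marker, and the sprint-review command name are all stand-ins for whatever marks a spec as ready in your workflow.

```python
#!/usr/bin/env python3
"""Publish-time trigger: run the review the moment a spec is declared ready,
not when someone remembers to. Minimal sketch only -- the specs/ layout, the
'Status: Ready' marker, and the review command name are assumptions, not the
actual setup."""

import pathlib
import re
import subprocess
import sys

SPEC_DIR = pathlib.Path("specs")                       # assumed spec location
READY = re.compile(r"^Status:\s*Ready\b", re.IGNORECASE | re.MULTILINE)

def ready_specs():
    """Yield every spec file that has been declared ready."""
    for spec in SPEC_DIR.glob("**/*.md"):
        if READY.search(spec.read_text()):
            yield spec

def main() -> int:
    failures = []
    for spec in ready_specs():
        # Stand-in for whatever /sprint-review actually does: an external
        # command that exits non-zero when it finds a defect in the spec.
        if subprocess.run(["./sprint-review", str(spec)]).returncode != 0:
            failures.append(str(spec))
    if failures:
        print("sprint-review failed for:", *failures, sep="\n  ")
        return 1   # block the publish; 30 seconds now beats a 4-day round trip
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

The mechanism matters less than the attachment point: the check is tied to the act of declaring readiness, so it fires whether or not I remember it.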
That was ten days ago. The trigger has fired on five consecutive specs since, each at publish time, and each one caught a real bug.
| Spec | Date | Bug caught |
|---|---|---|
| Sprint 137 PR-C | Apr 25 | Spec snippet had return None; actual function returns a (None, (token, reason)) 2-tuple |
| Sprint 137 PR-E | Apr 26 | Test code called client.call(...); actual method is client.chat(...) |
| Sprint 137 PR-F | Apr 27 | import re missing from spec's required changes |
| Sprint 139 PR-A | Apr 30 | Test snippet used MagicMock; target file imports only Mock |
| Sprint 138 PR-MFE-MAE-WIREUP | May 2 | Test called tracker._monitor_position(trade); actual method is _track_trade(trade) |
Each one would have produced a NameError or AttributeError on first run if shipped. Each was caught in 30 seconds at publish, instead of hours later in a fresh implementation session. Five for five.
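All five are the same species of defect: an identifier in a spec snippet that doesn't exist in the code it describes. A check for that slice doesn't need to be clever. Here is a minimal sketch, assuming specs embed Python snippets in fenced blocks and that you map snippet variable names to the real classes by hand; the names are illustrative, not the actual /sprint-review internals.

```python
"""One slice of the publish-time check: do the methods a spec's code snippets
call actually exist on the objects they claim to describe? A sketch under
assumptions -- specs embed Python in fenced blocks, and the mapping from
snippet variable names to real classes is supplied by hand."""

import ast
import re

FENCE = re.compile(r"`{3}python\n(.*?)`{3}", re.DOTALL)

def attribute_calls(snippet: str):
    """Yield (object_name, method_name) for every obj.method(...) call."""
    try:
        tree = ast.parse(snippet)
    except SyntaxError:
        return
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and isinstance(node.func.value, ast.Name)):
            yield node.func.value.id, node.func.attr

def check_spec(spec_text: str, targets: dict) -> list[str]:
    """targets maps snippet names to real classes, e.g. {"client": LLMClient,
    "tracker": TradeTracker} -- illustrative names, not the real ones."""
    problems = []
    for snippet in FENCE.findall(spec_text):
        for obj, method in attribute_calls(snippet):
            cls = targets.get(obj)
            if cls is not None and not hasattr(cls, method):
                # Exactly the client.call-vs-client.chat class of mismatch.
                problems.append(f"{obj}.{method} not found on {cls.__name__}")
    return problems
```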
The April 23rd post said the trigger would justify itself if it caught 2-3 more specs. It has caught five.
Here is the part I didn't expect.
What the discipline made room for
The interesting thing isn't the catches. The catches were the obvious win. The interesting thing is what happens to your week when you stop spending it cleaning up things you should have caught earlier.
Two days ago I ran /hypothesis-test against the live trading database on what looked like an obvious fix. The system had been failing to open trades for seven days. The cluster cap was firing on every proposal. The naive read: loosen the cap. I drafted that as a sprint, then ran the data check first.
The data refuted the framing. The cap wasn't over-tight. The cap was correctly designed at the value it had. What was wrong was that another component, written before the cap existed, computed position sizes that mathematically could not satisfy it at typical stop distances. Two correctly-designed pieces colliding. Neither was wrong. Their interaction was wrong.
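The collision is easier to see with numbers. These are made up, but the shape is the point: each rule is reasonable on its own, and together they reject everything.

```python
"""Made-up numbers, illustrating the shape of the collision: a sizer that
risks a fixed fraction of equity and a cap on per-cluster notional are each
sensible alone, but at typical stop distances they cannot both be satisfied."""

account_equity = 10_000                      # assumed
risk_per_trade = 0.01 * account_equity       # sizer: risk 1% of equity per trade
cluster_cap = 0.15 * account_equity          # cap: max 15% of equity notional per cluster

for stop_distance in (0.005, 0.01, 0.02, 0.05):
    # Sizer: choose notional so that hitting the stop loses exactly risk_per_trade.
    notional = risk_per_trade / stop_distance
    verdict = "ok" if notional <= cluster_cap else "rejected"
    print(f"stop {stop_distance:.1%}: notional ${notional:,.0f} "
          f"vs cap ${cluster_cap:,.0f} -> {verdict}")

# With these numbers, any stop tighter than ~6.7% yields a notional above the
# cap, so every proposal is rejected -- neither rule is wrong on its own.
```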
Without the discipline to slow down and verify before changing, that sprint would have shipped as "loosen the cluster cap." It would have weakened a real protection without addressing the actual issue. The seven-day zero-trade window would have continued.
Today the same shape played out again. Pre-deploy logs showed RR_REJECT_POST_ATR firing on 30%+ deterioration of pre-ATR R:R. Obvious smoking gun: ATR multiplier too aggressive. Drafted the investigation. Pulled the data. The data killed the hypothesis. Of 113 proposals with ATR instrumentation, only 14 had ATR widening fire, and those 14 had slightly better paper outcomes than the 99 that didn't widen. The opposite of the hypothesis.
The real finding from the same query: 0 wins across all 113 paper-tracked proposals. Most expired without hitting TP. The actual problem isn't ATR widening. It is that the strategist is producing proposals that systematically don't reach target.
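The query that surfaced this is unremarkable; the discipline is running it before drafting the sprint. A sketch of the same comparison in pandas, with the database file, table, and column names as assumptions rather than the real schema.

```python
"""Sketch of the comparison that killed the hypothesis. The database file,
table, and column names (paper_proposals, atr_widened, paper_pnl, hit_tp)
are assumptions standing in for the real schema."""

import sqlite3
import pandas as pd

conn = sqlite3.connect("trading.db")   # assumed database file
df = pd.read_sql(
    "SELECT atr_widened, paper_pnl, hit_tp FROM paper_proposals", conn
)

# Split the instrumented proposals by whether ATR widening fired; the
# hypothesis predicts the widened group does worse on paper.
print(df.groupby("atr_widened")["paper_pnl"].agg(["count", "mean"]))

# The number that actually mattered: how many proposals reached take-profit.
print("wins:", int(df["hit_tp"].sum()), "of", len(df))
```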
I wouldn't have found that by looking at logs. I found it because the discipline of pulling data before changing the rule put me in front of a number that would have stayed invisible otherwise.
The trade I almost forgot to mention
Sprint 139's first trade closed today. XRP-PERP LONG, +$53.43, +1.56%. First profitable closed trade since the system was reset on April 23rd.
It didn't hit TP. It hit the 48-hour max-hold ceiling while in profit. The Sprint 139 work that enabled it wasn't a fix to the strategist or the gate stack. It was a structural floor on the cluster cap, the same cap I would have wrongly weakened if I'd skipped the hypothesis test.
The trade and the discipline are the same story. Structural fixes shipped because data drove the decision. The first profit since baseline came because the data check killed the wrong sprint.
What the ten days actually taught me
The April 23rd post was about a check. The check is real and has now fired five times. Worth every second.
The deeper thing is harder to write down. The headspace to actually investigate. Without the publish-time trigger I'd have spent the last ten days fighting fires from shipped-broken specs. With it, I shipped four specs that worked, refuted two wrong sprints with data, surfaced one structural design collision that had been invisible for months, fixed a dead-code instrumentation gap that had been silently NULL since Sprint 21, and saw the first profitable trade since baseline.
The discipline isn't the achievement. The discipline is what made it possible to see what was actually broken.
If you're building something where wrong moves are expensive, the question isn't "how do I avoid mistakes." It is "what fires when I haven't yet noticed I'm making one." Documentation doesn't fire. Templates don't fire. A trigger that runs without you remembering does.
Ten days of data say it works.