The Bug That Blocked Our Trading System for 30 Hours - And the Feature We Deliberately Didn't Build
Date: March 10, 2026
Author: Jamie Watters
Project: Trader-7 - An LLM-powered crypto trading system
In the last 72 hours, my AI trading system went through a regime flip, lost $62 on two trades, got stuck with a crippled position limit for 30+ hours, and then taught me one of the most important lessons I've learned building this thing: the best code you'll ever write is the code you decide not to write.
Here's what happened, what I fixed, what I deliberately chose not to fix, and why the decision that felt wrong was actually right.
The Setup: A Regime Flip and $62 in Losses
On March 9th at 04:43 UTC, BTC crossed above its 50-day moving average. The system's regime classification flipped from WEAK_BEAR to WEAK_BULL.
Two SHORT positions were still open:
| Trade | Entry | Exit | Result | What Happened |
|---|---|---|---|---|
| 185 (ETH SHORT) | $1,964 | SL $1,987 | -$31 | Stopped out during the flip |
| 186 (BTC SHORT) | $66,421 | SL $67,121 | -$31 | Opened 2.5h before the flip |
Meanwhile, the system's boundary penalty had been escalating for hours - from -1.0 to -14.2 confidence points - correctly identifying that a flip was imminent. It blocked new entries beautifully. But the two positions that were already open? No protection at all. They rode straight into the regime change and hit their stop losses.
That's $62 lost in a single event on a $3,000 paper account. Not catastrophic, but painful enough to demand a response.
The Bug: Code That Blocked Itself
After the flip, the system activated its cooldown - a 3-cycle restriction that caps new positions to 1 while the market settles. This was working as designed. Sprint 101 built this feature specifically to prevent stacking trades during uncertain transitions.
Except it never turned off.
For 30+ consecutive hours, every single cycle logged the same message:
    Position limit capped to 1 during regime transition (2 cycles remaining)
Two cycles remaining. Every hour. For 30 hours. The counter never moved.
The root cause was elegant in its stupidity. The cooldown decrement code was inside an else block:
    if not can_open:
        log("Cannot open")       # ← cooldown sends us HERE
    else:
        ...                      # data collection
        ...                      # regime classification
        if cooldown > 0:
            cooldown -= 1        # ← but the decrement is HERE
The cooldown sets can_open = False. That sends execution into the if branch. The decrement is in the else branch. The cooldown prevents the very code that would end the cooldown from executing.
It worked exactly once - on the first cycle, when zero positions were open, so can_open stayed True and the decrement ran (3 → 2). Then a trade opened on that same cycle. From that point forward the chain was: 1 open position + active cooldown → can_open = False → decrement never reached → stuck forever.
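The stall is easy to reproduce in a toy model. The sketch below is not Trader-7's code: the class, its field names, and the uncapped limit of 3 are assumptions made for illustration. It only mirrors the control flow described above, where the decrement sits on the path that an active cooldown blocks.

```python
from dataclasses import dataclass


@dataclass
class BuggyLoop:
    """Toy model of the pre-fix cycle. All names here are illustrative."""
    cooldown: int = 3          # cycles left after a regime flip
    open_positions: int = 0
    normal_limit: int = 3      # assumed position limit outside cooldown

    def run_cycle(self, wants_trade: bool = False) -> None:
        # While the cooldown is active, the position limit is capped to 1.
        limit = 1 if self.cooldown > 0 else self.normal_limit
        can_open = self.open_positions < limit

        if not can_open:
            # The cooldown sends execution here and the cycle ends.
            return

        # The decrement only exists on this path.
        if self.cooldown > 0:
            self.cooldown -= 1
        if wants_trade:
            self.open_positions += 1


loop = BuggyLoop()
loop.run_cycle(wants_trade=True)   # cycle 1: 3 -> 2, then a trade opens
history = [loop.cooldown]
for _ in range(30):                # the next 30 hourly cycles
    loop.run_cycle()
    history.append(loop.cooldown)

print(history[:5])  # [2, 2, 2, 2, 2]
```

Once a position is open, every later cycle returns early and the counter stays at 2, which matches the repeated log line.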
The Fix: 15 Lines, 2 Minutes
Move the decrement to run unconditionally, before the can_open check:
    # Sprint 115: Decrement regime cooldown every cycle
    if self._regime_cooldown_remaining > 0:
        self._regime_cooldown_remaining -= 1
        # persist to DB
Remove the old elif decrement from inside the else block. Keep the flip detection there (it needs market data that's only available in that branch).
Deployed. First cycle post-deploy: cooldown went from 2 → 1. Confirmed working. On the next cycle it will hit 0 and the system will be fully unblocked.
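Here is a minimal sketch of the reordered cycle, starting from the stuck state (cooldown at 2, one position open). The class, field names, and uncapped limit of 3 are assumptions for illustration. The only thing it is meant to show is the ordering: decrement first, gate second.

```python
from dataclasses import dataclass


@dataclass
class FixedLoop:
    """Toy model of the post-fix cycle. All names here are illustrative."""
    cooldown: int = 2          # the value the live system was stuck on
    open_positions: int = 1
    normal_limit: int = 3      # assumed position limit outside cooldown

    def run_cycle(self) -> bool:
        # Decrement unconditionally, before any gating logic runs.
        if self.cooldown > 0:
            self.cooldown -= 1

        limit = 1 if self.cooldown > 0 else self.normal_limit
        return self.open_positions < limit   # this is can_open


loop = FixedLoop()
first = loop.run_cycle()    # cooldown 2 -> 1, limit still capped to 1
second = loop.run_cycle()   # cooldown 1 -> 0, cap lifted

print(first, second, loop.cooldown)  # False True 0
```

Because the decrement no longer depends on can_open, the counter reaches 0 no matter how many positions are open.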
Total time to diagnose and fix: about 20 minutes. Total time the system was degraded: 30+ hours.
The Lesson for Builders
Self-blocking code paths are invisible until they're not. This bug passed every mental review because the logic looks correct - "decrement the cooldown when we process regime data." The problem isn't the logic, it's the reachability. The decrement was logically correct but physically unreachable under the exact conditions it was supposed to handle.
If you have state machines in your code, ask yourself: can the state that triggers a transition also prevent the transition code from executing? If yes, you have a self-blocking path.
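One cheap guard against this class of bug is to watch the counter itself. The sketch below is a hypothetical helper, not something in Trader-7: it raises a flag when a countdown reports the same non-zero value too many cycles in a row. The threshold of 3 repeats is an arbitrary assumption.

```python
from typing import Optional


class StallDetector:
    """Flags a countdown that keeps reporting the same non-zero value."""

    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats      # repeats tolerated before alerting
        self._last: Optional[int] = None
        self._repeats = 0

    def observe(self, value: int) -> bool:
        """Record one cycle's counter value. Returns True if it looks stuck."""
        if value > 0 and value == self._last:
            self._repeats += 1
        else:
            self._repeats = 0               # progress was made, or we hit 0
        self._last = value
        return self._repeats >= self.max_repeats


healthy = StallDetector()
healthy_alerts = [healthy.observe(v) for v in [3, 2, 1, 0, 0, 0, 0]]

stuck = StallDetector()
stuck_alerts = [stuck.observe(v) for v in [3, 2, 2, 2, 2, 2]]

print(any(healthy_alerts))  # False
print(stuck_alerts)         # [False, False, False, False, True, True]
```

With hourly cycles, a check like this would have raised the alarm a few hours into the stall rather than 30.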
The Feature I Didn't Build: Pre-Flip Position Protection
After fixing the cooldown bug, I looked at the $62 in transition losses and asked the obvious question: shouldn't the system protect existing positions when a flip is imminent?
The boundary penalty already knows a flip is coming. It escalated to -14.2 points. The system was smart enough to block new entries. Why not also tighten the stop losses on existing positions? Or close them early? Or reduce position size?
I spent an hour analysing this from three angles - as a quant, a trader, and an architect. The conclusion was unanimous: don't build it.
The Quant's Math
When the boundary penalty is high (price within 1% of the SMA50), two things can happen:
- The flip occurs (~40-50% of the time): You save ~$15-25 per position
- Price bounces back (~50-60% of the time): You've killed a winner worth ~$49
Expected value per activation, using the midpoints (45% flip rate, $20 saved): (0.45 × $20) - (0.55 × $49) = $9.00 - $26.95 ≈ -$18
The math is negative because false signals outnumber real flips. BTC doesn't cleanly cross the SMA50 - it oscillates around it, sometimes for days. Every tightened stop that gets hit on a bounce is a winner you sacrificed for protection you didn't need.
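The same arithmetic as a small script, using only the figures quoted above (45% flip rate, $20 saved, $49 forgone). The function names are made up for this sketch. The break-even line is the added piece: it shows how often a flip would have to follow a high-penalty reading before tightening stops pays for itself.

```python
def expected_value(p_flip: float, saved_if_flip: float, lost_if_bounce: float) -> float:
    """Expected dollars gained per activation of pre-flip protection."""
    return p_flip * saved_if_flip - (1.0 - p_flip) * lost_if_bounce


def break_even_flip_rate(saved_if_flip: float, lost_if_bounce: float) -> float:
    """Flip probability at which the expected value is exactly zero."""
    return lost_if_bounce / (saved_if_flip + lost_if_bounce)


ev = expected_value(0.45, 20.0, 49.0)
p_star = break_even_flip_rate(20.0, 49.0)

print(round(ev, 2))       # -17.95
print(round(p_star, 3))   # 0.71
```

With these inputs the flip would need to happen about 71% of the time to break even, well above the estimated 40-50%.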
The Trader's Perspective
The system's best feature is its trailing stop mechanism. When a trade hits TP1 (2:1 risk-reward), it closes half and trails the rest. Average trailing stop exit: +$49.81. This is the entire profit engine.
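For a sense of where TP1 sits, here is a rough sketch that reads "2:1 risk-reward" as a target twice as far from entry as the stop. That reading is an assumption, as is the helper name. The inputs are the entry and stop prices of the two stopped-out shorts from the table earlier in the post.

```python
def tp1_price(entry: float, stop: float, reward_to_risk: float = 2.0) -> float:
    """Target price at a given reward:risk multiple of the stop distance."""
    risk = abs(entry - stop)
    # A stop below entry means a long, so the target is above entry.
    # A stop above entry means a short, so the target is below entry.
    direction = 1.0 if stop < entry else -1.0
    return entry + direction * reward_to_risk * risk


btc_tp1 = tp1_price(66_421, 67_121)   # trade 186, BTC SHORT
eth_tp1 = tp1_price(1_964, 1_987)     # trade 185, ETH SHORT

print(btc_tp1, eth_tp1)  # 65021.0 1918.0
```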
Pre-flip protection fights this directly. It introduces a competing exit path that says "close early because the regime might change." In ranging markets near the SMA50 - exactly where boundary penalty is active - you'd be constantly tightening and getting stopped out of positions that would have been fine.
Trade 187 (currently open, +$30 unrealized) entered right near the SMA50 after the flip. An aggressive pre-flip system might have prevented it from opening or killed it early. That one trade has already recovered roughly half of the $62 in transition losses.
The Architect's Concern
Trader-7 already has five layers of regime-transition protection:
- Boundary penalty (blocks new entries)
- Regime cooldown (caps positions to 1)
- Regime watchdog (tightens stops after 2-4 opposing cycles)
- Position reevaluator (closes on opposing signal + thesis invalidation)
- Stop loss (hard exit at defined price)
Adding a sixth layer creates interaction effects. Which system "owns" the stop? What if the watchdog says keep but the pre-flip tightener says close? When a trade exits during a transition, which of six systems caused it? Every new layer makes the system harder to debug and harder to trust.
What I'll Do Instead
Rather than building something new, I'll tune what exists. The watchdog already tightens stops during regime opposition - it just takes 2-4 cycles. The idea I'm monitoring: when boundary penalty is high, reduce the watchdog trigger from N cycles to 1. Same system, adaptive tempo. No new interactions.
But even that waits until after 20-30 more trades. I need data, not instinct.
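If the data ever supports it, the adaptive-tempo idea could be as small as the sketch below. Nothing here is shipped or planned code: the function name, the -10.0 threshold for a "high" penalty, and the base of 3 cycles in the usage lines are all assumptions. The point is that the watchdog stays the only owner of stop tightening, and only its trigger count changes.

```python
def watchdog_trigger_cycles(
    boundary_penalty: float,
    base_cycles: int,
    high_penalty: float = -10.0,   # assumed threshold, not a tuned value
) -> int:
    """Opposing cycles the watchdog waits for before tightening stops."""
    # Penalties are negative confidence points, so a more negative
    # value means price is closer to the regime boundary.
    if boundary_penalty <= high_penalty:
        return 1
    return base_cycles


near_flip = watchdog_trigger_cycles(-14.2, base_cycles=3)
calm = watchdog_trigger_cycles(-1.0, base_cycles=3)

print(near_flip, calm)  # 1 3
```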
The Bigger Lesson: Restraint Is a Feature
Three days ago I watched the system navigate a regime flip, lose $62, get stuck with a bug for 30 hours, and then recover with a winning trade. The temptation to "fix everything" was overwhelming.
But here's what actually happened when I sat with the data:
- The bug fix (Sprint 115): 15 lines, clear root cause, obvious fix, immediate value. Shipped in 20 minutes.
- The new feature (pre-flip protection): Negative expected value, fights the profit engine, adds debugging complexity, unclear benefit. Deliberately not built.
The hardest skill in software engineering isn't building things. It's looking at a problem, understanding the system deeply enough to calculate the expected value of a fix, and having the discipline to say "the cost of this solution is higher than the cost of the problem."
Every line of code you add is a line you have to maintain, debug, and reason about when something goes wrong at 3am. The $62 in transition losses felt urgent. The analysis said otherwise. And Trade 187 - quietly sitting at +$30 - is a reminder that the system's existing mechanisms, when they work correctly, handle regime changes well enough.
Sometimes the best engineering decision is a well-documented "no."
Current State
- Sprint 115: Deployed and confirmed. Cooldown decrementing correctly (2 → 1 on first cycle post-deploy)
- Trade 187: BTC LONG, TP1 hit, trailing stop at $69,369, unrealized ~+$30
- Overall PnL: -$15.10 across 92 trades (-0.5%)
- Next: Monitor for 20-30 trades with no code changes. Collect data on regime transitions, watchdog activations, and false flip rates.
The system is stable. The architecture is sound. Now we wait for the data to tell us what's actually true.
Building Trader-7 in public. Day 109.
Jamie Watters - @Jamie_within