
The Invisible Bugs That Stole My Trades (And How I Found Them)

Published: December 14, 2025 · 4 min read
#Crypto #Agent #tradingbot #algotrading

December 13, 2025


For three days, my AI trading system was generating excellent trade setups - 85% confidence signals with solid technical backing - and they were disappearing into thin air. Today, I finally caught the culprits.

The Ghost Trade Problem

Here's what I was seeing in the logs:

06:55:48 - Opening trade: SHORT 0.135312 ETH-PERP @ $3095.50
06:55:50 - Tax lot created for trade 57
06:55:51 - Fetched 0 positions from perpetual adapter
06:55:51 - Auto-closing orphan database trade 57

The trade existed for exactly 3 seconds before vanishing. Not because the market moved against me. Not because a stop loss triggered. Because of a race condition that had been silently killing my best trades.

Bug #1: The Race Condition

My system has a "reconciliation" process - it compares what the database thinks we own versus what the exchange reports. If the database shows a position but the exchange doesn't, it closes the "orphan" trade. Safety feature, right?

The problem: When I open a new trade, the database updates immediately. But our paper trading exchange adapter - the simulated exchange we use for testing - takes a moment to register the new position internally. If reconciliation runs during those few seconds... the trade gets killed as an "orphan."

The irony: A safety feature designed to prevent phantom positions was creating them.
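
Here's a minimal sketch of how that failure can arise, assuming reconciliation boils down to a set difference between the trade IDs the database has open and the trade IDs the exchange reports. The function name `find_orphans` and the data shapes are hypothetical, not Trader-7's actual code.

```python
def find_orphans(db_open_trade_ids, exchange_trade_ids):
    """Return trade IDs the database considers open but the exchange does not report."""
    return sorted(set(db_open_trade_ids) - set(exchange_trade_ids))


# Trade 57 is already saved in the database, but the simulated exchange
# has not registered the position yet, so it reports zero positions.
db_open = [57]
exchange_reported = []

orphans = find_orphans(db_open, exchange_reported)
print(orphans)  # trade 57 is flagged even though it was opened moments ago
```

Nothing in this check knows how old the trade is, so a brand-new position and a genuinely stale one look identical.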

Bug #2: The Uninitialized Variable

The second bug was more insidious. When the first bug triggered, it caused an error that crashed the trade opener. But the error handler itself crashed:

# What the code tried to do:
self._cancel_all_orders([entry_order, tp1_order, tp2_order, stop_order])

# The problem:
UnboundLocalError: cannot access local variable 'entry_order'

The error handler assumed the orders existed. They didn't - the trade crashed before creating them. So the cleanup code crashed too, leaving the system in an undefined state.

The lesson: Error handlers must be more defensive than the code they're protecting.

The Fix

For the race condition, I added proper position routing:

# Before: Direct database write, no exchange notification
db.save_trade(trade)

# After: Route through exchange adapter first, then database
order_manager.open_perpetual_position(trade)  # Updates exchange state
db.save_trade(trade)  # Then save to DB

For the uninitialized variable:

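# Initialize every order name before the try block,
# so the handler can reference them even if nothing was created
entry_order = None
tp1_order = None
tp2_order = None
stop_order = None

try:
    entry_order = exchange.create_order(...)
    # ...
except Exception:
    # Now safe to reference: cancel only the orders that actually exist
    candidates = [entry_order, tp1_order, tp2_order, stop_order]
    orders = [o for o in candidates if o is not None]
    self._cancel_orders(orders)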

8 Hours of Nothing (The Good Kind)

After deploying the fixes, we ran the system for 8 hours. Result: 0 trades executed.

This might sound bad. It's actually perfect.

The market today was choppy - ADX between 18 and 25 (weak trend), RSI in the neutral 35-55 zone, price oscillating between moving averages. No clear setups. Our AI strategist evaluated 32 opportunities and rejected all of them.

That's exactly what it should do.

The system's job isn't to trade constantly. It's to:

  1. Find high-probability setups
  2. Execute with discipline
  3. Sit out when conditions suck

Today's 0 trades with $0 lost is better than forcing trades in a choppy market.

The Data Collection Waiting Game

I'm working on Sprint 30: Entry Timing System. The idea is to score entry quality before execution - are we entering too late in the move? Is price extended from moving averages? Is RSI at extremes?

To build this, I need data from actual trades. But I can't get data if the market doesn't cooperate.

Current status: System running 24/7, waiting for trending conditions where:

  • ADX > 25 (strong trend)
  • RSI at extremes (< 30 or > 70)
  • Price breaking key levels

When those conditions appear, trades will execute, and I'll collect the timing data I need.
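
As a rough sketch, the three conditions above could be expressed as a single gate. I'm assuming here that all three must hold at once; the `MarketSnapshot` type, the `is_trending_setup` name, and the `broke_key_level` flag are hypothetical, and only the thresholds come from the list.

```python
from dataclasses import dataclass


@dataclass
class MarketSnapshot:
    adx: float             # trend strength
    rsi: float             # momentum oscillator, 0-100
    broke_key_level: bool  # assumed to be computed elsewhere


def is_trending_setup(snap, adx_min=25.0, rsi_low=30.0, rsi_high=70.0):
    """True only when all three conditions hold (an assumption, see above)."""
    strong_trend = snap.adx > adx_min
    rsi_extreme = snap.rsi < rsi_low or snap.rsi > rsi_high
    return strong_trend and rsi_extreme and snap.broke_key_level


# A choppy reading like today's: weak ADX, neutral RSI, no breakout.
choppy = MarketSnapshot(adx=21.0, rsi=45.0, broke_key_level=False)
print(is_trending_setup(choppy))  # False
```

Keeping the thresholds as parameters makes it easy to tune them later without touching the logic.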

What I Learned This Week

1. Race Conditions Hide in Plain Sight

This bug existed for at least 3 days before I caught it. It only appeared when trades opened in specific timing windows - near the hourly reconciliation cycle. Most trades were fine. The occasional one just... vanished.

Logs showed the symptom ("auto-closing orphan") but not the cause. I had to trace execution timing millisecond by millisecond to find the race.

2. Safety Features Can Backfire

Reconciliation exists to prevent phantom positions - trades that the database thinks exist but the exchange doesn't report. Noble goal. But without a "grace period" for newly-opened trades, it became the very thing it was trying to prevent.

Always ask: "What happens if this safety feature triggers at the wrong time?"
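
One way a grace period could look, as a sketch only: skip any unreported trade that is younger than some threshold. The 30-second default, the function name, and the `trade_id -> opened_at` mapping are all assumptions for illustration, and trade 12 below is a made-up stale trade, not one from my logs.

```python
from datetime import datetime, timedelta


def find_orphans_with_grace(db_trades, exchange_trade_ids, now,
                            grace=timedelta(seconds=30)):
    """db_trades maps trade_id -> opened_at. Returns IDs safe to treat as orphans."""
    reported = set(exchange_trade_ids)
    orphans = []
    for trade_id, opened_at in db_trades.items():
        if trade_id in reported:
            continue  # exchange knows about it, nothing to do
        if now - opened_at < grace:
            continue  # too new to judge; check again next cycle
        orphans.append(trade_id)
    return sorted(orphans)


db_trades = {
    57: datetime(2025, 12, 13, 6, 55, 48),  # opened 3 seconds before the check
    12: datetime(2025, 12, 13, 5, 0, 0),    # hypothetical stale trade
}
now = datetime(2025, 12, 13, 6, 55, 51)

print(find_orphans_with_grace(db_trades, [], now))  # [12]
```

The stale trade is still caught, while the 3-second-old trade gets another cycle to show up on the exchange side.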

3. Error Handling Needs Its Own Error Handling

When code crashes, the error handler runs. But if the error handler references state that doesn't exist... the error handler crashes. Now you have two problems.

Defensive pattern: Initialize everything to None or empty values before try blocks. Check for None in exception handlers.
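
To see the pattern run end to end, here's a self-contained sketch. `FakeExchange` is a stand-in I made up for this example (it fails on the second order on purpose), and `open_trade` is a simplified, hypothetical version of a trade opener, not the real one.

```python
class FakeExchange:
    """Stand-in exchange that rejects every order after the first one."""

    def __init__(self):
        self.created = []
        self.cancelled = []

    def create_order(self, label):
        if len(self.created) >= 1:
            raise RuntimeError(f"exchange rejected {label}")
        self.created.append(label)
        return label

    def cancel_order(self, order):
        self.cancelled.append(order)


def open_trade(exchange):
    # Bind every name before the try block so the handler can always read them.
    entry_order = tp1_order = stop_order = None
    try:
        entry_order = exchange.create_order("entry")
        tp1_order = exchange.create_order("tp1")  # raises here
        stop_order = exchange.create_order("stop")
        return True
    except RuntimeError:
        # Cancel only the orders that were actually created.
        for order in (entry_order, tp1_order, stop_order):
            if order is not None:
                exchange.cancel_order(order)
        return False


exchange = FakeExchange()
print(open_trade(exchange))  # False
print(exchange.cancelled)    # ['entry']
```

The handler runs cleanly even though two of the three orders never existed, and the one that did exist gets cancelled.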

4. Patience Is Part of the System

Not trading isn't failure. It's discipline.

Our system rejected 32 low-quality setups today. That's 32 potential losses avoided. The capital we preserved will be there when a genuine opportunity appears.

Tomorrow

The system keeps running. We watch for regime changes. When the market starts trending again:

  • Trades will execute (without being killed by ghosts)
  • Entry timing data gets collected
  • Sprint 30 progresses

Sometimes building in public means waiting in public. Today was a good day for stability. Tomorrow might be a good day for trading.

Or not. The market decides.


Building Trader-7: An AI-powered crypto trading system. One bug fix at a time.
