StrategyXL
Methodology

The five biases that quietly inflate backtest results

Shawn Cherry
#backtesting #methodology #risk #quantitative

Backtesting is the single most useful exercise in systematic trading — and it’s also where systematic traders quietly fool themselves. Not because the math is wrong, but because every backtest has assumptions baked in, and a few of those assumptions reliably make your results look better than reality will deliver.

This is a list of the five most common ways I’ve watched myself (and other people) get fooled by an otherwise well-intentioned backtest. It’s also, indirectly, an explanation of what StrategyXL does and doesn’t do about each one.

1. Look-ahead bias

What it is: Using information in your signal that wouldn’t actually have been available when you made the trade.

The classic version is “buy when today’s close is above the 20-day moving average.” That sounds fine until you realize today’s close prints at 4:00 PM, at which point you can no longer trade at today’s close. Either the trade has to happen tomorrow morning at the open (different price, different result), or you cheated by acting on a price you could not yet have known.

Slightly subtler versions: indicators that get re-computed across the whole history each time a new bar arrives (some libraries’ “full series” mode does this), or any logic that touches data[i+1] to decide what to do at data[i].

What StrategyXL does about it: Signals are computed bar-by-bar with only the data that would have been available at that bar. An indicator value at day N is computed from days 0 through N — never from N+1 onward. Entry and exit decisions are evaluated against historical bars in chronological order, with no peeking forward. This is the floor of any honest backtest engine, and it’s the first thing to verify in any tool you don’t fully trust.
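The difference is easy to see in a few lines of plain Python. This is a minimal sketch with made-up prices; the function names (`sma`, `entry_fills`) and the sample bars are illustrative assumptions, not StrategyXL's internals.

```python
from typing import List, Optional, Tuple


def sma(closes: List[float], window: int, i: int) -> Optional[float]:
    """Simple moving average at bar i, using only bars i-window+1 .. i."""
    if i + 1 < window:
        return None  # not enough history yet
    return sum(closes[i - window + 1 : i + 1]) / window


def entry_fills(opens: List[float], closes: List[float], window: int,
                lookahead: bool) -> List[Tuple[int, float]]:
    """Return (bar index, fill price) for each bar where close > SMA.

    lookahead=True  fills at the signal bar's own close (not achievable live).
    lookahead=False fills at the next bar's open (what you could trade).
    """
    fills = []
    for i in range(len(closes)):
        avg = sma(closes, window, i)
        if avg is None or closes[i] <= avg:
            continue
        if lookahead:
            fills.append((i, closes[i]))
        elif i + 1 < len(opens):
            # A signal on the last bar has no next open, so it is skipped.
            fills.append((i + 1, opens[i + 1]))
    return fills


# Hypothetical data: five daily bars.
opens = [10.0, 10.2, 10.1, 10.9, 11.4]
closes = [10.1, 10.0, 10.8, 11.2, 11.3]

print("same-bar close:", entry_fills(opens, closes, window=3, lookahead=True))
print("next-bar open: ", entry_fills(opens, closes, window=3, lookahead=False))
```

The signal logic is identical in both runs. Only the fill changes, and in this toy series the honest version pays more on every entry and gets no fill at all for the signal on the final bar.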

2. Survivorship bias

What it is: Testing your strategy only on companies that survived to the present day.

Pick “the current S&P 500 constituents” as your test universe and run a 20-year backtest. The results look great. Of course they do — every company in your universe was successful enough to still be in the S&P 500 today. The Lehmans, the Eastman Kodaks, the Sears, the WorldComs — companies that went to zero, got acquired at distressed prices, or got delisted — are silently absent. Your backtest didn’t have to navigate any of them.

Survivorship bias is one of the largest single sources of inflated backtest performance, and it’s nearly invisible. A “buy quality stocks” strategy looks brilliant against today’s S&P 500 because today’s S&P 500 is, by construction, the set of companies that turned out to be quality stocks.

What StrategyXL does about it: The honest answer is: this one is partly on you. The Index Holdings feature shows the current SPY and (when the data source is back) QQQ constituents — useful for a starting universe, but inherently biased toward survivors. Tiingo retains historical data for many delisted US-listed tickers if you query them by symbol — coverage isn’t universal, but it’s good enough that you can build a more survivorship-aware test by including known historical names that no longer trade. The point is to be aware of which kind of test you’re running.
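A toy calculation shows how much the choice of universe can matter. The tickers and returns below are invented for illustration and are not real market data.

```python
from statistics import mean

# Hypothetical 20-year total returns (1.0 = +100%, -1.0 = went to zero).
# Tickers and numbers are made up for illustration only.
universe_at_start = {
    "AAA": 3.20,   # still listed today
    "BBB": 1.10,   # still listed today
    "CCC": 0.45,   # still listed today
    "DDD": -1.00,  # bankrupt, delisted
    "EEE": -0.70,  # acquired at a distressed price
}
still_listed_today = {"AAA", "BBB", "CCC"}

# Average return if you only test names that exist today.
survivors_only = mean(
    r for t, r in universe_at_start.items() if t in still_listed_today
)
# Average return over everything you could have bought at the start.
full_universe = mean(universe_at_start.values())

print(f"survivors only: {survivors_only:+.2%}")
print(f"full universe:  {full_universe:+.2%}")
```

The gap between the two averages is the part of the result that comes from picking the universe with hindsight, not from the strategy.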

3. Overfitting and curve-fitting

What it is: Tweaking parameters until your backtest looks great, with no validation against unseen data.

You start with a 50/200 moving average crossover and the backtest is mediocre. You try 30/150 — better. You try 35/170 — better still. You try 33/167 — better again. After 30 iterations you find a parameter set that produces a 22% CAGR with a 12% max drawdown. You ship it. Six months later it loses money because what you actually found was the parameters most exquisitely tuned to the noise in your historical sample.

The technical name is in-sample overfitting. The tell is that performance gets worse the moment you change anything — the date range, the ticker, the parameters by 1%. A robust strategy doesn’t behave that way; an overfit one does.

What StrategyXL does about it: The product encourages logging every test, not cherry-picking winners. The Save Defaults feature plus the Results History sheet make it natural to keep a record of every parameter set you’ve tried, including the failures. That sounds like a small thing, but it’s the difference between remembering “I tested 5 strategies and one worked” when you actually tried far more (selection bias) and knowing “I tested 47 strategies and one worked” (probably random).
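The arithmetic behind “probably random” is short. The sketch below assumes each test is independent and has a 5% chance of looking good by luck alone. Both are simplifying assumptions, since real parameter variations are usually correlated.

```python
def chance_of_false_winner(n_tests: int, p_false_positive: float) -> float:
    """P(at least one of n independent tests looks good by luck alone)."""
    return 1.0 - (1.0 - p_false_positive) ** n_tests


for n in (1, 5, 47):
    print(n, round(chance_of_false_winner(n, 0.05), 3))
```

Under those assumptions, five tries give roughly a 23% chance of at least one lucky winner, and 47 tries push it to about 91%. Without a full log you cannot tell which of those two situations you are in.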

What’s not in the tool yet: built-in parameter sweeps (run 50/100/150/200 across one batch and rank the results) and walk-forward validation (train on years 1–7, validate on years 8–10, slide the window forward). Both are on the public roadmap and are direct counterweights to overfitting. They’ll matter most for users running serious parameter searches.
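Until walk-forward validation is built in, the window bookkeeping can be done by hand. Here is a minimal sketch of the sliding train/validate split described above; the function name, the year range, and the one-year step are assumptions for illustration, not a StrategyXL feature.

```python
from typing import Iterator, Tuple


def walk_forward_windows(first_year: int, last_year: int,
                         train_years: int, test_years: int,
                         step_years: int = 1
                         ) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) year ranges that slide forward through history.

    Each test range starts right after its train range ends, so
    parameters are always judged on years they were not tuned on.
    """
    start = first_year
    while start + train_years + test_years - 1 <= last_year:
        train = range(start, start + train_years)
        test = range(start + train_years, start + train_years + test_years)
        yield train, test
        start += step_years


# Example: 7 years of tuning followed by 3 years of validation.
for train, test in walk_forward_windows(2005, 2024, train_years=7, test_years=3):
    print(f"train {train[0]}-{train[-1]}  validate {test[0]}-{test[-1]}")
```

For each window, tune parameters on the training years only, then record how that exact parameter set performs on the validation years. A large and consistent drop from the first to the second is the overfitting tell.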

4. Cherry-picked date ranges

What it is: Backtesting on a window where your strategy happened to work, and not on the windows where it would have failed.

Run any momentum strategy on US large-caps from March 2009 through January 2020 and it’ll print money. So would a coin flip. The market was in a generational bull run, and virtually any “buy and hold something risky” strategy looked great. The interesting question is what your strategy does during the next 2008, the next 2000, or the next 2022.

This bias is especially nasty because it’s often unintentional. People naturally test against “recent history” — usually the last 5 to 10 years — because it’s easy to fetch and feels relevant. But “the last 10 years” of US equities has been historically unusual.

What StrategyXL does about it: Default Start dates pull from the beginning of the prior calendar year, but you can extend back as far as Tiingo’s data goes — for major US equities and indices, that’s typically 30+ years. The Stock Backtest template defaults make it easy to span multiple market regimes without thinking about it. The discipline is on you to actually do that, and to be skeptical of any strategy that only works on one decade.
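One way to make that discipline routine is to compare the same strategy's growth rate across several distinct windows. The sketch below assumes you already have start and end equity values from separate backtest runs; the window names and dollar figures are invented placeholders, not results.

```python
from datetime import date


def cagr(start_value: float, end_value: float, start: date, end: date) -> float:
    """Compound annual growth rate between two dates."""
    years = (end - start).days / 365.25
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical equity values for one strategy over three test windows.
# The numbers are invented; substitute your own backtest output.
windows = [
    ("2000-2002", date(2000, 1, 3), date(2002, 12, 31), 10_000, 7_400),
    ("2007-2009", date(2007, 1, 3), date(2009, 12, 31), 10_000, 9_100),
    ("2010-2019", date(2010, 1, 4), date(2019, 12, 31), 10_000, 31_000),
]

results = {name: cagr(v0, v1, d0, d1) for name, d0, d1, v0, v1 in windows}
for name, rate in results.items():
    print(f"{name}: {rate:+.1%}")

# A strategy that is only positive in one window deserves extra suspicion.
positive = sum(r > 0 for r in results.values())
print(f"windows with positive CAGR: {positive} of {len(results)}")
```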

5. Unrealistic execution

What it is: Assuming your backtest’s fills, costs, and timing match what you’d actually get with real money.

Many backtests assume you can buy at the closing print the moment a signal fires, pay zero commission, and incur no slippage. None of that holds exactly in practice: you can’t act on a closing price at that same close, your fill may be the next session’s open if you weren’t there to enter the order in time, large orders move illiquid stocks, and even a fraction of a percent of slippage compounds across hundreds of trades.

For liquid US equities at retail size with a no-commission broker, these effects are usually small. For illiquid names, leveraged or short positions, or anything time-sensitive, they’re not small at all.
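To get a feel for how slippage compounds, here is a rough sketch. It assumes the whole account is traded on every entry and every exit, which overstates the drag for a strategy that only deploys part of its capital.

```python
def slippage_drag(slippage_pct: float, round_trips: int) -> float:
    """Fraction of equity lost to slippage alone, compounded over all fills.

    Assumes the full account is traded on every entry and exit,
    so each round trip pays the slippage twice.
    """
    per_fill = 1.0 - slippage_pct / 100.0
    return 1.0 - per_fill ** (2 * round_trips)


for trips in (10, 50, 200):
    print(trips, f"{slippage_drag(0.10, trips):.1%}")
```

At 0.10% per fill, 200 round trips cost roughly a third of the account under this worst-case sizing assumption, before the strategy has made or lost anything on its own.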

What StrategyXL does about it: It executes entries and exits at the next bar’s open rather than the same-bar close — a deliberately conservative choice that prevents the “I saw the close before I traded” version of look-ahead. It also applies an adverse slippage percentage on every fill (default 0.10%, configurable in the template) and a commission per trade (default $0 to match no-fee retail accounts, but configurable for assets where it matters). The honest framing is: with the defaults, treat the result as the theoretical case for liquid US equities at small size with a no-fee broker. For less liquid names or anything where execution cost is non-trivial, raise commission and slippage explicitly until they reflect what you’d actually pay.
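The execution model described above can be sketched as follows. The defaults mirror the numbers quoted in this post (0.10% slippage, $0 commission), but the function names and the choice to charge commission once per fill are assumptions for illustration, not StrategyXL's actual code.

```python
def fill_price(next_open: float, side: str, slippage_pct: float = 0.10) -> float:
    """Price paid or received at the next bar's open after adverse slippage.

    Buys fill slightly above the open, sells slightly below it.
    """
    adj = slippage_pct / 100.0
    if side == "buy":
        return next_open * (1.0 + adj)
    if side == "sell":
        return next_open * (1.0 - adj)
    raise ValueError("side must be 'buy' or 'sell'")


def round_trip_pnl(entry_open: float, exit_open: float, shares: int,
                   slippage_pct: float = 0.10, commission: float = 0.0) -> float:
    """Net profit of a long trade entered and exited at next-bar opens.

    Commission is assumed to be charged once on entry and once on exit.
    """
    buy = fill_price(entry_open, "buy", slippage_pct)
    sell = fill_price(exit_open, "sell", slippage_pct)
    return (sell - buy) * shares - 2 * commission


# Hypothetical trade: 100 shares, enter at a 50.00 open, exit at a 52.00 open.
print(round_trip_pnl(50.0, 52.0, 100, slippage_pct=0.0))      # frictionless
print(round_trip_pnl(50.0, 52.0, 100))                        # post's defaults
print(round_trip_pnl(50.0, 52.0, 100, 0.50, commission=1.0))  # less liquid name
```

Running the same trade under the three cost settings shows how quickly the net result shrinks as costs become more realistic for harder-to-trade names.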

The point

None of this is a reason not to backtest. Backtests are still the best tool we have for separating “this strategy makes sense” from “I just had a good feeling about it.” But every backtest result needs to be approached with the assumption that it’s slightly too good — that the historical data was kinder to your strategy than the future will be — and the tool you’re using should help you see exactly why it might be too good.

That’s what StrategyXL tries to be — a tool that gives you a more informed view of how a strategy would have performed. Keeping the biases above in mind as you read your results is what will keep your expectations grounded before real capital goes behind a trade.

If you have feedback on any of this — disagreement, additional pitfalls I missed, or specific features you’d like to see for catching them — email ideas@strategyxl.com or check the public roadmap.
