Study notes for Chapters 2, 3, 5, 6, and 7 of Quantitative Trading (2nd Ed.) by Ernest P. Chan.
The following sources are recommended for finding quantitative trading strategy ideas. Most are free or low-cost. The key caveat: treat all of them as starting points for modification, not ready-made strategies.
| Category | Source | URL |
|---|---|---|
| Academic | Business school finance professors’ websites | hbs.edu/research |
| Academic | Social Science Research Network (SSRN) | ssrn.com |
| Academic | National Bureau of Economic Research (NBER) | nber.org |
| Academic | Quantitative finance seminars | ieor.columbia.edu |
| Academic | Quantpedia (aggregator of academic quant papers) | quantpedia.com |
| Blogs & podcasts | Flirting with Models | thinknewfound.com |
| Blogs & podcasts | Mutiny Fund podcast | mutinyfund.com/podcast |
| Blogs & podcasts | Chat with Traders | chatwithtraders.com |
| Blogs & podcasts | Eran Raviv | eranraviv.com |
| Blogs & podcasts | Party at the Moontower | moontowermeta.com |
| Blogs & podcasts | Ernest Chan’s blog | epchan.blogspot.com |
| Trader forums | Elite Trader | elitetrader.com |
| Trader forums | Wealth-Lab | wealth-lab.com |
| Twitter / X | Benn Eifert | @bennpeifert |
| Twitter / X | Corey Hoffstein | @choffstein |
| Twitter / X | Quantocracy (aggregator of new quant articles) | @Quantocracy |
| Twitter / X | Mike Harris | @mikeharrisNY |
| Twitter / X | Euan Sinclair | @sinclaireuan |
| Twitter / X | Ernest Chan | @chanep |
| Newspapers & magazines | Stocks, Futures and Options magazine | sfomag.com |

| Factor | Low capital (<$100K) | High capital (>$100K) |
|---|---|---|
| Account type | Proprietary trading firm | Retail brokerage |
| Instruments | Futures, FX, options | Everything incl. stocks |
| Holding period | Intraday only | Intra- and overnight |
| Position type | Directional only | Directional or market-neutral |
| Data quality | Daily, survivorship-biased OK | Tick-level, survivorship-free |
| News access | Delayed/low-coverage | Real-time Bloomberg-tier |

| Sharpe ratio | Interpretation | Viability |
|---|---|---|
| < 1.0 | High volatility relative to returns | Not viable as standalone |
| ≥ 1.0 | Minimum acceptable threshold | Borderline |
| ≥ 2.0 | Profitable almost every month | Good |
| ≥ 3.0 | Profitable almost every day | Excellent |
Formula:
\[\text{Sharpe Ratio} = \frac{\text{Avg Portfolio Return} - \text{Risk-Free Rate}}{\text{Std Dev of Portfolio Returns}}\]

Annualization:

\[\text{Daily Sharpe} \times \sqrt{252} \quad \text{or} \quad \text{Monthly Sharpe} \times \sqrt{12}\]

| Component | Definition | Typical size |
|---|---|---|
| Commission | Explicit broker fee per trade | Negligible with discount brokers |
| Bid-ask spread | Gap between bid and ask prices | ~5 bps one-way for S&P 500 stocks; ~1 bp for ES futures |
| Market impact | Your own order moves the price against you | Dominant cost for large/illiquid positions |
| Slippage | Difference between signal price and execution price | Depends on the delay between signal and execution |
A reference glossary of financial terms:
| Tool | Good for | Best thing about it | Watch out for |
|---|---|---|---|
| Excel | Simple daily strategies, part-time traders | Everything visible on screen — very hard to make hidden mistakes | Cannot handle hundreds of stocks or complex models |
| MATLAB | Serious quant research, large stock universes | Fast, powerful statistics, real customer support | Costs money (home version is affordable though) |
| Python | Building production systems, using machine learning | Free, huge library of add-ons, most popular in industry | Packages often break each other; slower than MATLAB; no support |
| R | Classical statistics and econometrics | Best statistical packages available anywhere | Weaker machine learning support; basic interface |
| QuantConnect | Research all the way through to live trading | 400TB of data included; Python or C#; backtest and live trading use identical code | Requires programming knowledge |
| Blueshift | Python users who want data and backtesting in one place | Free minute-by-minute data; also has a no-code visual builder | Fewer markets than QuantConnect |

| Source | What it covers | Good things | Problems |
|---|---|---|---|
| finance.yahoo.com | Daily stocks | Free; already adjusted for splits and dividends | Missing bankrupt stocks (survivorship bias); can only download one stock at a time |
| Sharadar.com | Daily stocks | Includes delisted stocks — no survivorship bias | Paid subscription |
| Algoseek.com | Stocks, futures, tick-by-tick data | Rent the data instead of buying it; very detailed | Moderately expensive |
| CSIdata.com | Daily stocks and futures | Cheap; can download many stocks at once | Has survivorship bias (delisted history can be bought separately) |
| CRSP.com | Daily stocks | No survivorship bias; very clean data | Expensive; only updated once a month |
| Tickdata.com | Tick-by-tick stocks and futures | Institutional-grade quality | Expensive |
| Interactive Brokers | Forex data | Free if you already have an IB account | Need an IB account first |
Annualization:
⚠️ Common mistake: treating hourly data as having $252 \times 24 = 6{,}048$ periods per year. Only count actual trading hours (6.5 hours per NYSE day), giving $252 \times 6.5 = 1{,}638$ hourly periods, so an hourly Sharpe is annualized by $\sqrt{1638}$, not $\sqrt{6048}$.
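A minimal sketch of the calculation, assuming a list of per-period excess returns (the function name and inputs are illustrative, not from the book):

```python
import statistics

def annualized_sharpe(excess_returns, periods_per_year):
    """Sharpe ratio from per-period excess returns, scaled by sqrt(periods per year)."""
    m = statistics.mean(excess_returns)
    s = statistics.stdev(excess_returns)
    return (m / s) * periods_per_year ** 0.5

# Hourly bars: 252 trading days x 6.5 NYSE hours = 1638 periods, NOT 252 x 24 = 6048
HOURLY_PERIODS = 252 * 6.5
```

Daily data would use `periods_per_year=252`, monthly data `periods_per_year=12`, matching the annualization formulas above.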
Example: a strategy returns 20% per year and its worst peak-to-trough loss (maximum drawdown) was 40%:
\(\text{MAR ratio} = \frac{20\%}{40\%} = 0.5\)
Higher is better. Useful because it roughly cancels out leverage effects, making it easier to compare strategies with different risk levels.
| Your backtest Sharpe ratio | You want confidence that live trading will achieve Sharpe of at least… | Minimum data needed |
|---|---|---|
| 1.0 | Greater than 0 | 681 trading days — about 2.7 years of daily data |
| 2.0 | Greater than 0 | 174 trading days — about 0.7 years of daily data |
| 1.5 | Greater than 1.0 | 2,739 trading days — about 10.9 years of daily data |
These requirements apply to paper trading (running a live strategy with no real money) just as much as to the backtest itself: you need the same amount of clean out-of-sample data to be confident your strategy is real. In other words, when you split your data into training and test sets, each set individually must meet these minimum lengths.
Split your historical data into two halves:
If the strategy works well on both halves, the result is much more credible. If it only works on the training half, it was overfit to the past and is unlikely to hold up in real trading.
The idea: GLD tracks the price of gold. GDX holds a basket of gold-mining company stocks. Since gold miners' profits depend on gold prices, these two ETFs should generally move together. When their prices drift unusually far apart, we bet they will snap back — this is called pair trading.
How it works:
- Calculate the "spread" = GLD price − (1.637 × GDX price). The 1.637 is the hedge ratio — how many dollars of GDX to balance against each dollar of GLD, found by running a simple regression on the training data
- When the spread drops more than 2 standard deviations below normal → **buy the spread** (buy GLD, short GDX)
- When the spread rises more than 2 standard deviations above normal → **short the spread** (short GLD, buy GDX)
- Exit when the spread returns to within 1 standard deviation of normal
Splitting the data:
- Training set: first 252 trading days (about 1 year)
- Test set: all remaining days — the strategy runs here without any changes
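The entry/exit rules above can be sketched in a few lines. This is a simplified illustration, not the book's code: the hedge ratio 1.637 comes from the text, but the price series and the use of training-period statistics to define "normal" are placeholder assumptions.

```python
import statistics

HEDGE_RATIO = 1.637  # from the training-period regression described in the text

def spread(gld, gdx):
    """Spread = GLD price minus hedge-ratio-weighted GDX price."""
    return [g - HEDGE_RATIO * x for g, x in zip(gld, gdx)]

def positions(spread_series, train_len, entry=2.0, exit_z=1.0):
    """+1 = long spread (buy GLD, short GDX), -1 = short spread, 0 = flat.
    'Normal' is defined by the training period's mean and standard deviation."""
    train = spread_series[:train_len]
    mu, sd = statistics.mean(train), statistics.stdev(train)
    pos, out = 0, []
    for s in spread_series:
        z = (s - mu) / sd
        if pos == 0:
            if z < -entry:
                pos = 1          # spread unusually low: buy it
            elif z > entry:
                pos = -1         # spread unusually high: short it
        elif abs(z) < exit_z:
            pos = 0              # back within 1 std dev of normal: exit
        out.append(pos)
    return out
```

With real data, the test set would simply be the portion of `spread_series` after `train_len`, run with the thresholds fixed.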
Results with default settings (entry at ±2 standard deviations, exit at ±1):

Plain-English definitions for every key term introduced in Chapter 3.
| Step | What happens | Example |
|---|---|---|
| 1. Data retrieval | Pull fresh prices and any other data needed | Download today’s closing prices for all 500 S&P stocks |
| 2. Signal generation | Run your algorithm to decide what to trade | Strategy says: buy IGE, short SPY |
| 3. Order creation | Turn the signals into a proper order list | Generate file: ("IGE", "BUY", "200"), ("SPY", "SELL", "150") |
| 4. Order transmission | Send orders to your broker | Upload to basket trader or send via API |
| 5. Monitoring | Watch for fills, cancel anything unfilled at end of day | Press “cancel unfilled” before close |

| Cost type | What it is | How to reduce it |
|---|---|---|
| Commission | Fee your broker charges per trade | Use low-cost brokers; avoid high-turnover strategies |
| Bid-ask spread | Gap between buy price and sell price | Avoid low-priced stocks; use limit orders where possible |
| Market impact | Your own buying pushes prices up; your own selling pushes them down | Keep order size below 1% of average daily volume |
| Slippage | Price moves between when your signal fires and when the order fills | Use faster execution infrastructure; choose brokers with better speed |

| What you discover | Why backtesting misses it |
|---|---|
| Software bugs in your ATS | Backtesting runs on stored historical data; bugs only show up with live systems |
| Look-ahead bias | In a backtest you have all the data; in live trading you discover what you actually cannot obtain in time |
| Data feed problems | Historical data is clean; live feeds drop out, arrive late, or contain bad ticks |
| Operational timing issues | How long does downloading + order generation + transmission actually take? |
| Transaction cost estimates | Your real fills often differ from theoretical ones; paper trading gives you real fill prices |
| Data-snooping bias | A month of live paper trading is a genuine out-of-sample test |
Before April 9, 2001, US stock prices were quoted in fractions — sixteenths or eighths of a dollar (e.g., $10 and 3/16). Those wide fractional price increments created friction in the market that statistical arbitrage traders could exploit.
When the US switched to fully decimal pricing on April 9, 2001, those fractions disappeared. Bid-ask spreads narrowed dramatically. The friction that stat arb traders relied on was reduced significantly, and many strategies that looked great in pre-2001 backtests stopped working in the decimal era.
Practical implication: If your backtest data extends before 2001, the pre-decimalization period will show much better performance than you should expect going forward. Be especially sceptical of any strategy that shows most of its historical edge in the pre-2001 period.
If your strategy involves shorting stocks, there is a specific regulatory trap in the historical data.
Before June 2007, the SEC's "uptick rule" stated that you could only short a stock on an "uptick" — meaning the last trade had to have been at a higher price than the one before it. This rule prevented short sellers from piling on during a price decline. In practice, it meant that many profitable short positions simply could not be entered during fast-moving markets.
Before June 2007: Uptick rule in force. Shorting is constrained. Backtest performance for short strategies is artificially inflated because the backtest ignores the uptick constraint.
June 2007 – February 2010: No uptick rule at all. Shorting is unrestricted. This is the most realistic period for backtesting short strategies.
After February 2010: Alternative uptick rule (Rule 201) introduced. Shorting is restricted again when a stock drops more than 10% in a day.
Additional complication — hard-to-borrow stocks: Even when the uptick rule does not apply, many stocks — especially small-caps with low liquidity — are "hard to borrow." To short a stock, your broker has to borrow it from someone else (usually a mutual fund or another client). If no one will lend it, you simply cannot short it, regardless of what your backtest says. This can eliminate many of the best short opportunities in a strategy.
Here is the puzzle:
A stock goes up exactly 1% or down exactly 1% each minute, with equal 50/50 probability. If you buy this stock, will you — in the long run — make money, lose money, or break even?
Most experienced traders answer: break even. That answer is wrong.
You will slowly lose money — at a rate of about 0.005% per minute (0.5 basis points per minute).
Why? Because the mathematics of compounding is not symmetric. Consider two minutes:
- Minute 1: up 1% → your \$100 becomes \$101
- Minute 2: down 1% → your \$101 becomes \$99.99
You did not break even — you lost \$0.01. The 1% gain and 1% loss are equal in percentage terms, but they are applied to different base amounts, so they do not cancel out.
More precisely, the long-run compounded growth rate is:
\[g = m - \frac{s^2}{2}\]
Where:
$m$ = average return per period (0% here)
$s$ = standard deviation per period (1% here)
$\frac{s^2}{2} = \frac{0.01^2}{2} = 0.00005 = 0.005\%$
So:
\(g = -0.005\% \text{ per minute — slowly losing money}\)
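Both the direct arithmetic and the $g = m - s^2/2$ approximation can be checked in a couple of lines:

```python
import math

# Two-minute cycle: up 1%, then down 1%
wealth = 100.0 * 1.01 * 0.99      # 99.99 -- a $0.01 loss, not break-even

# Per-minute compounded growth rate = geometric mean of the two factors, minus 1
g = math.sqrt(1.01 * 0.99) - 1    # ~ -0.00005 = -0.005% per minute

# Matches the approximation g = m - s^2/2 with m = 0, s = 0.01
approx = 0.0 - 0.01**2 / 2
```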
The full general formula extends this to any leverage:
\[g(f) = r + f \cdot m - \frac{f^2 \cdot s^2}{2}\]
Where:
$g(f)$ = compounded growth rate when using leverage $f$
$r$ = risk-free rate (what your cash earns if not invested)
$f$ = leverage (1 = no leverage, 2 = 2× leverage, etc.)
$m$ = average one-period excess return (return minus risk-free rate)
$s$ = standard deviation of one-period returns
$s^2$ = variance of returns
For a single strategy:
\[f = \frac{m}{s^2}\]
where $m$ is the average one-period excess return and $s^2$ is the variance of returns, as defined above.
Example:
$m = 7\%$, $s = 15\%$
\[f = \frac{0.07}{(0.15)^2} = \frac{0.07}{0.0225} = 3.11\]
This means: for every \$1 of your own capital, borrow an additional \$2.11 to invest a total of \$3.11.
The strategy: Simply buy and hold SPY — the ETF that tracks the S&P 500 index.
Historical numbers (at the time of calculation):
- Average annual return: 11.23%; risk-free rate: 4% → excess return $m = 7.23\%$
- Annualised standard deviation of returns: $s = 16.91\%$
Step 1 — Calculate the Sharpe ratio:
\[\text{Sharpe ratio} = \frac{7.23\%}{16.91\%} = 0.428\]
Step 2 — Calculate the Kelly leverage:
\[f = \frac{7.23\%}{(16.91\%)^2} = \frac{0.0723}{0.02860} = 2.53\]
So Kelly says: if you have \$100,000, borrow money to invest a total of \$253,000 in SPY.
Step 3 — Calculate the optimal compounded growth rate:
\[g_{\max} = r_f + \frac{\text{Sharpe}^2}{2} = 4\% + \frac{(0.428)^2}{2} = 13.1\%\]
(per year, compounded after financing costs)
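The three steps can be reproduced in a few lines, using the numbers from the text:

```python
# Inputs from the text: excess return, std dev of returns, risk-free rate
m, s, r = 0.0723, 0.1691, 0.04

sharpe = m / s                 # Step 1: Sharpe ratio, ~0.428
f = m / s**2                   # Step 2: Kelly leverage, ~2.53
g_max = r + sharpe**2 / 2      # Step 3: optimal compounded growth, ~13.1%/yr
```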
For comparison, if you just buy SPY with cash and no leverage:
\[g = 11.23\% - \frac{(16.91\%)^2}{2} = 11.23\% - 1.43\% = 9.8\% \text{ per year}\]

The three strategies (actually just three ETFs held simultaneously):
Historical performance (annualised excess returns above 4% risk-free rate):
| ETF | Annual excess return | What this means |
|---|---|---|
| OIH | +13.96% | Oil services — positive edge, should go long |
| RKH | +2.94% | Regional banks — small positive edge, should go long |
| RTH | −0.73% | Retail — slightly negative, Kelly says short it |
Kelly-recommended leverage for each ETF:
| ETF | Kelly leverage | Meaning |
|---|---|---|
| OIH | +1.29× | Long — invest 1.29× your equity in oil services |
| RKH | +1.17× | Long — invest 1.17× your equity in regional banks |
| RTH | −1.49× | Short — short 1.49× your equity in retail |
Combined portfolio result:
For reference, the best individual ETF (OIH) achieves only 12.78% compounded annual growth on its own. The combined portfolio of three ETFs beats OIH’s solo performance — even though RTH is being shorted and RKH barely contributes — because diversification across uncorrelated strategies reduces total volatility.
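The book computes the joint Kelly allocation for multiple strategies as $F = C^{-1}M$: the inverse of the covariance matrix of returns times the vector of mean excess returns. Below is a sketch using the mean returns from the table; the covariance matrix values here are purely hypothetical stand-ins, since the book estimates them from daily return data.

```python
import numpy as np

# Mean annual excess returns from the table (OIH, RKH, RTH)
M = np.array([0.1396, 0.0294, -0.0073])

# HYPOTHETICAL covariance matrix of annual returns (illustration only;
# the book estimates this from historical daily returns)
C = np.array([[0.1089, 0.0200, 0.0100],
              [0.0200, 0.0625, 0.0200],
              [0.0100, 0.0200, 0.0324]])

# Kelly leverages: F = C^-1 M (solve is more stable than explicit inversion)
F = np.linalg.solve(C, M)
```

Even with made-up covariances, the sign pattern matches the table: long OIH, long RKH, short RTH.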
Maximum safe leverage from historical loss:
Step 1: Find the single worst one-period loss in your backtest
Step 2: Decide the maximum one-period equity loss you could tolerate
Step 3: Maximum leverage = Max tolerable loss ÷ Worst historical loss
Example (from the book — S&P 500):
Worst single-day loss in history: −20.47% (Black Monday, October 19, 1987)
If you can tolerate a 20% one-day equity drop:
Max leverage = 20% ÷ 20.47% ≈ 0.98×
Meanwhile, half-Kelly recommends: 2.53× ÷ 2 = 1.26×
Use the more conservative: 0.98× — barely any leverage at all
This shows that even half-Kelly can sometimes be too aggressive once you factor in historical extremes.
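The three-step comparison translates directly into code (all numbers from the example above):

```python
worst_loss = 0.2047        # worst one-day S&P 500 loss (Black Monday, 1987)
tolerable = 0.20           # maximum one-day equity drop you could tolerate
kelly = 2.53               # full Kelly leverage from the SPY example

max_lev = tolerable / worst_loss       # ~0.98x from historical worst loss
half_kelly = kelly / 2                 # ~1.26x from half-Kelly
leverage = min(max_lev, half_kelly)    # always take the more conservative
```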
In August 2007, at the start of the subprime mortgage crisis, a series of large hedge funds experienced shocking losses — even funds that held zero mortgage-backed securities.
What happened:
The numbers:
The lesson: Risk management rules that make sense for each individual fund (reduce positions when you lose money) can combine to create a collective problem. When many funds all try to sell the same things at the same time, they create the very price crashes that trigger further selling. This is called financial contagion.
| Situation type | Price behaviour | Stop loss verdict | Example |
|---|---|---|---|
| News-driven: real bad news about the company | Trending — prices fall further | ✅ Use stop losses — the trend is likely to continue | Earnings fraud revealed, revenue collapse |
| Liquidity-driven: forced selling with no fundamental cause | Mean-reverting — prices eventually snap back | ❌ Avoid stop losses — patience is rewarded | Quant fund forced to liquidate, short squeeze |
| Sudden market crash | Gap down — stop loss executes far below trigger | ❌ Stop losses execute at crisis prices, not your target | Black Monday, pandemic announcement |
Psychological trap related to stop losses:
Scenario: Your automated system has a bug and enters a large position by mistake. You discover it quickly and have a big unrealised loss.
The rational response: Exit immediately. The position was entered in error — you have no model for whether it will recover.
The emotional response most traders take: Wait for mean reversion. “I’ll exit once it comes back a bit and the loss is smaller.”
The result: The position usually keeps losing, because there was no model that said now was a good time to hold it. The irrational wait for mean reversion turns a manageable loss into a larger one.
The rule: If you entered a position by mistake (software bug, wrong button pressed, data error), exit it immediately regardless of the current P&L. Never wait for mean reversion on an accidental position — you have no basis for assuming mean reversion will occur on that specific security at that specific time.
| Risk type | What it means in plain English | How to reduce it |
|---|---|---|
| Model risk | Your strategy never had a real edge, or market conditions changed | Have someone else independently replicate your backtest; keep strategies simple; update parameters regularly |
| Software risk | Your trading system has bugs that cause it to trade differently from your backtest | Compare live system trades vs. backtest trades on the same data daily; paper trade extensively |
| Operational risk | Physical infrastructure failures (internet, power, hardware) | Backup internet connection; uninterruptible power supply; clear emergency procedures |
After a large unexpected loss, the almost universal human response is to look at the historical data and ask: “What rule would have avoided this specific loss?” You then add that rule to your strategy.
The problem: This is pure data-snooping bias applied in real time. You are tuning your strategy to avoid a loss that has already happened, not to prevent future losses. Almost certainly, your new rule will:
The correct response: If you feel the strategy genuinely needs improvement after a loss, run a proper backtest of the modified version over a long historical period — not just the recent weeks that hurt you. If the modified strategy genuinely outperforms on a multi-year backtest, the change may be justified. If it only helps for the recent period, you are curve-fitting to recent noise.
Disaster 1 — Greed at an institutional fund: While working at a money management firm, a strategy had been running successfully for about six months. In a fit of enthusiasm (greed), over $100 million was added to that portfolio. The strategy had not been running long enough to validate whether the six months of performance was genuine or lucky. The result: over $1 million in losses for the fund’s investors.
Disaster 2 — Despair while trading independently: A mean-reverting spread strategy between XLE (energy ETF) and crude oil futures (CL) was not reverting as expected. Instead of reducing the position according to Kelly principles, the position was stubbornly increased to almost $500,000, hoping the reversion would come. Eventually despair set in, the position was exited with close to a six-figure loss. Shortly after — as is always the case in such stories — the spread reverted exactly as the strategy predicted. The strategy was right. The position management was catastrophically wrong.
The lesson the book draws from both: Both disasters shared the same root cause: letting emotion override a systematic process. In the first case, greed — adding capital too quickly. In the second, the combination of overconfidence and then despair — increasing a losing position and then exiting at the worst possible moment.
The solution is not more discipline in the abstract — it is more concrete: start small, follow the Kelly formula mechanically, and build up position sizes gradually only as track record justifies it.
Nobel laureate Daniel Kahneman famously used this gamble to illustrate what he called “loss aversion bias”:
You are offered a fair coin flip. Tails: you lose $100. Heads: you win $110. Would you take it?
Most people refuse. Kahneman called this irrational — the expected gain is $5.
The book argues that the person refusing is actually correct.
Here is why. Suppose you start with $1,000 and keep playing this game repeatedly, adjusting stakes proportionally to your current wealth:
- Average gain per round: +0.5% of your wealth (the \$5 expected gain on \$1,000)
- Standard deviation per round: 10.5% of your wealth (each outcome, +\$110 or −\$100, is ±\$105 from the \$5 mean on \$1,000)
- Compounded growth rate: $g = 0.5\% - \frac{(10.5\%)^2}{2} \approx -0.05\%$ per round

You are losing money on average, even though the expected gain is positive. The variance is large enough that the compounding math works against you.
The key insight: the standard economic argument for taking this bet assumes you have infinite capital and can play many games simultaneously. In reality, you have one bankroll and must play rounds in sequence — and if you go broke, you stop playing forever. From that perspective (the “time series view”), refusing the bet is the mathematically correct decision.
The takeaway for traders: focusing purely on expected return while ignoring variance is a mistake. Variance destroys compounded wealth. The Kelly formula captures this precisely — it maximises compounded growth, not just expected return, which is why it penalises high-volatility strategies.
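This can be verified with the exact geometric mean, assuming stakes scale with wealth so each round is +11% or −10% of the current bankroll:

```python
import math

# Proportional stakes: heads +11%, tails -10% (the $110/$100 flip on $1,000)
g = math.sqrt(1.11 * 0.90) - 1     # geometric growth per round, ~ -0.05%
m = (0.11 - 0.10) / 2              # arithmetic mean return, +0.5%
```

The arithmetic mean is positive, but the compounded (geometric) growth rate is negative, which is exactly the time-series argument for refusing the bet.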
Definitions for key terms.
| Cause | Type it creates | Example |
|---|---|---|
| News diffusing slowly to investors | Momentum | Earnings beat — price drifts up for weeks as more investors react |
| A large institution executing a big order gradually | Momentum | Fund buying 10 million shares over several days — price trends up |
| Herd behaviour — following what others are doing | Momentum | Meme stocks (GameStop, 2021) — price goes to irrational extremes |
| Liquidity events — forced selling for unrelated reasons | Mean reversion | Fund forced to liquidate — price overshoots, then recovers |
| Two fundamentally linked stocks temporarily diverge | Mean reversion | GLD and GDX — gold ETF and gold miners drift apart, then reconnect |
The base strategy: Trade GLD (gold ETF) based on its relationship with GDX (gold miners ETF). Every minute, compute the spread between GLD and GDX (weighted by a hedge ratio). When the spread is unusually low (more than X standard deviations below its exponential moving average), buy GLD. When it is unusually high, short GLD. Exit when the spread returns to near-normal.
Three adjustable parameters:
- `GDX_weight` — how many GDX shares’ worth to compare against each GLD share (tested from 2 to 4)
- `entry_threshold` — how far below normal the spread must be before entering (tested from 0.2 to 5 standard deviations)
- `lookback` — how many minutes of history to use for the moving average (tested from 30 to 720 minutes)

Standard approach (Unconditional Parameter Optimization): Pick the one best combination on training data. Use those same settings forever on the test data.
CPO approach (Conditional Parameter Optimization): Every day after market close, run 400 combinations of the three parameters through a machine learning model trained to predict each combination’s next-day return. Select the combination predicted to work best tomorrow. Use those settings the next day only — then repeat.
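The unconditional approach is a plain grid search. The sketch below uses a toy z-score backtest on a synthetic spread series (not the book's actual strategy or data); CPO would replace the final `max(...)` selection with a daily ML prediction of each combination's next-day return.

```python
import itertools
import random
import statistics

def backtest_sharpe(spread, lookback, entry):
    """Toy z-score mean-reversion backtest on a spread series; per-period Sharpe."""
    pnl = []
    for t in range(lookback, len(spread) - 1):
        window = spread[t - lookback:t]
        mu, sd = statistics.mean(window), statistics.stdev(window)
        z = (spread[t] - mu) / sd if sd > 0 else 0.0
        pos = 1 if z < -entry else (-1 if z > entry else 0)
        pnl.append(pos * (spread[t + 1] - spread[t]))
    sd = statistics.stdev(pnl)
    return statistics.mean(pnl) / sd if sd > 0 else 0.0

def unconditional_optimize(train_spread, lookbacks, entries):
    """Pick the single best (lookback, entry) on training data; use it unchanged."""
    return max(itertools.product(lookbacks, entries),
               key=lambda p: backtest_sharpe(train_spread, *p))

# Synthetic mean-reverting spread (AR(1)) as a stand-in for the GLD-GDX spread
random.seed(0)
spread_series, v = [], 0.0
for _ in range(300):
    v = 0.5 * v + random.gauss(0, 1)
    spread_series.append(v)

best_params = unconditional_optimize(spread_series, [20, 50], [0.5, 2.0])
```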
Out-of-sample test set results (last 3 years ending December 31, 2020):
| Metric | Standard (unconditional) | CPO (conditional) |
|---|---|---|
| Annual return | 17.29% | 19.77% |
| Sharpe ratio | 1.947 | 2.325 |
| Calmar ratio (return ÷ max drawdown) | 0.984 | 1.454 |
| 3-year cumulative return | 73% | 83% |
Every performance metric improved with CPO. The machine learning step adds roughly 2-3% annual return and meaningfully improves the risk-adjusted profile.
The PredictNow.ai API (predictnow.ai) provides the ML prediction service used in this example. Sample code is available in the book.
| Property | Correlation | Cointegration |
|---|---|---|
| Measures | Whether daily returns move together | Whether the price relationship is stable long-term |
| Time horizon | Short-term (day by day) | Long-term (months to years) |
| What it guarantees | Nothing about where prices end up | That the weighted spread won’t drift away forever |
| Useful for | Risk management, beta hedging | Pair trading — finding mean-reverting spreads |
| KO vs. PEP | Correlation = 0.48 (statistically significant) | Not cointegrated — prices drift apart |
| GLD vs. GDX | Some correlation | Cointegrated — spread stays stationary |
The test: Use an Augmented Dickey-Fuller (ADF) statistical test to check whether GLD and GDX are cointegrated. If they are, find the hedge ratio — how many GDX shares to short per share of GLD to create a stationary spread.
MATLAB result:
CADF t-statistic: -3.18
5% critical value: -3.38
10% critical value: -3.08
→ t-stat is between these two: >90% probability of cointegration
Hedge ratio (β): 1.6766
Spread = GLD − 1.6766 × GDX (this spread is stationary — see Figure 7.2)
Python result: ADF t-statistic = −2.4, which does not reach even the 90% critical value — suggesting no cointegration. However, this contradicts both the MATLAB and R results, and is likely a library accuracy issue.
R result:
CADF t-statistic: -3.24
p-value: 0.005
→ Reject the null hypothesis of no cointegration at 99.5% confidence
Hedge ratio: 1.631
GLD and GDX are cointegrated. The stationary spread is shown in Figure 7.2 — it bounces around a stable mean without drifting away.
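The two-step procedure (estimate a hedge ratio by regression, then test the spread for stationarity) can be sketched on synthetic data. This is a simplified illustration: a real ADF/CADF test, as used above, adds lagged difference terms and compares the t-statistic against critical values; here we only check the sign of the core Dickey-Fuller regression slope.

```python
import random

def ols_slope(x, y):
    """Simple OLS slope of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

random.seed(42)
# Synthetic cointegrated pair: 'gdx' is a random walk, 'gld' = 1.6*gdx + stationary noise
gdx, level = [], 50.0
for _ in range(2000):
    level += random.gauss(0, 0.5)
    gdx.append(level)
gld = [1.6 * g + random.gauss(0, 1.0) for g in gdx]

beta = ols_slope(gdx, gld)                         # estimated hedge ratio, ~1.6
spread = [a - beta * b for a, b in zip(gld, gdx)]  # should be stationary

# Dickey-Fuller core: regress spread changes on the lagged spread level.
# A clearly negative slope indicates mean reversion.
dz = [spread[t] - spread[t - 1] for t in range(1, len(spread))]
df_slope = ols_slope(spread[:-1], dz)
```

Running the same procedure on two series that are merely correlated (like KO and PEP below) would yield a `df_slope` close to zero: the spread drifts instead of reverting.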
Coca-Cola (KO) and PepsiCo (PEP) seem like a natural pair — same industry, same products, same customer base. Most pair traders would assume they are cointegrated.
Cointegration test result:
ADF t-statistic: -2.14
10% critical value: -3.038
→ t-stat is ABOVE the critical value: less than 90% probability of cointegration
Conclusion: KO and PEP are NOT reliably cointegrated
Correlation test result:
Daily return correlation: 0.4849
P-value: effectively 0
→ They are highly and significantly correlated on a daily basis
So KO and PEP move in the same direction most days (correlated), but their prices can and do drift apart indefinitely over long periods (not cointegrated). This means a pair trade between KO and PEP has no mathematical guarantee of mean-reverting — the spread could just keep widening.
Figure 7.3 shows this: the KO-PEP spread is clearly non-stationary, with no tendency to return to a fixed mean.
| Factor | What it measures | Example exposure |
|---|---|---|
| Market (beta) | How much does this stock move when the whole market moves? | Beta = 1.5 means the stock moves 1.5× the market |
| SMB (small minus big) | Does this stock behave more like small-cap or large-cap? | A micro-cap stock has positive SMB exposure |
| HML (high minus low) | Is this a value stock (cheap) or growth stock (expensive)? | A stock with a low P/E ratio has positive HML exposure |
The approach: Instead of using named factors (market, SMB, HML), use PCA to extract 5 statistical factors from the daily returns of all S&P 600 small-cap stocks over the past 252 trading days. Assume these factor returns have momentum — they will remain roughly the same tomorrow as they are today. Based on that assumption, predict each stock’s next-day return. Buy the 50 stocks with the highest predicted return and short the 50 with the lowest.
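A compact sketch of the pipeline, on synthetic data: the returns, universe size (60 stocks instead of the S&P 600), and decile sizes (6 instead of 50) are placeholder assumptions, but the mechanics (PCA on the covariance of 252 days of returns, factor-momentum prediction, long/short ranking) follow the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_stocks, n_factors = 252, 60, 5     # small stand-in for the S&P 600 universe
R = rng.standard_normal((n_days, n_stocks)) * 0.01   # synthetic daily returns

# PCA: top-5 eigenvectors of the return covariance matrix are the factor loadings
C = np.cov(R, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)         # eigenvalues in ascending order
components = eigvecs[:, -n_factors:]         # columns = top-5 factor loadings

# Today's factor returns; factor momentum assumption: same factor returns tomorrow
factor_returns_today = R[-1] @ components
predicted = components @ factor_returns_today    # predicted next-day stock returns

# Long the highest-predicted stocks, short the lowest-predicted
order = np.argsort(predicted)
longs, shorts = order[-6:], order[:6]
```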
Results (no transaction costs):
The strategy generates positive returns, but they are modest — and this is before transaction costs. The difference in results between MATLAB and Python/R is due to rounding differences in the PCA implementation, not a fundamental disagreement.
| Strategy type | Normal exit | Alternative exit | Stop loss? |
|---|---|---|---|
| Mean-reversion | Fixed holding period = half-life of mean reversion | Target price = historical mean (µ) | ❌ No — exits at the worst moment |
| Momentum | Fixed holding period (backtested) | Opposite entry signal fires | ✅ Yes — a price reversal signals the trend has ended |
The Ornstein-Uhlenbeck (OU) formula describes mean-reverting processes mathematically:
dz(t) = −θ × (z(t) − μ) × dt + dW
Where:
z(t) = the spread at time t
μ = the long-run average value of the spread
θ = the speed of mean reversion (higher = faster reversion)
dW = random noise (Gaussian)
Half-life = ln(2) / θ
To find θ: run a linear regression of daily changes in the spread
(dz) against the spread itself (z − mean(z)).
The slope of this regression is −θ.
Using the GLD-GDX spread from Example 7.2 (Spread = GLD − 1.67 × GDX):
MATLAB steps:
- Compute one-period changes: dz = z(t) − z(t−1)
- Regress dz against z(t−1) − mean(z); the slope of this regression is −θ
- Half-life = ln(2) / θ

results = ols(dz, prevz - mean(prevz));
theta = results.beta;
halflife = -log(2) / theta
% halflife = 7.84 trading days
Python result: half-life = 7.84 trading days
R result: half-life = 7.84 trading days — all three agree.
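The same regression in pure Python, run here on a synthetic OU series with known θ = 0.1 (true half-life ln(2)/0.1 ≈ 6.9 days) standing in for the GLD-GDX spread:

```python
import math
import random

def half_life(z):
    """Estimate mean-reversion half-life: regress dz on lagged demeaned z; slope = -theta."""
    mz = sum(z) / len(z)
    x = [v - mz for v in z[:-1]]                       # lagged, demeaned spread
    dz = [z[t] - z[t - 1] for t in range(1, len(z))]   # one-period changes
    slope = sum(a * b for a, b in zip(x, dz)) / sum(a * a for a in x)
    theta = -slope                                     # slope is -theta, so negate
    return math.log(2) / theta

# Synthetic OU process: dz = -theta*z*dt + noise, with theta = 0.1, dt = 1
random.seed(1)
z, v = [0.0], 0.0
for _ in range(5000):
    v += -0.1 * v + random.gauss(0, 0.5)
    z.append(v)
```

The estimate recovers a half-life near the true ≈6.9 periods, which is the same calculation that yields 7.84 days on the real spread.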
This means: after entering a spread trade, you should expect to hold it for roughly 7–10 trading days before the spread has reverted about halfway back to its mean. This is your natural holding period estimate — derived mathematically from the data, not guessed from a backtest of individual trades.
The strategy: From S&P 600 small-cap stocks, identify those with the worst returns in December. Buy them at the end of December, sell them at the end of January. Rationale: tax-loss selling pressure in December depresses these stocks artificially; when the pressure lifts in January, prices recover.
This worked well in historical backtests covering many years before 2006. But by 2006–2007, enough traders knew about it that the trade had been arbitraged away. It worked again in January 2008 — but that was during an unusual period of extreme market turmoil (the Société Générale scandal and a surprise Fed rate cut) that benefited mean-reversion strategies broadly, not specifically the January effect.
The strategy: At each month-end, buy the stocks from the S&P 500 that had the best returns in the same month one year earlier, and short those with the worst returns in that same month one year earlier. The idea: whatever drove returns in that calendar month last year (seasonal factors, sector rotations) may repeat this year.
Out-of-sample results:
The trade:
Annual P&L (selected years, post-2007 are out-of-sample):
| Year | P&L ($) | Max drawdown ($) |
|---|---|---|
| 2007 (out-of-sample) | +4,322 | −5,279 |
| 2008 (out-of-sample) | +9,740 | −1,156 |
| 2009 (out-of-sample) | −890 | −4,167 |
| 2012 (out-of-sample) | −7,997 | −8,742 |
| 2015 (out-of-sample) | +8,539 | −1,753 |
Overall: Profitable in 19 of the 21 years shown, including many out-of-sample years. The two losing years were modest. The trade is economically motivated and has held up out-of-sample better than any equity seasonal strategy.
The trade:
Out-of-sample results (2007–2016): Mixed. The trade had large losses in 2009 (−$4,240), 2010 (−$8,360), and 2012 (−$7,180). Natural gas is extremely volatile — Amaranth Advisors lost $6 billion on natural gas trades in 2006.
| Feature | Daily strategy | High-frequency strategy |
|---|---|---|
| Holding period | Days to months | Seconds to minutes (never overnight) |
| Bets per day | 1–10 | 100 to 10,000+ |
| Typical Sharpe ratio | 1–3 | 3–10+ |
| Leverage possible | 2–10× | 10–100×+ |
| Main cost driver | Transaction costs | Transaction costs + latency |
| Backtesting reliability | High (with clean daily data) | Low (requires very specialised tick data) |
| Infrastructure needed | Standard PC, internet connection | Co-location, C++ code, low-latency feeds |
| Drawdown profile | Can have multi-month drawdowns | Very quick to go flat — risk is manageable |
| Independent trader feasibility | Achievable | Difficult but not impossible to work toward |
Definitions for every key term.