Backtesting through Cross Validation - jaeaehkim/trading_system_beta GitHub Wiki

Motivation

  • Narrow sense: historical simulation. Broad sense: simulation of scenarios (including ones that never occurred in the past).
    • Scenario simulation makes it possible to stress-test a strategy without being biased toward the single historical path.
  • Today, historical simulation has become practically synonymous with backtesting. The Walk-Forward and CPCV methods are introduced below.

The Walk-Forward Method

image

  • Concept: a historical simulation that answers "How would this strategy have performed had it been used in the past?"
    • Some variation is possible depending on how the training period is chosen.
  • Running Walk-Forward rigorously still requires accounting for data source knowledge, market microstructure, risk management, performance measurement, and so on. The emphasis placed on each can of course be adjusted to the strategy being run.
    • Advantages
      • It has a clear historical interpretation, so its performance can be reconciled with simulated (paper) trading. -> Important when monitoring at the production level.
      • If purging is implemented properly, the absence of information leakage is guaranteed. Embargo is also unnecessary, because the training set always precedes the test set (see: Cross Validation in Model).
    • Disadvantages
      • Only a single scenario is tested, and a single scenario is easy to overfit.
      • WF is presented as indicative of future performance, yet its results can be biased by the particular sequence of data points.
        • History itself could have produced a different data sequence given a few decisive choices, but WF draws results from that one sequence only and then picks the best among them, so stress testing against varied future conditions is badly lacking. If a check such as a Walk-Backward test (reversing the order of observations) does not produce comparably good performance, that is evidence of overfitting.
      • Early decisions are made after training on a smaller portion of the total sample. In particular, when training with the first point fixed (an expanding window), half of the strategy's decisions are made with no more than half of the full-period data. As in the figure above, the training size can instead be held constant across all windows, but then no window trains on much data.
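
The two Walk-Forward variants discussed above (first point fixed vs. constant training size) can be sketched as an index generator. This is a minimal illustration, not the wiki author's implementation: the function name `walk_forward_splits` and its parameters are hypothetical, and purging of overlapping labels is deliberately left out.

```python
import numpy as np

def walk_forward_splits(n_obs, n_test_blocks, min_train, rolling=False):
    """Yield (train_idx, test_idx) pairs in which train always precedes test.

    Hypothetical helper for illustration; purging is not applied here.
    """
    # The first `min_train` observations are a warm-up and are never tested.
    test_blocks = np.array_split(np.arange(min_train, n_obs), n_test_blocks)
    for test_idx in test_blocks:
        end = test_idx[0]                          # train stops right before the test block
        start = end - min_train if rolling else 0  # rolling: constant size, else expanding
        yield np.arange(start, end), test_idx

for train_idx, test_idx in walk_forward_splits(n_obs=100, n_test_blocks=4, min_train=20):
    print(len(train_idx), test_idx[0], test_idx[-1])
```

With `rolling=False` the early test blocks are predicted from much less data than the late ones, which is the disadvantage noted above; with `rolling=True` every block is predicted from only `min_train` observations.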

The Cross Validation Method

image

  • Investors want to know how a strategy would perform when faced with unprecedented stress scenarios such as the 2008 crisis, the dot-com bubble, the taper tantrum, or the 2015-16 China stock market turmoil.
    • Simple CV Method
      • Split the observations into two sets: train on data from 2009 to 2017 and test on 2008.
      • This method is not historically accurate. Its goal is not historical accuracy but to stress-test a strategy that knows nothing about 2008 against a 2008-like scenario.
    • Advantages
      • CV tests k alternative scenarios, only one of which corresponds to the historical sequence.
      • Every decision is made on a training set of the same size, so the amount of information used for training is the same.
      • It yields the longest possible OOS simulation, because unlike Walk-Forward there is no warm-up portion.
        • Concatenating the test (or validation) sets gives the longest single OOS simulation.
    • Disadvantages
      • As with Walk-Forward, only a single backtest path is simulated (it is not the historical one, but only one prediction is generated per observation).
      • It has no clear historical interpretation.
      • Without purging/embargo, information leakage can occur.
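
A rough sketch of the simple CV scheme: contiguous, unshuffled folds whose test sets concatenate into one OOS path covering every observation. The fixed `gap` dropped on both sides of each test fold is a simplified stand-in for purging and embargo (a proper purge depends on each label's time span), and `purged_kfold_splits` is a hypothetical name.

```python
import numpy as np

def purged_kfold_splits(n_obs, n_folds, gap):
    """Yield (train_idx, test_idx) for contiguous folds.

    Assumption: `gap` observations before and after the test fold are removed
    from train as a crude approximation of purge/embargo.
    """
    idx = np.arange(n_obs)
    for test_idx in np.array_split(idx, n_folds):
        lo, hi = test_idx[0] - gap, test_idx[-1] + gap
        train_idx = idx[(idx < lo) | (idx > hi)]   # train may lie on both sides of test
        yield train_idx, test_idx

splits = list(purged_kfold_splits(n_obs=100, n_folds=5, gap=3))
# Concatenating the test folds gives one OOS prediction per observation.
oos_path = np.concatenate([test_idx for _, test_idx in splits])
```

Unlike Walk-Forward, train indices can come after the test fold, which is why the result has no historical interpretation and why leakage control matters here.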

The Combinatorial Purged Cross-Validation(CPCV) Method

image

  • CPCV is a new method designed to address the main drawbacks of Walk-Forward and (simple) Cross-Validation.

Combinatorial Splits

  • image
    • T = number of data points, N = total number of groups (the T points are split into N groups), k = number of test groups (shown in red). For each case, N-k is the number of train groups.
  • image
    • Computing the number of backtest paths
    • k * Combination(N, N-k) is the total number of red (test) groups in the figure above. Dividing it by the total number of groups N gives the number of backtest paths.
    • In the figure above, joining only the cells labeled 1 forms one OOS simulation path; joining only the cells labeled 2 forms the second OOS simulation path.
    • The example above corresponds to N=6, k=2, pi=5.
  • Issue
    • When T data points are split into N groups, the N-th group may contain a different number of data points (unless the remainder is 0).
      • image
        • image
    • Each split trains on T * (1-k/N) data points (before any purging or embargo).
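
The counting rules above are easy to check numerically. The following is a small sketch using only the standard library; the helper names `cpcv_counts` and `train_fraction` are made up for this example.

```python
from math import comb

def cpcv_counts(n_groups, k_test):
    """Return (number of train/test splits, number of backtest paths)."""
    n_splits = comb(n_groups, n_groups - k_test)   # Combination(N, N-k)
    # k * n_splits test groups in total, spread evenly over the N groups.
    n_paths = k_test * n_splits // n_groups
    return n_splits, n_paths

def train_fraction(n_groups, k_test):
    """Share of the T data points used for training in each split, before purging."""
    return 1 - k_test / n_groups

print(cpcv_counts(6, 2))     # (15, 5)
print(train_fraction(6, 2))  # roughly 0.667
```

The integer division is exact because k * Combination(N, k) equals N * Combination(N-1, k-1).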

The CPCV Backtesting Algorithm

  1. Partition the T observations into N groups without shuffling.
    • The size of the N-th group follows the Issue above.
  2. Compute all possible train/test splits; in each split, N-k groups form the train set and k groups form the test set.
  3. Apply the purge/embargo process.
  4. Fit a model on each of the Combination(N, N-k) train sets, then run inference on the corresponding Combination(N, N-k) test sets.
  5. Arrange the test groups in sequence to form pi backtest paths.
  6. Compute a Sharpe ratio for each of the pi paths; together these give an empirical distribution of the SR.
    • One possible use is to take the average Sharpe ratio over all backtest paths as the estimate of the true Sharpe ratio.
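
Steps 2, 5 and 6 can be sketched as follows. This is an illustrative outline under stated assumptions, not a full implementation: model fitting and purge/embargo (steps 3 and 4) are omitted, the names `cpcv_paths` and `path_sharpes` are hypothetical, and `oos_returns` is assumed to hold the out-of-sample returns a fitted model produced for each (split, test group) pair.

```python
from itertools import combinations
import numpy as np

def cpcv_paths(n_groups, k_test):
    """Return (splits, paths); paths[p][g] is the split whose test prediction
    for group g is used in backtest path p."""
    splits = list(combinations(range(n_groups), k_test))  # test groups of each split
    n_paths = k_test * len(splits) // n_groups
    paths = [[None] * n_groups for _ in range(n_paths)]
    next_free = [0] * n_groups            # first path that still lacks group g
    for s, test_groups in enumerate(splits):
        for g in test_groups:
            paths[next_free[g]][g] = s
            next_free[g] += 1
    return splits, paths

def path_sharpes(paths, oos_returns):
    """Per-period (not annualized) Sharpe ratio of each backtest path.

    oos_returns[(s, g)]: 1-D array of OOS returns from split s on test group g.
    """
    sharpes = []
    for path in paths:
        r = np.concatenate([oos_returns[(s, g)] for g, s in enumerate(path)])
        sharpes.append(r.mean() / r.std(ddof=1))
    return np.array(sharpes)

splits, paths = cpcv_paths(n_groups=6, k_test=2)
```

Each (split, test group) pair is used in exactly one path, so every path is a complete OOS sequence over all N groups.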

A Few Examples

  • With k=1, CPCV is identical to the CV method.
  • With k=2, N-1 backtest paths are formed.
  • If more paths are needed, k can be increased up to N/2, keeping in mind how much this shrinks the train set. That said, k=2 is generally considered sufficient.
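
These cases follow directly from the path-count rule stated earlier (k * Combination(N, N-k) divided by N). Writing the path count, called pi on this page, as $\varphi[N,k]$ (a notation chosen here for illustration):

$$
\varphi[N,k] = \frac{k}{N}\binom{N}{N-k}, \qquad
\varphi[N,1] = \frac{1}{N}\binom{N}{N-1} = 1, \qquad
\varphi[N,2] = \frac{2}{N}\cdot\frac{N(N-1)}{2} = N-1 .
$$

For N=6 and k=2 this gives 5 paths, matching the earlier example, while each split trains on a fraction 1 - 2/6 = 2/3 of the data.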

How Combinatorial Purged Cross-Validation Addresses Backtest Overfitting

  • A deductive argument for why CPCV leads to fewer false discoveries than CV or WF (Bailey, D., J. Borwein, M. López de Prado, and J. Zhu (2014))
  • Intuitive explanation
    • WF and CV each produce a single backtest path. N Sharpe ratios are computed for N models (here N is the number of candidate models, not the number of groups), and the model N_star with the maximum Sharpe ratio is the one finally selected.
      • Each model's backtest may be overfit, and choosing the maximum among them makes overfitting very likely.
    • With CPCV, even a single model has pi Sharpe ratios, one per backtest path. Defining their mean as that model's Sharpe Ratio Average filters out cases where the Sharpe ratio happened to be high on one path by chance. In total there are pi * N Sharpe ratios and N Sharpe Ratio Averages.
    • Because each of the N Sharpe ratios from CPCV is already an average, their distribution has lower variance than that of the N Sharpe ratios from WF or CV. In other words, CPCV reduces the variance of the backtest and so lowers the probability of selecting an overfit backtest.
      • So when the Sharpe Ratio Average is used, selecting the strategy with the maximum value carries much less risk of overfitting.
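
The variance reduction can be made explicit with the usual variance-of-a-mean argument. As a sketch, let $\varphi$ be the number of paths (pi on this page), $\sigma_i^2$ the variance of model $i$'s Sharpe ratios across paths, and $\bar{\rho}_i$ their average pairwise correlation; these symbols are introduced here for illustration. The variance of the Sharpe Ratio Average $\mu_i$ is then

$$
\sigma^2[\mu_i]
= \varphi^{-2}\left(\varphi\,\sigma_i^2 + \varphi(\varphi-1)\,\sigma_i^2\,\bar{\rho}_i\right)
= \varphi^{-1}\,\sigma_i^2\left(1 + (\varphi-1)\,\bar{\rho}_i\right).
$$

This is strictly below $\sigma_i^2$ whenever $\bar{\rho}_i < 1$, and approaches $\sigma_i^2/\varphi$ as the paths become uncorrelated. The benefit therefore depends on how correlated the paths are; averaging does not remove selection bias entirely.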