Machine Learning For Trading - penny4860/study-note GitHub Wiki

1. Recap

Summary

  • Data files : https://finance.yahoo.com
  • Data analysis example : finance.py
  • Rolling Mean
  • How to compute correlation : corr.py
    1. Download a CSV file for each asset : https://finance.yahoo.com
    2. Read the files into a df for a given period
    3. Compute daily returns
    4. Compute the correlation
      • returns.corr(method="pearson")

Questions

  • What does adjusted close correct for, relative to the close price?

2. Details

Lesson 1-2. Working with multiple stocks

  • Create a df indexed by date : pd.date_range(s, e)
  • read / join / slicing / plotting
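
The read / join / slicing steps can be sketched roughly as below. This is a minimal illustration, not the contents of finance.py: the two small in-memory tables stand in for per-symbol CSV files, and the symbol names, dates and prices are made-up assumptions.

```python
import pandas as pd

# Hypothetical per-symbol price tables, standing in for CSV files read with pd.read_csv.
spy = pd.DataFrame(
    {"SPY": [100.0, 101.0, 102.0]},
    index=pd.to_datetime(["2010-01-04", "2010-01-05", "2010-01-06"]),
)
goog = pd.DataFrame(
    {"GOOG": [300.0, 303.0]},
    index=pd.to_datetime(["2010-01-04", "2010-01-06"]),
)

# Empty frame whose index is every calendar day in the range.
dates = pd.date_range("2010-01-01", "2010-01-07")
df = pd.DataFrame(index=dates)

# Left-join each symbol onto the date index, then drop the days on which
# the reference symbol (SPY here) has no price, i.e. non-trading days.
df = df.join(spy).join(goog)
df = df.dropna(subset=["SPY"])

# Slice by date range (rows) and by symbol (columns).
window = df.loc["2010-01-05":"2010-01-06", ["GOOG", "SPY"]]
```

GOOG has no row for 2010-01-05 in this made-up data, so the joined frame holds NaN there; handling such gaps is the subject of Lesson 1-5.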

Lesson 1-4. Statistical Analysis of Time Series

  • Global Stat.
  • Rolling Stat.
    • Slide a window over a fixed period (e.g. one week) and average the values inside it.
    • Statistics are computed from the values inside the window
      • Removes noise
      • smooth / lag
    • The resulting curve lags behind, compared with the global mean.
      • Can be used as a trade signal.
      • Crossing points of the global mean and the rolling mean
      • (global mean - rolling mean) > 2 * standard deviation
  • Daily Returns
    • Rate of return over one day
    • daily_ret[t] = (price[t] / price[t-1]) - 1
  • Cumulative Returns
    • Cumulative rate of return over a given period
    • cum_ret[t] = (price[t] / price[0]) - 1
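
The rolling statistics and the two return formulas in this lesson can be sketched with pandas as follows. The price series, the 20-sample window and all variable names are illustrative assumptions. Note also that the band test here compares the price itself (rather than the global mean) with the rolling mean, in the style of Bollinger Bands; that substitution is an assumption, not something these notes state.

```python
import pandas as pd

# Hypothetical closing prices: flat at 10, then a jump to 20 on the last day.
prices = pd.Series(
    [10.0] * 21 + [20.0],
    index=pd.date_range("2010-01-01", periods=22),
)

# Rolling statistics over an assumed 20-sample window.
# The first 19 entries are NaN because the window is not yet full.
rm = prices.rolling(window=20).mean()
rstd = prices.rolling(window=20).std()

# Bands two rolling standard deviations away from the rolling mean.
upper = rm + 2 * rstd
lower = rm - 2 * rstd

# Days on which the price leaves the band (a candidate trade signal).
outside = (prices > upper) | (prices < lower)

# daily_ret[t] = price[t] / price[t-1] - 1
# Day 0 has no previous price, so it is set to 0 by convention.
daily_ret = prices / prices.shift(1) - 1
daily_ret.iloc[0] = 0.0

# cum_ret[t] = price[t] / price[0] - 1
cum_ret = prices / prices.iloc[0] - 1
```

One design note on the window length: when the window of n samples includes the current day, a value can sit at most (n-1)/sqrt(n) sample standard deviations from the window mean. For n = 5 that bound is about 1.79, so a 2-standard-deviation test needs a window of at least 6 samples to ever fire.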

Lesson 1-5. Incomplete Data

  • Why data is missing
    1. There is no single price
      • There are multiple exchanges.
      • In practice, no one price exists.
    2. Not every stock trades every day
  • Fill in missing data with df.fillna().
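
A minimal sketch of filling gaps is shown below. The notes only mention df.fillna(); the forward-fill-then-backward-fill order used here is an assumption (a common convention, since forward filling never uses a future price), and the data is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical prices with gaps: A starts late and skips a day, B skips two days.
prices = pd.DataFrame(
    {"A": [np.nan, 10.0, np.nan, 12.0], "B": [5.0, np.nan, np.nan, 6.0]},
    index=pd.date_range("2010-01-01", periods=4),
)

# Forward fill first: carry the last known price into later gaps.
# Backward fill second: only the leading gap (before the first price) remains to be filled.
filled = prices.ffill().bfill()
```

After the two passes, column A reads 10, 10, 10, 12 and column B reads 5, 5, 5, 6.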

Lesson 1-7. Histograms and Scatter Plots

  • Histograms of Daily Returns
    • Horizontal axis : daily return (e.g. -1.0% ~ +1.0%)
    • Vertical axis : count
  • Correlation
    • Measure of how tightly dots fit the line
    • The slope of the linear fit is not the correlation.
    • Implementation
      1. Download a CSV file for each asset
      2. Read the files into a df for a given period
      3. Compute daily returns
      4. Compute the correlation
        • returns.corr(method="pearson")
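
The histogram axes and the difference between slope and correlation can be illustrated with a small sketch. This is not the contents of corr.py: the return values are made up, the asset names A, B and C are hypothetical, and np.histogram is used here only to obtain the counts without plotting.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns: B moves exactly twice as much as A, C moves opposite to A.
a = pd.Series([-0.009, -0.004, -0.001, 0.001, 0.002, 0.003, 0.008])
returns = pd.DataFrame({"A": a, "B": 2 * a, "C": -a})

# Histogram of A: bin edges lie on the horizontal axis (-1.0% to +1.0%),
# counts are what the vertical axis shows.
counts, edges = np.histogram(returns["A"], bins=4, range=(-0.01, 0.01))

# Pairwise Pearson correlation of the daily returns.
corr = returns.corr(method="pearson")

# Slope of the least-squares line B = slope * A + intercept.
slope, intercept = np.polyfit(returns["A"], returns["B"], 1)
```

In this made-up data the correlation between A and B is 1.0 while the fitted slope is 2.0: the slope says how much B moves per unit move in A, and the correlation says how tightly the points follow the line.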

Lesson 1-7. Portfolio Statistics

  • Computing the daily portfolio value
    1. Input
      • start_val
      • start_date
      • end_date
      • symbols
      • allocs = [0.4, 0.4, 0.1, 0.1]
    2. Flow
      1. Normalize prices (the per-asset historical df).
        • normed = prices / prices.iloc[0]
      2. Multiply by the allocation ratios.
        • alloced = normed * allocs
      3. Multiply by the initial capital.
        • pos_vals = alloced * start_val
      4. Compute the portfolio value
        • port_val = pos_vals.sum(axis=1)
  • Portfolio Statistics
    • Daily Returns
    • Cum. Returns : (port_val.iloc[-1] / port_val.iloc[0]) - 1.0
    • avg. daily returns : daily_returns.mean()
    • std. daily returns : daily_returns.std()
    • Sharpe ratio : K * daily_returns.mean() / daily_returns.std()
      • K : sqrt(number of samples per year)
        • Daily sampling : sqrt(252)
        • Weekly sampling : sqrt(52)
        • Monthly sampling : sqrt(12)
      • With a shorter sampling period the unscaled SR comes out smaller, because the mean return per sample shrinks faster than its standard deviation. K corrects for this.
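
The flow and statistics above can be sketched end to end as follows. The symbols and prices are made up for illustration, the helper name sharpe_ratio is hypothetical, and the risk-free rate is assumed to be zero because the formula in these notes has no such term.

```python
import math
import pandas as pd

start_val = 1_000_000
symbols = ["SPY", "XOM", "GOOG", "GLD"]  # illustrative symbols
allocs = [0.4, 0.4, 0.1, 0.1]

# Hypothetical prices: one column per symbol, one row per trading day.
prices = pd.DataFrame(
    [
        [100.0, 50.0, 200.0, 10.0],
        [110.0, 50.0, 100.0, 12.0],
        [110.0, 55.0, 100.0, 12.0],
    ],
    index=pd.date_range("2010-01-04", periods=3),
    columns=symbols,
)

normed = prices / prices.iloc[0]   # every column starts at 1.0
alloced = normed * allocs          # weight each column by its allocation
pos_vals = alloced * start_val     # value of each position per day
port_val = pos_vals.sum(axis=1)    # total portfolio value per day

# Daily returns of the portfolio; the undefined first day is dropped.
daily_returns = (port_val / port_val.shift(1) - 1).iloc[1:]
cum_ret = port_val.iloc[-1] / port_val.iloc[0] - 1.0


def sharpe_ratio(returns, samples_per_year):
    """Annualized Sharpe ratio, assuming a risk-free rate of zero."""
    k = math.sqrt(samples_per_year)
    return k * returns.mean() / returns.std()


sr_daily = sharpe_ratio(daily_returns, 252)
```

With these made-up prices the portfolio is worth 1,000,000, then 1,010,000, then 1,050,000, so the cumulative return is 5%. Passing 52 or 12 as samples_per_year would be the matching choice for weekly or monthly return series.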