time series - jjin-choi/study_note GitHub Wiki

Autoregressive Integrated Moving Average (ARIMA)

  • AR (Autoregression) + MA (Moving Average)
    • AR : autocorrelation. Earlier observations of the series influence its later observations.
    • MA : forecasts future values from past forecast errors.
  • Assumes stationarity (the mean and variance stay constant over time).
  • When the series is not stationary:
    • If the size of the fluctuations is not constant → log transform
    • If a trend or seasonality is present → differencing (y_t - y_(t-1))
  • AutoCorrelation Function (ACF)
    • The sequence of autocorrelations of the series as a function of lag. As the lag grows, the ACF approaches 0.
    • For a stationary series the ACF drops to 0 quickly; for a non-stationary series it decays slowly and often keeps large positive values.
  • Augmented Dickey-Fuller (ADF) test
    • H0 : the series is not stationary (it has a unit root).
    • H1 : the series is stationary.
    • If the p-value exceeds 0.05, H0 cannot be rejected, so the series is treated as non-stationary.
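
The log transform, differencing, and ACF notes above can be tried on a toy series. This is a minimal sketch using only NumPy; the simulated series and the helper `sample_acf` are assumptions made for illustration, not part of any library.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    return np.array([
        np.dot(centered[: len(x) - k], centered[k:]) / denom
        for k in range(max_lag + 1)
    ])

rng = np.random.default_rng(0)

# Hypothetical non-stationary series: exponential growth with accumulated noise
t = np.arange(300)
y = np.exp(0.01 * t + 0.05 * rng.standard_normal(300).cumsum())

log_y = np.log(y)        # log transform evens out the size of the fluctuations
diff_y = np.diff(log_y)  # first difference y_t - y_(t-1) removes the trend

acf_raw = sample_acf(log_y, 20)    # decays slowly, stays large and positive
acf_diff = sample_acf(diff_y, 20)  # falls close to 0 right after lag 0

print("lag-10 ACF before differencing:", round(acf_raw[10], 3))
print("lag-10 ACF after differencing: ", round(acf_diff[10], 3))
```

Comparing the two printed values shows the pattern described for the ACF: the trending series keeps a high autocorrelation at lag 10, while the differenced series does not.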

Prophet

We are, in effect, framing the forecasting problem as a curve-fitting exercise, which is inherently different from time series models that explicitly account for the temporal dependence structure in the data. While we give up some important inferential advantages of using a generative model such as an ARIMA, this formulation provides a number of practical advantages:

  • Flexibility: We can easily accommodate seasonality with multiple periods and let the analyst make different assumptions about trends.
  • Unlike with ARIMA models, the measurements do not need to be regularly spaced, and we do not need to interpolate missing values e.g. from removing outliers.
  • Fitting is very fast, allowing the analyst to interactively explore many model specifications, for example in a Shiny application (Chang et al. 2015).
  • The forecasting model has easily interpretable parameters that can be changed by the analyst to impose assumptions on the forecast. Moreover, analysts typically do have experience with regression and are easily able to extend the model to include new components.

Source: PeerJ Preprints | https://doi.org/10.7287/peerj.preprints.3190v2 | CC BY 4.0 Open Access | rec: 27 Sep 2017, publ: 27 Sep 2017
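
To make the curve-fitting framing concrete, here is a simplified sketch that fits a piecewise-linear trend plus Fourier seasonality by ordinary least squares on irregularly spaced observations. It is not Prophet's actual implementation (which places priors on the parameters and selects changepoints automatically); the changepoint location, the weekly period, and the helper `design_matrix` are all assumptions chosen for illustration.

```python
import numpy as np

def design_matrix(t, changepoint, period, order=2):
    """Columns: intercept, slope, slope change after `changepoint`, Fourier terms."""
    cols = [np.ones_like(t), t, np.maximum(t - changepoint, 0.0)]
    for n in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * n * t / period))
        cols.append(np.cos(2 * np.pi * n * t / period))
    return np.column_stack(cols)

rng = np.random.default_rng(1)

# Irregularly spaced observation times in days: no interpolation is needed
t_obs = np.sort(rng.uniform(0, 200, size=150))
truth = (
    0.05 * t_obs
    - 0.08 * np.maximum(t_obs - 120, 0)       # trend bends at day 120
    + 2.0 * np.sin(2 * np.pi * t_obs / 7)     # weekly seasonality
)
y_obs = truth + 0.3 * rng.standard_normal(t_obs.size)

# Fitting is a single regression, so each coefficient is directly interpretable
X = design_matrix(t_obs, changepoint=120.0, period=7.0)
coef, *_ = np.linalg.lstsq(X, y_obs, rcond=None)

# Forecasting means evaluating the fitted curve at future times
t_future = np.arange(200.0, 214.0)
forecast = design_matrix(t_future, changepoint=120.0, period=7.0) @ coef

print("slope:", round(coef[1], 3), "slope change:", round(coef[2], 3))
```

Because the model is a function of time rather than of previous observations, gaps in `t_obs` do not matter, and adding a component amounts to adding columns to the design matrix.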

  • Finally, as a bonus, the Facebook researchers also explain how to adjust the model:
    • If the model looks worse than the baseline model, adjust the trend, seasonality, etc.
    • If forecast accuracy drops on specific dates, remove the outliers.
    • If forecast accuracy drops at a specific cutoff (e.g. year end), adding a changepoint is an option.
  • For reference, the option to show full cell contents of a dataframe : pd.set_option('display.max_colwidth', None) (pandas 1.0 deprecated the older value -1 in favor of None)
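
The outlier tip above can be sketched with pandas. Prophet's documentation suggests setting outlier values to NA rather than interpolating them; the toy data and the 5 x MAD threshold below are assumptions for illustration, and only the `ds` / `y` column names follow Prophet's input convention.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series in the ds / y layout
df = pd.DataFrame({
    "ds": pd.date_range("2023-01-01", periods=10, freq="D"),
    "y": [10.0, 11.0, 10.5, 95.0, 11.2, 10.8, 11.5, 10.9, 11.1, 11.3],
})

# Flag points far from the median (5 median absolute deviations, an arbitrary choice)
median = df["y"].median()
mad = (df["y"] - median).abs().median()
is_outlier = (df["y"] - median).abs() > 5 * mad

# Mask the outliers instead of interpolating; the rows and dates are kept
cleaned = df.copy()
cleaned.loc[is_outlier, "y"] = np.nan

# Temporarily lift the cell-width limit while inspecting the result
with pd.option_context("display.max_colwidth", None):
    print(cleaned)
```

`pd.option_context` applies the display setting only inside the `with` block, which avoids changing the option globally as `pd.set_option` does.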

multi step : https://coccocbox.tistory.com/5

LSTM cross validation : https://shyu0522.tistory.com/7

  • attention ?