Time Series Forecasting - clizarraga-UAD7/Workshops GitHub Wiki

An Introduction to Time Series Forecasting.

What is Time Series Forecasting?

Time series forecasting is the process of using historical, time-ordered data to predict future values.

It is used in fields such as business, finance, government, healthcare, and weather to anticipate trends and support better decisions. Businesses, investors, governments, healthcare providers, and individuals use forecasts to plan production, investments, policy, resource allocation, and safety measures.

In Time Series Forecasting, which is the better choice: Statistical Models or Machine Learning Models?

Both statistical and machine learning models have their own advantages and disadvantages for time series forecasting.

Statistical models, such as ARIMA, are typically easier to understand and interpret than machine learning models, and they are usually cheaper to train and deploy. However, they can be less accurate when the data contain non-linear patterns.
Machine learning models can capture non-linear patterns and are often more accurate on such data. However, they are usually harder to interpret and more computationally expensive to train and deploy.

The best type of model for a particular application will depend on the specific characteristics of the data and the requirements of the forecaster.

Ultimately, the best way to choose between statistical and machine learning models for time series forecasting is to experiment with both types of models and compare their accuracy on held-out data from your specific application.
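
One way to run such an experiment is to hold out the last few observations and score every candidate on them. The sketch below is only an illustration: the series is made up, and the two candidates (a naive forecast and simple exponential smoothing) are small stand-ins chosen to keep the example dependency-free. The helper names `compare_models`, `naive_forecast`, and `ses_forecast` are hypothetical; a machine learning model could be plugged in as another function with the same `(train, horizon)` signature.

```python
def naive_forecast(train, horizon):
    """Repeat the last observed value for every future step."""
    return [train[-1]] * horizon


def ses_forecast(train, horizon, alpha=0.5):
    """Simple exponential smoothing: flat forecast at the final smoothed level."""
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def compare_models(series, horizon, models):
    """Hold out the last `horizon` points and score each model on them."""
    train, test = series[:-horizon], series[-horizon:]
    return {name: mae(test, fn(train, horizon)) for name, fn in models.items()}


# Made-up series that fluctuates around a constant level.
series = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10]

scores = compare_models(
    series,
    horizon=2,
    models={"naive": naive_forecast, "ses": ses_forecast},
)
best = min(scores, key=scores.get)  # model with the lowest held-out MAE
```

On this particular series the smoothed forecast scores lower than the naive one, but the ranking can change with different data, a different horizon, or a different metric, which is why the comparison has to be repeated for each application.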

Most popular libraries for doing Time Series Forecasting using Python.

There are many popular Python libraries for time series forecasting, covering both statistical and machine learning models. Some of the most popular include:

Darts: Darts is a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to deep neural networks.
DeepAR: DeepAR is a probabilistic forecasting algorithm developed by Amazon, based on autoregressive recurrent neural networks. It is available in Amazon SageMaker and in the open-source GluonTS library, and it learns a single model from many related time series.
pmdarima: pmdarima is a Python library for statistical analysis of time series data. It is built around the ARIMA model, brings the functionality of R's auto.arima to Python, and provides a variety of tools for analyzing, forecasting, and visualizing time series data.
Prophet: Prophet is a forecasting library developed by Facebook (now Meta). It fits an additive model that combines a non-linear trend with seasonality and holiday effects, and it is designed to be easy to use on a wide variety of time series data.
PyCaret: PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows, including time series forecasting.
sktime: sktime is a Python toolkit for machine learning with time series. It provides a unified interface for tasks such as forecasting, classification, and regression, along with tools for transforming and evaluating time series data.
StatsForecast: StatsForecast offers a collection of popular univariate time series forecasting models optimized for high performance and scalability.

These are just a few of the many popular Python libraries for time series forecasting. The best library for you will depend on your specific needs and requirements.

Main steps for doing time series forecasting

The main steps for doing time series forecasting are:

Data preparation: This step involves cleaning and formatting the data so that it is ready for analysis. This may include removing outliers, imputing missing values, and transforming the data to make it stationary.
Time series decomposition: This step involves breaking down the time series into its component parts, such as trend, seasonality, and noise. This can help to identify the underlying patterns in the data and make it easier to forecast.
Modeling: This step involves selecting a forecasting model and fitting it to the data. There are many different forecasting models available, each with its own strengths and weaknesses. The best model for a particular application will depend on the characteristics of the data.
Forecasting: This step involves using the fitted model to generate forecasts for future values. It is important to evaluate the accuracy of the forecasts, for example against a held-out portion of the data, before using them to make decisions.
Communication: This step involves communicating the results of the forecasting process to stakeholders. This includes explaining the assumptions made, the limitations of the forecasts, and the implications of the results.

These are just the main steps involved in time series forecasting. The specific steps involved will vary depending on the application and the data.
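
The steps above can be sketched end to end in plain Python. This is a simplified illustration under stated assumptions, not a recommended implementation: the series is synthetic and built to be exactly a linear trend plus a repeating pattern of period 4, the decomposition uses a least-squares line instead of the moving-average trend found in most libraries, and all function names (`fill_missing`, `fit_linear_trend`, `seasonal_effects`, `forecast`) are hypothetical.

```python
def fill_missing(values):
    """Data preparation: linearly interpolate gaps marked as None.

    Assumes the first and last values are known.
    """
    filled = list(values)
    for i, v in enumerate(filled):
        if v is None:
            left = i - 1  # already known or already filled
            right = next(j for j in range(i + 1, len(filled)) if filled[j] is not None)
            weight = (i - left) / (right - left)
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled


def fit_linear_trend(values):
    """Decomposition (trend): least-squares line; returns (intercept, slope)."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    return y_mean - slope * t_mean, slope


def seasonal_effects(values, intercept, slope, period):
    """Decomposition (seasonality): mean detrended value per cycle position."""
    effects = []
    for pos in range(period):
        detrended = [
            values[t] - (intercept + slope * t)
            for t in range(pos, len(values), period)
        ]
        effects.append(sum(detrended) / len(detrended))
    return effects


def forecast(n_train, intercept, slope, effects, horizon):
    """Forecasting: extrapolate the trend and add back the seasonal effect."""
    period = len(effects)
    return [
        intercept + slope * t + effects[t % period]
        for t in range(n_train, n_train + horizon)
    ]


# Synthetic data with one missing observation.
raw = [12, 11, 10, 19, 20, None, 18, 27, 28, 27, 26, 35, 36, 35, 34, 43]
period, horizon = 4, 4

clean = fill_missing(raw)                                  # data preparation
train, test = clean[:-horizon], clean[-horizon:]           # hold out the last cycle
intercept, slope = fit_linear_trend(train)                 # trend component
effects = seasonal_effects(train, intercept, slope, period)  # seasonal component
predicted = forecast(len(train), intercept, slope, effects, horizon)
error = sum(abs(a - p) for a, p in zip(test, predicted)) / horizon  # held-out MAE
```

Because the synthetic series contains no noise, the held-out error here is essentially zero. Real data will leave a residual component, and the size of that residual is part of what should be communicated to stakeholders along with the forecast itself.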

Time Series Forecasting Metrics

There are many different metrics that can be used to evaluate the accuracy of time series forecasts. Some of the most common metrics include:

Mean absolute error (MAE): The MAE is the average of the absolute errors between the actual and predicted values.
Mean squared error (MSE): The MSE is the average of the squared errors between the actual and predicted values.
Root mean squared error (RMSE): The RMSE is the square root of the MSE.
Mean absolute percentage error (MAPE): The MAPE is the average of the absolute percentage errors between the actual and predicted values.
R-squared (R²) or Coefficient of Determination: This is a statistical measure that represents the proportion of the variance in the actual values that is explained by the model's predictions.
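
For reference, the metrics listed above can be written as formulas. The notation is an assumption introduced here, not part of the original text: $y_t$ is the actual value at time $t$, $\hat{y}_t$ is the predicted value, $\bar{y}$ is the mean of the actual values, and $n$ is the number of forecasted points.

```latex
\begin{aligned}
\mathrm{MAE}  &= \frac{1}{n}\sum_{t=1}^{n} \left| y_t - \hat{y}_t \right| \\
\mathrm{MSE}  &= \frac{1}{n}\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2 \\
\mathrm{RMSE} &= \sqrt{\mathrm{MSE}} \\
\mathrm{MAPE} &= \frac{100}{n}\sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \\
R^2           &= 1 - \frac{\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2}{\sum_{t=1}^{n} \left( y_t - \bar{y} \right)^2}
\end{aligned}
```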

The best metric to use will depend on the specific application. MAE and RMSE are expressed in the same units as the data, which makes them a natural fit when errors translate directly into quantities, as in production planning; RMSE penalizes large errors more heavily than MAE. MAPE is scale-independent, which helps when comparing series of different magnitudes, as in marketing reports, but it is undefined when an actual value is zero.

It is important to note that no single metric can perfectly capture the accuracy of a time series forecast. It is always best to use multiple metrics and to consider the specific application when evaluating the accuracy of a forecast.
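
A minimal sketch of why several metrics are worth reporting together, using only the standard library. The numbers are made up for illustration, and the function names are hypothetical: forecast A is off by a small amount everywhere, while forecast B is exact except for one larger miss.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error; fails if any actual value is zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [100, 200, 300, 400]
forecast_a = [110, 190, 310, 390]  # off by 10 at every step
forecast_b = [100, 200, 300, 420]  # exact except for one miss of 20

report = {
    name: {
        "MAE": mae(actual, pred),
        "RMSE": rmse(actual, pred),
        "MAPE": mape(actual, pred),
        "R2": r2(actual, pred),
    }
    for name, pred in {"A": forecast_a, "B": forecast_b}.items()
}
```

Here both forecasts have the same RMSE, yet forecast B has half the MAE of forecast A, so a ranking based on a single metric would hide the difference in how the errors are distributed.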

See Jupyter Notebook Example

References

Training Forecasting Models on Multiple Time Series with Darts.
Time Series Forecasting with PyCaret Regression.
Multiple Time Series Forecasting with PyCaret.
Forecasting with sktime.
Forecast with ARIMA and ETS in StatsForecast.
Multiple seasonalities forecasting with StatsForecast.
Time-Series Forecasting: Deep Learning vs Statistics — Who Wins? Nikos Kafritsas. Towards Data Science, Medium.

Created: 04/16/2022 (C. Lizárraga); Last update: 04/17/2023 (C. Lizárraga)

CC BY-NC-SA 4.0