- General info
- Introduction
- Project 1: Sales forecasting with ARIMA
- Project 2: Bitcoin forecasting with LSTM
- Project 3: BTC and ETH forecasting with multivariate time series
In this repository you'll learn how to analyse a time series and forecast its values.
Here we will try to forecast sales and cryptocurrency prices using ARIMA, LSTM and multivariate time series.
Because cryptocurrencies are highly volatile and don't follow a specific pattern, the results won't be good for the last two projects.
For the purpose of this introduction we'll use Bitcoin's daily closing market price since 2014 as the dataset.
A time series is composed of 3 components:
- (y: the baseline value for the series)
- Trend: Linear increasing or decreasing behavior of the series over time
- Seasonality: Repeating patterns or cycles of behavior over time
- Noise: Variability in the observations that cannot be explained by the model
Combining these 3 components gives back the time series. The combination is additive if:
y = trend + seasonality + noise
or multiplicative if:
y = trend * seasonality * noise
In our case the time series is multiplicative.
We can clearly see that the trend rises overall, with a peak toward 2021.
The trend is calculated as a moving average with a sliding window of length L:
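One way to write this, assuming a trailing window over the last $L$ observations (a centered window is an equally common choice, and the exact convention used for the plot is not stated):

```latex
\mathrm{trend}_t = \frac{1}{L} \sum_{i=0}^{L-1} y_{t-i}
```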
Bitcoin's Seasonality since 2014
The seasonal component rises toward January and June and falls in between.
To get the seasonality and the noise, we go back to the multiplicative formula from the beginning and divide the series by the trend:
y / trend = seasonality * noise
Then, to isolate the seasonality, we take the moving average of this ratio over one year, L = 365 (it could also be one month):
seasonality = moving average of (y / trend) with L = 365
Finally we calculate the noise by dividing the combined seasonality-and-noise term by the seasonal component:
noise = (y / trend) / seasonality
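The three steps can be sketched in a few lines of plain Python. This is a minimal illustration that follows the description literally, not the code used for the plots: the function names, the trailing-window convention, the window lengths and the synthetic series are all assumptions.

```python
import math


def moving_average(values, window):
    """Trailing moving average; the result is window - 1 points shorter."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


def decompose_multiplicative(y, trend_window, season_window):
    """Split y into trend * seasonality * noise (all trimmed to equal length)."""
    # Step 1: the trend is a moving average of the series.
    trend = moving_average(y, trend_window)
    y_aligned = y[trend_window - 1:]
    # Step 2: y / trend leaves seasonality * noise.
    detrended = [v / t for v, t in zip(y_aligned, trend)]
    # Step 3: smoothing that ratio isolates the seasonal component.
    seasonality = moving_average(detrended, season_window)
    # Step 4: dividing the ratio by the seasonality leaves the noise.
    noise = [d / s for d, s in zip(detrended[season_window - 1:], seasonality)]
    n = len(noise)
    return y[-n:], trend[-n:], seasonality, noise


# Synthetic stand-in for a price series: growth times a weekly cycle.
series = [100 * 1.01 ** t * (1 + 0.1 * math.sin(2 * math.pi * t / 7))
          for t in range(60)]
y_cut, trend, seasonality, noise = decompose_multiplicative(series, 7, 3)
print(len(y_cut), len(trend), len(seasonality), len(noise))
```

With daily data, `season_window` would play the role of L = 365 from the text. Multiplying the three returned components back together recovers the trimmed series.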
As we can see, the data is very noisy: this time series depends on a lot of factors.
In this first project we will forecast the sales and number of orders of a retail store with the ARIMA model, which stands for ‘AutoRegressive Integrated Moving Average’.
The dataset is from Kaggle: Superstore Sales Dataset
Notice that the number of orders and the sales income are not correlated.
In order to use ARIMA we need to make the time series stationary, meaning its statistical properties (such as its mean and variance) no longer depend on time. This is done by differencing it.
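Differencing replaces each value with its change from the previous one, so a linear trend turns into a constant. A minimal sketch, with a hypothetical `difference` helper and toy data:

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


linear = [3 + 2 * t for t in range(6)]    # 3, 5, 7, 9, 11, 13
print(difference(linear))                 # [2, 2, 2, 2, 2]

quadratic = [t * t for t in range(6)]     # 0, 1, 4, 9, 16, 25
print(difference(quadratic, order=2))     # [2, 2, 2, 2]
```

The `order` argument corresponds to the d term of ARIMA: the number of times the series is differenced before it looks stationary.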
An ARIMA model is defined by 3 terms: p, d, q where:
- p is the order of the Auto Regressive term
- q is the order of the Moving Average term
- d is the number of differencing steps required to make the time series stationary
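Written out in the usual textbook notation (an addition here, not taken from the project), where $y'_t$ is the series after differencing $d$ times, $\varepsilon_t$ is the noise term, and $c$, $\phi_i$, $\theta_j$ are the fitted coefficients:

```latex
y'_t = c + \sum_{i=1}^{p} \phi_i \, y'_{t-i} + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t
```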
We won't go into full details, but the core ideas on how to find these parameters will be shown.
Here we can see that these series are already stationary, thanks to the autocorrelation plot and the Dickey-Fuller test: both series have a p-value < 0.05.
So d = 0, and q = 1 because lag 1 is well above the significance line.
We then plot the partial autocorrelation and see that p = 1.
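The "significance line" on an autocorrelation plot is commonly drawn at about ±1.96/√n for a series of n points. A minimal sketch of the check behind the choice of q, with hypothetical helper names and a made-up toy series (only the plain autocorrelation is shown, not the partial one):

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom


def significance_bound(n):
    """Approximate 95% band for a white-noise series of length n."""
    return 1.96 / math.sqrt(n)


# Toy data, not the Superstore series.
toy = [12, 15, 14, 18, 17, 21, 19, 24, 22, 26, 25, 29]
bound = significance_bound(len(toy))
for lag in range(1, 4):
    r = autocorrelation(toy, lag)
    print(lag, round(r, 3), "significant" if abs(r) > bound else "not significant")
```

A lag whose autocorrelation falls outside the band is the kind of spike the text reads off the plot.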
To finish, we build the model with a library called "pmdarima", which tries different combinations of these parameters in order to find the best model.
Finally we plot the forecasts:
Yes... It's a straight line because there is no "trend" or "seasonality" in our data!
ARIMA is useful only in cases where the time series has a "trend" or a "seasonality"; in this case it only predicts the mean of the time series, which results in a straight line.
Furthermore, ARIMA is meant for short-term forecasts; long-term forecasts will also end up as a straight line.
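Why the forecast flattens can be seen on the simplest case, an AR(1) model, used here as an assumed stand-in for the fitted model: each forecast step pulls the prediction toward the series mean by a factor `phi`, so after a few steps only the mean is left. The coefficient and values below are made up for illustration.

```python
def ar1_forecast(last_value, mean, phi, steps):
    """Iterate y_hat[h] = mean + phi * (y_hat[h - 1] - mean)."""
    forecasts = []
    current = last_value
    for _ in range(steps):
        current = mean + phi * (current - mean)
        forecasts.append(current)
    return forecasts


forecasts = ar1_forecast(last_value=150.0, mean=100.0, phi=0.5, steps=10)
# Starts at 125.0, 112.5, 106.25 and keeps closing in on the mean of 100.
print(forecasts)
```

With |phi| < 1 the gap to the mean shrinks geometrically, which is why a long-horizon forecast of a stationary series looks like a straight line.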
For this project we'll use Bitcoin's daily closing market price since 2014 as the dataset.
The network is made of two layers of bidirectional LSTM units, followed by a dense layer of 20 units at the end in order to predict the next 20 values of the time series.
LSTMs are great at learning long-term dependencies in sequences of data. When made bidirectional, they also train on a reversed copy of the input sequence, which can provide additional context to the network and result in faster and even fuller learning of the problem.
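To train a network that outputs the next 20 values, the series first has to be cut into pairs of (input window, 20-step target). One common way to do that is sketched below; it is an assumption rather than the project's own preprocessing, and `make_windows` and the input length of 60 are hypothetical.

```python
def make_windows(series, n_in, n_out=20):
    """Return (inputs, targets): each input holds n_in past values,
    each target the n_out values that follow it."""
    inputs, targets = [], []
    for start in range(len(series) - n_in - n_out + 1):
        inputs.append(series[start:start + n_in])
        targets.append(series[start + n_in:start + n_in + n_out])
    return inputs, targets


prices = list(range(100))  # placeholder for the closing prices
X, y = make_windows(prices, n_in=60, n_out=20)
print(len(X), len(X[0]), len(y[0]))  # 21 60 20
```

Each target has 20 entries, matching the 20 units of the final dense layer.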
Sadly the forecasts are quite imprecise. This is because there is a lot of noise, and stock or cryptocurrency prices depend on a lot of factors that a time series alone cannot represent.
Still, the model did manage to learn a rising trend and is not totally wrong.
This third project aims to forecast the price of cryptocurrencies using all the features in the dataset, and combines two cryptocurrencies (BTC and ETH) to see if the results are better.
The LSTM overfits so quickly that we need to limit the number of epochs to 12 (the batch size also strongly influences the number of epochs required). Furthermore, if the number of features is too large, we need to set the loss to MSLE (mean squared logarithmic error), which measures errors in relative rather than absolute terms, so that errors on large values count about as much as errors on small ones.
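The effect of MSLE is easiest to see next to plain MSE. The sketch below uses the usual definition, the mean of (log(1 + y) - log(1 + ŷ))², written in plain Python for illustration rather than copied from the training code:

```python
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def msle(y_true, y_pred):
    """Mean squared logarithmic error, using log(1 + x)."""
    return sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / len(y_true)


# Same relative error (prediction is double the true value) at two scales.
print(mse([10], [20]), mse([1000], [2000]))    # 100.0 1000000.0
print(msle([10], [20]), msle([1000], [2000]))  # both between 0.4 and 0.5
```

Under MSE the large-valued sample dominates by four orders of magnitude; under MSLE the two samples weigh about the same.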
But no, the results are still poor.