This is the official implementation of Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting.
🚩News(June 27, 2024): We updated our article [v3] on arXiv and added an additional experiment setting to the ablation study. The repo is now public.
🚩News(May 18, 2024): We updated our article [v2] on arXiv and uploaded our source code to GitHub. All experiments were rerun on a new machine and the results updated. The repo was still private at this point.
🚩News(April 26, 2024): We published our article [v1] on arXiv. The repo was private at this stage.
🤠 Exploring the validity of Mamba in multivariate long-term time series forecasting (MLTSF).
🤠 Proposing a unified architecture for channel-independent and channel-mixing tokenization strategies, based on a newly designed series-relation-aware (SRA) decider.
🤠 Proposing Mamba+, an improved Mamba block specifically designed for LTSF that preserves historical information over a longer range.
🤠 Introducing a bidirectional Mamba+ applied in a patching manner. The model captures intra-series or inter-series dependencies depending on the variable correlations of the specific dataset (a simplified sketch follows below).
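To make the patching + bidirectional idea concrete, here is a minimal PyTorch sketch of a bidirectional Mamba block applied over patch tokens of a single series. It is illustrative only: it uses the original `Mamba` block from `mamba_ssm` as a stand-in for Mamba+, and the embedding, fusion (simple addition) and prediction head are assumptions for the sketch rather than the exact implementation in this repo.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # stand-in for the Mamba+ block; see install notes below


class BiMambaPatchSketch(nn.Module):
    """Illustrative sketch: patch tokenization + forward/backward Mamba scans."""

    def __init__(self, patch_len=16, stride=8, d_model=128, pred_len=96):
        super().__init__()
        self.patch_len, self.stride = patch_len, stride
        self.embed = nn.Linear(patch_len, d_model)   # each patch becomes one token
        self.fwd = Mamba(d_model=d_model)            # scans patch tokens left-to-right
        self.bwd = Mamba(d_model=d_model)            # scans patch tokens right-to-left
        self.head = nn.LazyLinear(pred_len)          # flattened tokens -> forecast horizon

    def forward(self, x):                            # x: [batch, seq_len], one channel
        patches = x.unfold(-1, self.patch_len, self.stride)   # [batch, n_patch, patch_len]
        tokens = self.embed(patches)                           # [batch, n_patch, d_model]
        rev = torch.flip(tokens, dims=[1])                     # reversed patch order
        out = self.fwd(tokens) + torch.flip(self.bwd(rev), dims=[1])  # fuse both directions
        return self.head(out.flatten(start_dim=1))             # [batch, pred_len]


# toy usage (mamba_ssm's selective-scan kernel requires a CUDA device)
if torch.cuda.is_available():
    model = BiMambaPatchSketch().cuda()
    forecast = model(torch.randn(4, 96, device="cuda"))  # -> [4, 96]
```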
We test Bi-Mamba+ on 8 real-world datasets: (a) Weather, (b) Traffic, (c) Electricity, (d) ETTh1, (e) ETTh2, (f) ETTm1, (g) ETTm2 and (h) Solar.
All datasets are widely used and are publicly available at https://github.com/zhouhaoyi/Informer2020 and https://github.com/thuml/Autoformer.
Compared to iTransformer, the current SOTA Transformer-based model, Bi-Mamba+ reduces MSE by 4.85% and MAE by 2.70% on average; compared to S-Mamba, the improvements are 3.85% and 2.75%.
We calculate the average MSE and MAE results of (i) removing the SRA decider (w/o SRA-I & w/o SRA-M); (ii) removing the bidirectional design (w/o Bi); (iii) replacing Mamba+ with Mamba (Bi-Mamba); (iv) removing the residual connection (w/o Residual); (v) S-Mamba; and (vi) PatchTST. The results show that the SRA decider, the added forget gate, and the bidirectional and residual designs are all effective.
We conduct the following experiments to comprehensively evaluate model efficiency in terms of (a) prediction accuracy, (b) memory usage and (c) training speed.
- Install the required packages (Linux only)
Run `pip install -r requirements.txt` to install the necessary Python packages.
Tips for installing mamba-ssm and our proposed mamba_plus: run the following commands in conda, strictly in this order:
conda create -n your_env_name python=3.10.13
conda activate your_env_name
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install packaging
cd causal-conv1d;CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .;cd ..
cd mamba_plus;MAMBA_FORCE_BUILD=TRUE pip install .;cd ..
These package installation tips were verified as of April 24, 2024.
We strongly recommend doing all of this on Linux, or on WSL2 under Windows. Your default CUDA version should be at least 11.8 (perhaps 11.6; newer package versions seem to accept lower CUDA versions).
The tips listed here force local compilation of causal-conv1d and mamba_plus. The `mamba_plus` package contains the modified hardware-aware parallel computing algorithm of our proposed Mamba+. If you want to run S-Mamba or other Mamba-based models, just run `cd mamba; pip install .` or `pip install mamba-ssm` in a new Python environment to install the original `mamba_ssm` package of Mamba. Please use different Python environments for `mamba_plus` and `mamba_ssm`, because the `selective_scan` program of one may be overwritten by the other.
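After installation, a quick sanity check like the one below can confirm that CUDA is visible and the selective-scan package is importable. The import name `mamba_ssm` inside the `mamba_plus` environment is an assumption (based on the fork keeping the original package layout); adjust it if your build exposes a different module name.

```python
# quick post-install sanity check (run inside the environment you just created)
import torch

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

try:
    # assumption: mamba_plus keeps the original mamba_ssm package layout
    from mamba_ssm import Mamba
    print("Mamba block import OK:", Mamba.__module__)
except ImportError as err:
    print("Mamba import failed:", err)
```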
Taking CUDA 11.8 as an example, there should be a directory named `cuda-11.8` in `/usr/local`. You should make sure that CUDA is on your path. Taking `bash` as an example, run `vi ~/.bashrc` and make sure the following paths exist:
export CPATH=/usr/local/cuda-11.8/targets/x86_64-linux/include:$CPATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.8/bin:$PATH
After saving the new profile, run `bash` again (or `source ~/.bashrc`) so your shell picks up the new environment paths.
Of course, if you do not want to force local compilation, these paths are not necessary.
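If you are unsure whether the toolkit is actually visible after editing the profile, a small check like the following (not part of the original repo) prints where `nvcc` is found and the relevant environment variables:

```python
# optional check that the CUDA toolkit used for local compilation is on PATH
import os
import shutil

print("nvcc:", shutil.which("nvcc"))  # should point into /usr/local/cuda-11.8/bin
print("CPATH:", os.environ.get("CPATH", "<unset>"))
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
```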
- Run the script: find the model you want to run in `/scripts` and choose the dataset you want to use.
Run `sh ./scripts/{model}/{dataset}.sh 1` to start training.
Run `sh ./scripts/{model}/{dataset}.sh 0` to start testing.
Run `sh ./scripts/{model}/{dataset}.sh -1` to start predicting.
We provide trained models in `checkpoints`; currently the Bi-Mamba+ checkpoint for Weather is offered.
We have compiled the datasets used in our experiments and provide a download link: data.zip.
We are grateful for the following awesome works when implementing Bi-Mamba+:
@article{liang2024bi,
title={Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting},
author={Liang, Aobo and Jiang, Xingguo and Sun, Yan and Shi, Xiaohou and Li, Ke},
journal={arXiv preprint arXiv:2404.15772},
year={2024}
}