This is the official implementation of Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting.
🚩News(June 27, 2024): We updated our article [v3] on arXiv and added an additional experiment setting to the ablation study. The repo is now public.
🚩News(May 18, 2024): We updated our article [v2] on arXiv and uploaded our source code to GitHub. All experiments were rerun on a new machine and the results updated. The repo was still private at this point.
🚩News(April 26, 2024): We published our article [v1] on arXiv. The repo was private at this stage.
🤠 Exploring the validity of Mamba in multivariate long-term time series forecasting (MLTSF).
🤠 Proposing a unified architecture for channel-independent and channel-mixing tokenization strategies, based on a newly designed series-relation-aware (SRA) decider.
🤠 Proposing Mamba+, an improved Mamba block specifically designed for LTSF that preserves historical information over a longer range.
🤠 Introducing a bidirectional Mamba+ applied in a patching manner. The model captures intra-series or inter-series dependencies depending on the variable correlations of the specific dataset (a simplified sketch follows below).
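To make the patching + bidirectional idea concrete, here is a minimal PyTorch sketch of a bidirectional Mamba block applied over patch tokens of a single series. It is illustrative only: it uses the original `Mamba` block from `mamba_ssm` as a stand-in for Mamba+, and the embedding, fusion (simple addition) and prediction head are assumptions for the sketch rather than the exact implementation in this repo.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # stand-in for the Mamba+ block; see install notes below


class BiMambaPatchSketch(nn.Module):
    """Illustrative sketch: patch tokenization + forward/backward Mamba scans."""

    def __init__(self, patch_len=16, stride=8, d_model=128, pred_len=96):
        super().__init__()
        self.patch_len, self.stride = patch_len, stride
        self.embed = nn.Linear(patch_len, d_model)   # each patch becomes one token
        self.fwd = Mamba(d_model=d_model)            # scans patch tokens left-to-right
        self.bwd = Mamba(d_model=d_model)            # scans patch tokens right-to-left
        self.head = nn.LazyLinear(pred_len)          # flattened tokens -> forecast horizon

    def forward(self, x):                            # x: [batch, seq_len], one channel
        patches = x.unfold(-1, self.patch_len, self.stride)   # [batch, n_patch, patch_len]
        tokens = self.embed(patches)                           # [batch, n_patch, d_model]
        rev = torch.flip(tokens, dims=[1])                     # reversed patch order
        out = self.fwd(tokens) + torch.flip(self.bwd(rev), dims=[1])  # fuse both directions
        return self.head(out.flatten(start_dim=1))             # [batch, pred_len]


# toy usage (mamba_ssm's selective-scan kernel requires a CUDA device)
if torch.cuda.is_available():
    model = BiMambaPatchSketch().cuda()
    forecast = model(torch.randn(4, 96, device="cuda"))  # -> [4, 96]
```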
We test Bi-Mamba+ on 8 real-world datasets: (a) Weather, (b) Traffic, (c) Electricity, (d) ETTh1, (e) ETTh2, (f) ETTm1, (g) ETTm2 and (h) Solar.
All datasets are widely used and are publicly available at https://github.com/zhouhaoyi/Informer2020 and https://github.com/thuml/Autoformer.
Compared to iTransformer, the current SOTA Transformer-based model, Bi-Mamba+ reduces MSE by 4.85% and MAE by 2.70% on average; compared to S-Mamba, the improvements are 3.85% and 2.75%.
We calculate the average MSE and MAE results of (i) removing the SRA decider (w/o SRA-I & w/o SRA-M); (ii) removing the bidirectional design (w/o Bi); (iii) replacing Mamba+ with Mamba (Bi-Mamba); (iv) removing the residual connection (w/o Residual); (v) S-Mamba; and (vi) PatchTST. The results show that the SRA decider, the added forget gate, and the bidirectional and residual designs are all effective.
We conduct the following experiments to comprehensively evaluate model efficiency in terms of (a) prediction accuracy, (b) memory usage and (c) training speed.
- Install the required packages (Linux only)
Run `pip install -r requirements.txt` to install the necessary Python packages.
Tips for installing mamba-ssm and our proposed mamba_plus: run the following commands in conda, strictly in this order:
conda create -n your_env_name python=3.10.13
conda activate your_env_name
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
conda install packaging
cd causal-conv1d;CAUSAL_CONV1D_FORCE_BUILD=TRUE pip install .;cd ..
cd mamba_plus;MAMBA_FORCE_BUILD=TRUE pip install .;cd ..
These package installation tips were verified as of April 24, 2024.
We strongly recommend doing all of this on Linux, or on WSL2 under Windows. Your default CUDA version should be at least 11.8 (perhaps 11.6; newer package versions seem to accept lower CUDA versions).
The tips listed here force local compilation of causal-conv1d and mamba_plus. The `mamba_plus` package contains the modified hardware-aware parallel computing algorithm of our proposed Mamba+. If you want to run S-Mamba or other Mamba-based models, just run `cd mamba; pip install .` or `pip install mamba-ssm` in a new Python environment to install the original `mamba_ssm` package of Mamba. Please use different Python environments for `mamba_plus` and `mamba_ssm`, because the `selective_scan` program of one may be overwritten by the other.
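After installation, a quick sanity check like the one below can confirm that CUDA is visible and the selective-scan package is importable. The import name `mamba_ssm` inside the `mamba_plus` environment is an assumption (based on the fork keeping the original package layout); adjust it if your build exposes a different module name.

```python
# quick post-install sanity check (run inside the environment you just created)
import torch

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

try:
    # assumption: mamba_plus keeps the original mamba_ssm package layout
    from mamba_ssm import Mamba
    print("Mamba block import OK:", Mamba.__module__)
except ImportError as err:
    print("Mamba import failed:", err)
```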
Taking CUDA 11.8 as an example, there should be a directory named `cuda-11.8` in `/usr/local`. You should make sure that CUDA is on your path. Taking `bash` as an example, run `vi ~/.bashrc` and make sure the following paths exist:
export CPATH=/usr/local/cuda-11.8/targets/x86_64-linux/include:$CPATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.8/bin:$PATH
After saving the new profile, run `bash` again (or `source ~/.bashrc`) so your shell picks up the new environment paths.
Of course, if you do not want to force local compilation, these paths are not necessary.
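If you are unsure whether the toolkit is actually visible after editing the profile, a small check like the following (not part of the original repo) prints where `nvcc` is found and the relevant environment variables:

```python
# optional check that the CUDA toolkit used for local compilation is on PATH
import os
import shutil

print("nvcc:", shutil.which("nvcc"))  # should point into /usr/local/cuda-11.8/bin
print("CPATH:", os.environ.get("CPATH", "<unset>"))
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
```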
- Run the script: find the model you want to run in `/scripts` and choose the dataset you want to use.
Run `sh ./scripts/{model}/{dataset}.sh 1` to start training.
Run `sh ./scripts/{model}/{dataset}.sh 0` to start testing.
Run `sh ./scripts/{model}/{dataset}.sh -1` to start predicting.
We provide trained models in `checkpoints`; currently the Bi-Mamba+ checkpoint for Weather is offered.
We have compiled the datasets used in our experiments and provide a download link: data.zip.
We are grateful for the following awesome works when implementing Bi-Mamba+:
@article{liang2024bi,
title={Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting},
author={Liang, Aobo and Jiang, Xingguo and Sun, Yan and Shi, Xiaohou and Li, Ke},
journal={arXiv preprint arXiv:2404.15772},
year={2024}
}