An on-policy MARL algorithm for the highway on-ramp merging problem, featuring parameter sharing, action masking, a local reward design, and a priority-based safety supervisor.
All the MARL algorithms are extended from their single-agent RL counterparts, with network parameters shared among agents (see the sketch after the algorithm list below).
- MAA2C (the safety supervisor and other settings are in configs/configs.ini)
- MAPPO
- MAACKTR
- MADQN: does not perform well in this setting
- MASAC: TBD
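The snippet below is a minimal, illustrative sketch of the parameter-sharing and action-masking ideas only; it is not the repository's actual network code, and all layer sizes, names, and dimensions are assumptions. A single policy network serves every agent, and a mask removes unsafe actions before sampling:

```python
# shared_policy_sketch.py -- illustrative parameter sharing + action masking (not the repo's code)
import torch
import torch.nn as nn

class SharedActor(nn.Module):
    """One set of weights used by every agent (parameter sharing)."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor, action_mask: torch.Tensor) -> torch.Tensor:
        logits = self.net(obs)
        # Action masking: give unsafe/invalid actions -inf logits so they are never sampled.
        logits = logits.masked_fill(action_mask == 0, float("-inf"))
        return torch.distributions.Categorical(logits=logits).sample()

# The same network instance processes all agents' local observations in one batch.
actor = SharedActor(obs_dim=25, n_actions=5)          # sizes are illustrative only
obs = torch.randn(4, 25)                              # 4 agents, each with a 25-dim local observation
mask = torch.ones(4, 5)                               # 1 = allowed action, 0 = masked out
actions = actor(obs, mask)                            # one discrete action per agent
```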
- create a Python virtual environment:
conda create -n marl_cav python=3.6 -y
- activate the virtual environment:
conda activate marl_cav
- install PyTorch (torch>=1.2.0):
pip install torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
- install the requirements:
pip install -r requirements.txt
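After installation, a quick sanity check (a minimal sketch, not part of the original instructions) confirms that the pinned PyTorch build imports correctly and reports whether CUDA is available:

```python
# optional sanity check: verify the pinned PyTorch build is importable
import torch

print(torch.__version__)          # expected: 1.7.0
print(torch.cuda.is_available())  # True if a CUDA-capable GPU and matching drivers are present
```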
Fig.1 Illustration of the considered on-ramp merging traffic scenario. CAVs (blue) and HDVs (green) coexist on both ramp and through lanes.
To run the code, execute python run_xxx.py. The config files contain the parameters for the MARL policies.
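As a minimal sketch for inspecting those parameters (section and option names are read from the file, not assumed), configs/configs.ini can be dumped with Python's built-in configparser:

```python
# inspect_config.py -- print every section and option in the MARL config file
import configparser

config = configparser.ConfigParser()
config.read("configs/configs.ini")

for section in config.sections():
    print(f"[{section}]")
    for key, value in config[section].items():
        print(f"  {key} = {value}")
```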
Fig.2 Performance comparison between the proposed method and 3 state-of-the-art MARL algorithms.
To reproduce the results, we train each algorithm with three random seeds: 0, 2000, and 2021. For example, setting both torch_seed and seed to 0 runs the experiment with seed 0. The comparison curves can then be plotted with: python common/plot_benchmark_safety.py
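The helper below is an optional sketch that assumes seed and torch_seed are options inside configs/configs.ini (section names are not assumed); it rewrites both values before launching a run:

```python
# set_seed.py -- overwrite the seed values in configs/configs.ini before a run (illustrative helper)
import configparser
import sys

def set_seeds(config_path: str, seed: int) -> None:
    config = configparser.ConfigParser()
    config.read(config_path)
    # Update the seed options wherever they appear, without assuming section names.
    for section in config.sections():
        for key in ("seed", "torch_seed"):
            if key in config[section]:
                config[section][key] = str(seed)
    with open(config_path, "w") as f:
        config.write(f)

if __name__ == "__main__":
    set_seeds("configs/configs.ini", int(sys.argv[1]))  # e.g. python set_seed.py 2021
```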
@book{chen2023deep_thesis,
title={Deep Multi-Agent Reinforcement Learning for Efficient and Scalable Networked System Control},
author={Chen, Dong},
year={2023},
publisher={Michigan State University}
}
@article{chen2023deep,
title={Deep multi-agent reinforcement learning for highway on-ramp merging in mixed traffic},
author={Chen, Dong and Hajidavalloo, Mohammad R and Li, Zhaojian and Chen, Kaian and Wang, Yongqiang and Jiang, Longsheng and Wang, Yue},
journal={IEEE Transactions on Intelligent Transportation Systems},
year={2023},
publisher={IEEE}
}