The official code repository for the HyperAgent algorithm, published at ICML 2024.

szrlee/HyperAgent

HyperAgent

Authors: Yingru Li, Jiawei Xu, Lei Han, Zhi-Quan Luo

This repository contains the official implementation of the HyperAgent algorithm, introduced in our ICML 2024 paper Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent.

To integrate the Generative Pre-trained Transformer (GPT) with HyperAgent, see szrlee/GPT-HyperAgent, which targets adaptive foundation models for online decision-making.

HyperAgent Performance

  • Data Efficient ✅: HyperAgent reaches human-level performance (an IQM of 1) in 1.5M interactions, only 15% of the data used by Double-DQN (DDQN, 2016, DeepMind).
  • Computation Efficient ✅: HyperAgent uses just 5% of the model parameters of BBF (DeepMind), the 2023 state-of-the-art algorithm.
  • Ensemble+ Comparison: Ensemble+ achieves an IQM of only 0.22 within 1.5M interactions and requires twice as many parameters as HyperAgent.
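
The IQM values above can be read as interquartile means of human-normalized scores, where a score of 1 corresponds to human-level play. The sketch below assumes the usual definitions (normalize each game's score against random and human baselines, then average the middle 50% of the sorted values). The function names and the example numbers are illustrative only and are not taken from this repository or the paper.

```python
def human_normalized_score(agent, random, human):
    """Rescale a raw game score so that 0 = random play and 1 = human play."""
    return (agent - random) / (human - random)


def interquartile_mean(scores):
    """Mean of the middle 50% of the values.

    Drops floor(n / 4) entries from each end of the sorted list, so a few
    very high or very low games do not dominate the aggregate.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    ordered = sorted(scores)
    cut = len(ordered) // 4
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)


if __name__ == "__main__":
    # Made-up human-normalized scores for eight hypothetical games.
    scores = [0.0, 0.2, 0.5, 0.9, 1.1, 1.5, 2.0, 8.0]
    print("mean:", round(sum(scores) / len(scores), 3))
    print("IQM: ", round(interquartile_mean(scores), 3))
```

In this toy example the single outlier (8.0) pulls the plain mean well above the IQM, which is why IQM is often preferred when aggregating across games.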

Installation

git clone https://github.com/szrlee/HyperAgent.git
cd HyperAgent
pip install -e .

Usage

To reproduce the results for Atari (e.g., Pong):

sh experiments/start_atari.sh Pong

To reproduce the results for DeepSea (e.g., size 20):

sh experiments/start_deepsea.sh 20
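
To launch several runs in sequence, the commands above could be wrapped in a small helper. This is a minimal sketch, assuming each script takes a single positional argument exactly as shown above. The helper names `build_command` and `run_sweep` are hypothetical and not part of the repository, and the sizes other than 20 are placeholders. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def build_command(script, arg):
    """Build the same `sh <script> <arg>` command shown in the usage above."""
    return ["sh", script, str(arg)]


def run_sweep(script, args, dry_run=True):
    """Run (or, by default, just print) one command per argument, in order."""
    commands = [build_command(script, a) for a in args]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops the sweep if a run exits with an error.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder DeepSea sizes; only size 20 appears in the README.
    run_sweep("experiments/start_deepsea.sh", [10, 20, 30])
```

Pass `dry_run=False` from the repository root to actually execute the runs.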

Citation

If you find this work useful in your research, please cite our paper:

@inproceedings{li2024hyperagent,
  title         = {{Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent}},
  author        = {Li, Yingru and Xu, Jiawei and Han, Lei and Luo, Zhi-Quan},
  booktitle     = {Forty-first International Conference on Machine Learning},
  year          = {2024},
  series        = {Proceedings of Machine Learning Research},
  eprint        = {2402.10228},
  archiveprefix = {arXiv},
  primaryclass  = {cs.LG},
  url           = {https://arxiv.org/abs/2402.10228}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.