ChemReasoner: Discovering catalysts via Generative AI and Computational Chemistry

Installation Instructions

Installation assumes CUDA version 12.0.

git clone https://github.com/pnnl/chemreasoner.git
cd chemreasoner
mamba env create -f chemreasoner.yml
conda activate chemreasoner
git submodule update --init --recursive
cd ext/ocp/
pip install -e .
cd ../Open-Catalyst-Dataset
pip install -e .
cd ../..

To test the installation:

python src/scripts/test_gnn.py      # use --cpu to test on cpu only

Running the ICML Code

The code to reproduce the ICML results is located in src/scripts/run_icml_queries.py. An example run script is provided in src/launch_scripts/run_icml.sh. You will need to set a few parameters:

savedir: The directory to save the results in
start-query: The index of the first query to evaluate (see data/input_data/dataset.csv)
start-query: The index of the final query to evaluate
gnn-traj-dir: The directory in which to store relaxation trajectories
dotenv-path: The path to the .env file containing the API keys for your Azure OpenAI setup (see instructions below)
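As a rough illustration of how these parameters fit together, the sketch below assembles a command line for the ICML script. The parameter names come from the list above; passing them as `--name value` flags is an assumption, and the directory values are hypothetical placeholders. The parameter for the final query index is left out because its flag name is not clear from the list.

```python
import shlex


def build_command(script, params):
    """Return an argv list of the form: python <script> --name value ..."""
    argv = ["python", script]
    for name, value in params.items():
        argv += [f"--{name}", str(value)]
    return argv


# Example values are hypothetical; only the parameter names are from the README.
params = {
    "savedir": "results/icml",        # where results are written
    "start-query": 0,                 # index into data/input_data/dataset.csv
    "gnn-traj-dir": "results/trajs",  # where relaxation trajectories are stored
    "dotenv-path": ".env",            # file holding the Azure OpenAI settings
}

command = build_command("src/scripts/run_icml_queries.py", params)
print(shlex.join(command))
```

The provided src/launch_scripts/run_icml.sh remains the reference for the actual invocation; check it before relying on the flag syntax assumed here.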

Setting up .env file

The .env file should be located in the chemreasoner root directory and contain the API keys and connection details for your Azure OpenAI deployment, which can be found on the Azure portal.

AZURE_OPENAI_DEPLOYMENT_NAME=<deployment name>
AZURE_OPENAI_ENDPOINT=<url to deployment endpoint>
AZURE_OPENAI_API_KEY=<api key>
AZURE_OPENAI_API_VERSION="2023-07-01-preview"
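Before launching a long run, it can help to confirm that all four variables are present. The following is a minimal sketch using only the standard library; the helper names `parse_env` and `missing_keys` are hypothetical and not part of ChemReasoner, and the parser handles only simple `KEY=VALUE` lines.

```python
from pathlib import Path

# The four variable names listed in the README.
REQUIRED_KEYS = (
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in "2023-07-01-preview".
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text):
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        print("missing keys:", missing_keys(env_file.read_text()))
    else:
        print("no .env file found in the current directory")
```

Run it from the chemreasoner root directory; an empty list means every expected variable has a value.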

Setting up local GNN server

To run relaxations with the GNN model, you need a running Redis server. Open a new terminal on the same machine where you will run ChemReasoner (it must have access to a GPU), then run:

redis-server --dir <directory to store redis server cache>

Here, --dir can be set to any directory.
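To confirm the server is reachable before starting a run, a quick check like the one below can be used. It is a sketch that assumes the server listens on Redis's default port 6379 on the local machine and requires no password; the README does not state either, and the function name `redis_is_up` is hypothetical.

```python
import socket


def redis_is_up(host="127.0.0.1", port=6379, timeout=1.0):
    """Send an inline PING and report whether the reply starts with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            return conn.recv(64).startswith(b"+PONG")
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("redis reachable:", redis_is_up())
```

A server configured with authentication answers PING with an error instead of +PONG, so this check reports False in that case even though the server is running.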

Running ICML code

Once you have set up the run script and the .env file and started the local Redis server, run the ICML code by entering:

./src/launch_scripts/run_icml.sh

News/Presentations/Publications

AAAI 2025: S. Beus, H. W. Sprueill, R. Meyur, M. V. Olarte, K. Agarwal, D. Zhang, U. Sanyal, J. Lercher, S. Choudhury, "Human-AI Interaction using Linguistic Reasoning and Computational Chemistry for Trustworthy Materials Discovery"
PNNL Article: Accelerating Materials Discovery with AI
PNNL Article: Novel Computing Tool Learns the Language of Chemistry
ICML 2024: "Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback" arXiv
Presentation at MLCommons Science Working Group
We will have two presentations at upcoming American Chemical Society Spring 2024 National Meeting!
- Sprueill H.W., C. Edwards, M.V. Olarte, U. Sanyal, H. Ji, and S. Choudhury. "Integrating generative AI with computational chemistry for catalyst design in biofuel/bioproduct applications." American Chemical Society Spring 2024 National Meeting, New Orleans, Louisiana (oral presentation).
- Sprueill H.W., C. Edwards, M.V. Olarte, U. Sanyal, K. Agarwal, H. Ji, and S. Choudhury. 03/18/2024. "Extreme-Scale Heterogeneous Inference with Large Language Models and Atomistic Graph Neural Networks for Catalyst Discovery." American Chemical Society Spring 2024 National Meeting, New Orleans, Louisiana (poster).
Our work on Monte Carlo Thought Search is accepted for publication in EMNLP 2023 Findings (arXiv)
Excited to present "ChemReasoner: Large Language Model-driven Search over Chemical Spaces with Quantum Chemistry-guided Feedback" at 2023 Stanford Graph Learning Workshop
We are thrilled to be selected for the Microsoft Accelerate Foundation Models Research Initiative
Presentation at AI Hardware and Edge AI Summit, Santa Clara, September 2023

Citation

Please cite the following papers [https://arxiv.org/abs/2310.14420] [https://arxiv.org/abs/2402.10980] if you find our work useful.

@inproceedings{sprueill2023MCR,
  title={Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design},
  author={Sprueill, Henry W. and Edwards, Carl and Sanyal, Udishnu and Olarte, Mariefel and Ji, Heng and Choudhury, Sutanay},
  booktitle={Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP2023) Findings},
  year={2023}
}
@article{sprueill2024chemreasoner,
  title={CHEMREASONER: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback},
  author={Sprueill, Henry W and Edwards, Carl and Agarwal, Khushbu and Olarte, Mariefel V and Sanyal, Udishnu and Johnston, Conrad and Liu, Hongbin and Ji, Heng and Choudhury, Sutanay},
  journal={arXiv preprint arXiv:2402.10980},
  year={2024}
}

Contact

Sutanay Choudhury sutanay tod choudhury ta pnnl tod gov
