This repository collects papers on Large Language Models for Chemistry.
😎 You are welcome to recommend missing papers by opening an Issue or a Pull Request.
2022.05 Foundation Models of Scientific Knowledge for Chemistry: Opportunities, Challenges and Lessons Learned. ACL Workshop
2022.11 Galactica: A large language model for science. arXiv
2022.11 Is GPT-3 all you need for machine learning for chemistry? NeurIPS 2022 Workshop
2023.08 Fine-tuning GPT-3 for machine learning electronic and functional properties of organic molecules. Chemical Science
2023.08 HoneyBee: Progressive Instruction Finetuning of Large Language Models for Materials Science. EMNLP 2023
2023.10 MatChat: A Large Language Model and Application Service Platform for Materials Science. Chinese Physics B
2024.01 ChemDFM: Dialogue Foundation Model for Chemistry. arXiv
2024.01 Structured information extraction from scientific text with large language models. Nature Communications
2024.02 Leveraging large language models for predictive chemistry. Nature Machine Intelligence
2024.03 SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning. arXiv
2024.03 Domain-Agnostic Molecular Generation with Chemical Feedback. ICLR 2024
2024.04 ChemLLM: A Chemical Large Language Model. arXiv
2024.04 BatGPT-Chem: A Foundation Large Model For Chemical Engineering. ChemRxiv
2024.04 Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models. ICLR 2024
2024.04 LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset. arXiv
2024.05 nach0: Multimodal Natural and Chemical Languages Foundation Model. Chemical Science
2024.06 Fine-tuning large language models for chemical text mining. Chemical Science
2024.06 MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction. arXiv
2024.06 SynAsk: Unleashing the Power of Large Language Models in Organic Synthesis. arXiv
2024.06 PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes. arXiv
2024.09 SciDFM: A Large Language Model with Mixture-of-Experts for Science. arXiv

2023.03 Uni-Mol: A Universal 3D Molecular Representation Learning Framework. ICLR
2023.05 DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs. arXiv
2023.06 MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter. EMNLP 2023
2023.06 MolFM: A Multimodal Molecular Foundation Model. arXiv
2023.08 BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine. arXiv
2023.09 3D-MoLM: Towards 3D Molecule-Text Interpretation in Language Models. ICLR 2024
2023.11 InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery. arXiv
2023.12 MoleculeGPT: Instruction Following Large Language Models for Molecular Property Prediction. NeurIPS Workshop
2024.01 MolTC: Towards Molecular Relational Modeling In Language Models. ACL 2024
2024.01 ReactXT: Understanding Molecular “Reaction-ship” via Reaction-Contextualized Molecule-Text Pretraining. ACL 2024
2024.03 GIT-Mol: A multi-modal large language model for molecular science with graph, image, and text. arXiv
2024.06 HIGHT: Hierarchical Graph Tokenization for Graph-Language Alignment. arXiv
2024.06 3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization. arXiv
2024.06 MolX: Enhancing Large Language Models for Molecular Learning with A Multi-Modal Extension. arXiv
2024.07 MolLM: a unified language model for integrating biomedical text with 2D and 3D molecular representations. Bioinformatics
2024.08 UniMoT: Unified Molecule-Text Language Model with Discrete Token Representation. arXiv
2024.08 ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area. arXiv
2024.09 ChemDFM-X: Towards Large Multimodal Model for Chemistry. arXiv

2023.09 Generative Retrieval-Augmented Ontologic Graph and Multiagent Strategies for Interpretive Large Language Model-Based Materials Design. ACS Engineering Au
2023.10 Large language models for chemistry robotics. Autonomous Robots
2023.10 Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design. EMNLP 2023
2023.11 Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis. arXiv
2023.12 Autonomous chemical research with large language models. Nature
2024.01 Structured Chemistry Reasoning with Large Language Models. ICML 2024
2024.01 ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback. ICML 2024
2024.02 An Autonomous Large Language Model Agent for Chemical Literature Data Mining. arXiv
2024.03 From Artificially Real to Real: Leveraging Pseudo Data from Large Language Models for Low-Resource Molecule Discovery. AAAI 2024
2024.03 DRAK: Unlocking Molecular Insights with Domain-Specific Retrieval-Augmented Knowledge in LLMs. arXiv
2024.04 Integrating Chemistry Knowledge in Large Language Models via Prompt Engineering. arXiv
2024.04 Large Language Models are In-Context Molecule Learners. arXiv
2024.04 A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions. arXiv
2024.04 Large Language Models Open New Way of AI-Assisted Molecule Design for Chemists. ChemRxiv
2024.05 Augmenting large language models with chemistry tools. Nature Machine Intelligence
2024.05 ChatMOF: an artificial intelligence system for predicting and generating metal-organic frameworks using large language models. Nature Communications
2024.06 LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation. arXiv

2017.09 Crowdsourcing multiple choice science questions. ACL Workshop
2020.09 ChemistryQA: A Complex Question Answering Dataset from Chemistry. OpenReview
2023.01 Assessment of chemistry knowledge in large language models that generate code. Digital Discovery
2023.03 Do Large Language Models Understand Chemistry? A Conversation with ChatGPT. Journal of Chemical Information and Modeling
2023.06 Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective. TKDE
2023.07 Can Large Language Models Empower Molecular Property Prediction? arXiv
2023.10 ReLM: Leveraging Language Models for Enhanced Chemical Reaction Prediction. arXiv
2023.10 GPT-MolBERTa: GPT Molecular Features Language Model for molecular property. arXiv
2023.12 What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks. NeurIPS 2023
2023.12 SciMT-Safety: Control Risk for Potential Misuse of Artificial Intelligence in Science. arXiv
2024.01 SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research. AAAI 2024
2024.01 SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis. arXiv
2024.02 Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science. arXiv
2024.02 Building a Dataset for Language+Molecules. arXiv
2024.03 Benchmarking Large Language Models for Molecule Prediction Tasks. arXiv
2024.03 MoleculeQA: A Dataset to Evaluate Factual Accuracy in Molecular Comprehension. arXiv
2024.02 SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models. arXiv
2024.04 Are large language models superhuman chemists? arXiv
2024.06 SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models. arXiv
2024.07 ScholarChemQA: Unveiling the Power of Language Models in Chemical Research Question Answering. arXiv
2024.09 VisScience: An Extensive Benchmark for Evaluating K12 Educational Multi-modal Scientific Reasoning. arXiv
2024.09 ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models. arXiv
2024.10 Unraveling Molecular Structure: A Multimodal Spectroscopic Dataset for Chemistry. NeurIPS 2024
2024.10 MassSpecGym: A benchmark for the discovery and identification of molecules. NeurIPS 2024
2024.10 Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation. NeurIPS 2024
2024.10 ∇²DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials. NeurIPS 2024

2023.04 A Systematic Survey of Chemical Pre-trained Models. IJCAI 2023
2023.09 Large Language Models in Molecular Discovery. NeurIPS 2023 Workshop
2024.01 Scientific Large Language Models: A Survey on Biological & Chemical Domains. arXiv
2024.01 From Words to Molecules: A Survey of Large Language Models in Chemistry. IJCAI 2024
2024.03 Bridging Text and Molecule: A Survey on Multimodal Frameworks for Molecule. arXiv
2024.03 Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey. arXiv
2024.06 A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery. arXiv
2024.07 A Review of Large Language Models and Autonomous Agents in Chemistry. arXiv