This repository contains the list of papers covered in our survey:

Multilingual Large Language Models: A Systematic Survey

Shaolin Zhu¹, Supryadi¹, Shaoyang Xu¹, Haoran Sun¹, Leiyu Pan¹, Menglong Cui¹, Jiangcun Du¹, Renren Jin¹, António Branco²†, Deyi Xiong¹†*

¹TJUNLP Lab, College of Intelligence and Computing, Tianjin University
²NLX, Department of Informatics, University of Lisbon

(*: Corresponding author, †: Advisory role)
- "CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages".
  Thuat Nguyen et al. LREC-COLING 2024. [Paper]
- "RedPajama: an Open Dataset for Training Large Language Models".
- "The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset".
- "Zyda: A 1.3T Dataset for Open Language Modeling".
- "Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning".
  Shivalika Singh et al. ACL 2024. [Paper]
- "Bactrian-X: Multilingual Replicable Instruction-Following Models with Low-Rank Adaptation".
- "CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society".
- "OpenAssistant Conversations - Democratizing Large Language Model Alignment".
- "Phoenix: Democratizing ChatGPT across Languages".
- "Crosslingual Generalization through Multitask Finetuning".
- "Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena".
- "Finetuned Language Models Are Zero-Shot Learners".
- "Multitask Prompted Training Enables Zero-Shot Task Generalization".
- "Training language models to follow instructions with human feedback".
- "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback".
  Yuntao Bai et al. arXiv 2022. [Paper]
- "A General Language Assistant as a Laboratory for Alignment".
  Amanda Askell et al. arXiv 2021. [Paper]
- "Self-Instruct: Aligning Language Models with Self-Generated Instructions".
- "WizardLM: Empowering Large Language Models to Follow Complex Instructions".
- "WizardCoder: Empowering Code Large Language Models with Evol-Instruct".
- "Self-Alignment with Instruction Backtranslation".
  Xian Li et al. ICLR 2024. [Paper]
- "Instruction Tuning With Loss Over Instructions".
- "Instruction Fine-Tuning: Does Prompt Loss Matter?".
  Mathew Huerta-Enochian et al. EMNLP 2024. [Paper]
- "Instruction Tuning for Large Language Models: A Survey".
- "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback".
  Yuntao Bai et al. arXiv 2022. [Paper]
- "Training language models to follow instructions with human feedback".
- "Fine-Tuning Language Models from Human Preferences".
- "Learning to summarize from human feedback".
- "WebGPT: Browser-assisted question-answering with human feedback".
  Reiichiro Nakano et al. arXiv 2021. [Paper]
- "Direct Preference Optimization: Your Language Model is Secretly a Reward Model".
  Rafael Rafailov et al. NeurIPS 2023. [Paper]
- "Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons".
  Ralph Allan Bradley and Milton E. Terry. Biometrika 1952. [Paper]
- "The Impact of Preference Agreement in Reinforcement Learning from Human Feedback: A Case Study in Summarization".
  Sian Gooding and Hassan Mansoor. arXiv 2023. [Paper]
- "Understanding the Effects of RLHF on LLM Generalisation and Diversity".
- "Proximal Policy Optimization Algorithms".
  John Schulman et al. arXiv 2017. [Paper]
- "A General Theoretical Paradigm to Understand Learning from Human Preferences".
  Mohammad Gheshlaghi Azar et al. AISTATS 2024. [Paper]
- "Preference Ranking Optimization for Human Alignment".
- "RRHF: Rank Responses to Align Language Models with Human Feedback without tears".
- "KTO: Model Alignment as Prospect Theoretic Optimization".
  Kawin Ethayarajh et al. ICML 2024. [Paper]
- "SLiC-HF: Sequence Likelihood Calibration with Human Feedback".
  Yao Zhao et al. arXiv 2023. [Paper]
- "β-DPO: Direct Preference Optimization with Dynamic β".
- "SimPO: Simple Preference Optimization with a Reference-Free Reward".
- "Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint".
  Wei Xiong et al. ICML 2024. [Paper]
- "Crosslingual Generalization through Multitask Finetuning".
- "Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation".
- "Phoenix: Democratizing ChatGPT across Languages".
- "PolyLM: An Open Source Polyglot Large Language Model".
- "SeaLLMs -- Large Language Models for Southeast Asia".
  Xuan-Phi Nguyen et al. ACL 2024 Demo Track. [Paper] [GitHub]
- "Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model".
  Ahmet Üstün et al. arXiv 2024. [Paper] [Huggingface]
- "Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback".
- "Multilingual Instruction Tuning With Just a Pinch of Multilinguality".
  Uri Shaham et al. ACL (Findings) 2024. [Paper]
- "Zero-shot cross-lingual transfer in instruction tuning of large language models".
  Nadezhda Chirkova et al. INLG 2024. [Paper]
- "Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca".
- "Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?".
- "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?".
- "Lucky 52: How Many Languages Are Needed to Instruction Fine-Tune Large Language Models?".
  Shaoxiong Ji et al. arXiv 2024. [Paper] [Huggingface]
- "Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment".
  Zhaofeng Wu et al. EMNLP 2024. [Paper]
- "The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts".
- "Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations".
- "Extrapolating Large Language Models to Non-English by Aligning Languages".
  Wenhao Zhu et al. arXiv 2023. [Paper]
- "The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights".
  Wenhao Zhu et al. arXiv 2024. [Paper]
- "Question Translation Training for Better Multilingual Reasoning".
- "BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models".
- "InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning".
  Samuel Cahyawijaya et al. SEALP 2023. [Paper]
- "xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning".
  Linzheng Chai et al. arXiv 2024. [Paper]
- "PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning".
- "TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes".
- "MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization".
- "Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca".
- "Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models".
  Seungduk Kim et al. arXiv 2024. [Paper]
- "Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities".
  Kazuki Fujii et al. COLM 2024. [Paper]
- "MaLA-500: Massive Language Adaptation of Large Language Models".
  Peiqin Lin et al. arXiv 2024. [Paper]
- "SeaLLMs -- Large Language Models for Southeast Asia".
  Xuan-Phi Nguyen et al. ACL 2024 Demo Track. [Paper] [GitHub]
- "LangBridge: Multilingual Reasoning Without Multilingual Supervision".
- "RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization".
- "BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting".
- "LLaMA Beyond English: An Empirical Study on Language Capability Transfer".
  Jun Zhao et al. arXiv 2024. [Paper]
- "Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral".
- "Towards Robust In-Context Learning for Machine Translation with Large Language Models".
  Shaolin Zhu et al. LREC 2024. [Paper]
- "LANDeRMT: Detecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation".
  Shaolin Zhu et al. ACL 2024. [Paper]
- "FEDS-ICL: Enhancing translation ability and efficiency of large language model by optimizing demonstration selection".
  Shaolin Zhu et al. Information Processing & Management 2024. [Paper]
- "Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning".
  Menglong Cui et al. ACL (Findings) 2024. [Paper]
- "DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms".
- "Paying More Attention to Source Context: Mitigating Unfaithful Translations from Large Language Model".
- "BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages".
- "A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models".
- "Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions".
  Jiahuan Li et al. TACL 2024. [Paper]
- "Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice?".
- "Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation".
- "Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution".
  Nuo Xu et al. arXiv 2024. [Paper]
- "Word Alignment as Preference for Machine Translation".
  Qiyu Wu et al. EMNLP 2024. [Paper]
- "Teaching Large Language Models to Translate with Comparison".
- "Improving Translation Faithfulness of Large Language Models via Augmenting Instructions".
- "Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models".
- "Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages".
  Zhuoyuan Mao et al. LoResMT 2024. [Paper]
- "Relay Decoding: Concatenating Large Language Models for Machine Translation".
  Chengpeng Fu et al. arXiv 2024. [Paper]
- "m3P: Towards Multimodal Multilingual Translation with Multimodal Prompt".
  Jian Yang et al. LREC 2024. [Paper] [Huggingface]
- "CultureLLM: Incorporating Cultural Differences into Large Language Models".
- "CulturePark: Boosting Cross-cultural Understanding in Large Language Models".
- "Self-Pluralising Culture Alignment for Large Language Models".
- "Global Gallery: The Fine Art of Painting Culture Portraits through Multilingual Instruction Tuning".
- "The Echoes of Multilinguality: Tracing Cultural Value Shifts during LM Fine-tuning".
  Rochelle Choenni et al. ACL 2024. [Paper]
- "CRAFT: Extracting and Tuning Cultural Instructions from the Wild".
- "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models".
  Phillip Rust and Jonas Pfeiffer et al. ACL-IJCNLP 2021. [Paper] [GitHub]
- "ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models".
  Linting Xue, Aditya Barua, Noah Constant, and Rami Al-Rfou et al. TACL 2022. [Paper] [GitHub]
- "Language Model Tokenizers Introduce Unfairness Between Languages".
- "Tokenizer Choice For LLM Training: Negligible or Crucial?".
  Mehdi Ali, Michael Fromm, and Klaudia Thellmann et al. NAACL (Findings) 2024. [Paper]
- "MEGA: Multilingual Evaluation of Generative AI".
- "MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks".
  Sanchit Ahuja et al. arXiv 2024. [Paper]
- "ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning".
  Viet Dac Lai, Nghia Trung Ngo, and Amir Pouran Ben Veyseh et al. EMNLP (Findings) 2023. [Paper]
- "Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM".
- "Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis".
- "M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models".
- "Evaluating the Elementary Multilingual Capabilities of Large Language Models with MULTIQ".
  Carolin Holtermann and Paul Röttger et al. ACL (Findings) 2024. [Paper] [GitHub]
- "SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation".
- "xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark".
- "MEEP: Is this Engaging? Prompting Large Language Models for Dialogue Evaluation in Multilingual Settings".
- "Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language we Prompt them in".
  Utkarsh Agarwal, Kumar Tanmay, and Aditi Khandelwal et al. LREC-COLING 2024. [Paper]
- "RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?".
- "PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models".
  Devansh Jain and Priyanshu Kumar et al. COLM 2024. [Paper] [GitHub]
- "On Evaluating and Mitigating Gender Biases in Multilingual Settings".
  Aniket Vashishtha and Kabir Ahuja et al. ACL (Findings) 2023. [Paper] [GitHub]
- "All Languages Matter: On the Multilingual Safety of LLMs".
- "Low-Resource Languages Jailbreak GPT-4".
  Zheng-Xin Yong et al. NeurIPS (Workshop) 2023. [Paper]
- "Multilingual Jailbreak Challenges in Large Language Models".
- "A Cross-Language Investigation into Jailbreak Attacks in Large Language Models".
  Jie Li et al. arXiv 2024. [Paper]
- "How Vocabulary Sharing Facilitates Multilingualism in LLaMA?".
- "Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?".
- "METAL: Towards Multilingual Meta-Evaluation".
  Rishav Hada and Varun Gumma et al. NAACL (Findings) 2024. [Paper] [GitHub]
- "How do Large Language Models Handle Multilingualism?".
  Zhao Y, Zhang W, Chen G, et al. arXiv 2024. [Paper]
- "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".
  Wendler C, Veselovsky V, Monea G, et al. ACL 2024. [Paper]
- "Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models".
  Blevins T, Gonen H, Zettlemoyer L. EMNLP 2022. [Paper]
- "Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks".
  Bhattacharya S, Bojar O. BlackboxNLP 2023. [Paper]
- "Unveiling Linguistic Regions in Large Language Models".
  Zhang Z, Zhao J, Zhang Q, et al. ACL 2024. [Paper]
- "Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models".
  Tang T, Luo W, Huang H, et al. ACL 2024. [Paper]
- "Unraveling Babel: Exploring Multilingual Activation Patterns of LLMs and Their Applications".
  Liu W, Xu Y, Xu H, et al. EMNLP 2024. [Paper]
- "On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons".
  Kojima T, Okimura I, Iwasawa Y, et al. NAACL 2024. [Paper]
- "The Geometry of Multilingual Language Model Representations".
  Chang T, Tu Z, Bergen B. EMNLP 2022. [Paper]
- "Language-agnostic Representation from Multilingual Sentence Encoders for Cross-lingual Similarity Estimation".
  Tiyajamorn N, Kajiwara T, Arase Y, et al. EMNLP 2021. [Paper]
- "An Isotropy Analysis in the Multilingual BERT Embedding Space".
  Rajaee S, Pilehvar M T. ACL 2022. [Paper]
- "Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations".
  Xie Z, Zhao H, Yu T, et al. EMNLP 2022. [Paper]
- "Emerging Cross-lingual Structure in Pretrained Language Models".
  Conneau A, Wu S, Li H, et al. ACL 2020. [Paper]
- "Probing LLMs for Joint Encoding of Linguistic Categories".
  Starace G, Papakostas K, Choenni R, et al. EMNLP 2023. [Paper]
- "Morph Call: Probing Morphosyntactic Content of Multilingual Transformers".
  Mikhailov V, Serikov O, Artemova E. SIGTYP 2021. [Paper]
- "Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models".
  Stanczak K, Ponti E, Hennigen L T, et al. NAACL 2022. [Paper]
- "Probing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders".
  Vulić I, Glavaš G, Liu F, et al. EACL 2023. [Paper]
- "The Emergence of Semantic Units in Massively Multilingual Models".
  de Varda A G, Marelli M. LREC-COLING 2024. [Paper]
- "X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models".
  Jiang Z, Anastasopoulos A, Araki J, et al. EMNLP 2020. [Paper]
- "Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models".
  Kassner N, Dufter P, Schütze H. EACL 2021. [Paper]
- "Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models".
  Qi J, Fernández R, Bisazza A. EMNLP 2023. [Paper]
- "Language Representation Projection: Can We Transfer Factual Knowledge across Languages in Multilingual Language Models?".
  Xu S, Li J, Xiong D. EMNLP 2023. [Paper]
- "Are Structural Concepts Universal in Transformer Language Models? Towards Interpretable Cross-Lingual Generalization".
  Xu N, Zhang Q, Ye J, et al. EMNLP 2023. [Paper]
- "When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer".
  Deshpande A, Talukdar P, Narasimhan K. NAACL 2022. [Paper]
- "Emerging Cross-lingual Structure in Pretrained Language Models".
  Conneau A, Wu S, Li H, et al. ACL 2020. [Paper]
- "Cross-Lingual Ability of Multilingual BERT: An Empirical Study".
  Karthikeyan K, Wang Z, Mayhew S, et al. ICLR 2020. [Paper]
- "Unveiling Linguistic Regions in Large Language Models".
  Zhang Z, Zhao J, Zhang Q, et al. ACL 2024. [Paper]
- "Unraveling Babel: Exploring Multilingual Activation Patterns of LLMs and Their Applications".
  Liu W, Xu Y, Xu H, et al. EMNLP 2024. [Paper]
- "The Geometry of Multilingual Language Model Representations".
  Chang T, Tu Z, Bergen B. EMNLP 2022. [Paper]
- "How do Large Language Models Handle Multilingualism?".
  Zhao Y, Zhang W, Chen G, et al. arXiv 2024. [Paper]
- "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".
  Wendler C, Veselovsky V, Monea G, et al. ACL 2024. [Paper]
- "BioBERT: a pre-trained biomedical language representation model for biomedical text mining".
- "DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome".
- "DNABERT-2: efficient foundation model and benchmark for multi-species genome".
- "MING-MOE: enhancing medical multi-task learning in large language models with sparse mixture of low-rank adapter experts".
- "DoctorGLM: fine-tuning your Chinese doctor is not a herculean task".
- "HuatuoGPT, towards taming language model to be a doctor".
- "MedGPT: medical concept prediction from clinical narratives".
  Zeljko Kraljevic et al. arXiv 2021. [Paper]
- "ClinicalGPT: large language models finetuned with diverse medical data and comprehensive evaluation".
  Guangyu Wang et al. arXiv 2023. [Paper]
- "IvyGPT: interactive Chinese pathway language model in medical domain".
- "BianQue: balancing the questioning and suggestion ability of health LLMs with multi-turn health conversations polished by ChatGPT".
- "SoulChat: improving LLMs' empathy, listening, and comfort abilities through fine-tuning with multi-turn empathy conversations".
- "Towards expert-level medical question answering with large language models".
- "ChatDoctor: a medical chat model fine-tuned on LLaMA model using medical domain knowledge".
  Yunxiang Li et al. arXiv 2023. [Paper] [GitHub]
- "CodeBERT: a pre-trained model for programming and natural languages".
- "Learning and evaluating contextual embedding of source code".
- "Unified pre-training for program understanding and generation".
- "CodeT5: identifier-aware unified pre-trained encoder-decoder models for code understanding and generation".
- "CodeT5+: open code large language models for code understanding and generation".
- "Competition-level code generation with AlphaCode".
- "Evaluating large language models trained on code".
- "A systematic evaluation of large language models of code".
- "CodeGen: an open large language model for code with multi-turn program synthesis".
- "A generative model for code infilling and synthesis".
- "Code Llama: open foundation models for code".
- "StarCoder: may the source be with you!".
- "CodeGeeX: a pretrained model for code generation with multilingual benchmarking on HumanEval-X".
- "CodeShell technical report".
- "CodeGemma: Open Code Models Based on Gemma".
- "Qwen2.5-Coder Technical Report".
- "Tree-based representation and generation of natural and mathematical language".
- "ChatGLM-Math: improving math problem-solving in large language models with a self-critique pipeline".
- "DeepSeekMath: pushing the limits of mathematical reasoning in open language models".
- "MetaMath: bootstrap your own mathematical questions for large language models".
- "MAmmoTH: building math generalist models through hybrid instruction tuning".
- "WizardMath: empowering mathematical reasoning for large language models via reinforced Evol-Instruct".
- "Generative AI for math: Abel".
  Ethan Chern et al. GitHub 2023. [GitHub]
- "Orca-Math: unlocking the potential of SLMs in grade school math".
- "LEGAL-BERT: the muppets straight out of law school".
- "Lawformer: a pre-trained language model for Chinese legal long documents".
- "A brief report on LawGPT 1.0: a virtual legal assistant based on GPT-3".
  Ha-Thanh Nguyen. arXiv 2023. [Paper]
- "DISC-LawLLM: fine-tuning large language models for intelligent legal services".
- "ChatLaw: open-source legal large language model with integrated external knowledge bases".
- "SAILER: structure-aware pre-trained language model for legal case retrieval".
- "Lawyer LLaMA technical report".
- "Legal-Relectra: mixed-domain language modeling for long-range legal text comprehension".
  Wenyue Hua et al. arXiv 2022. [Paper]