Breaking the Script Barrier in Multilingual Pre-Trained Language Models with Transliteration-Based Post-Training Alignment
Updated Jul 3, 2024 · Python
In this work, we apply the multilingual zero-shot transfer concept to toxic comment detection. This approach allows a model trained on a single-language dataset to work in an arbitrary language, including low-resource ones.
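The idea behind zero-shot cross-lingual transfer is that a multilingual encoder maps semantically similar sentences from different languages to nearby points in a shared vector space, so a classifier trained on one language's embeddings can score another's. The sketch below illustrates this with hand-made toy "embeddings" (no real encoder or dataset is used; all vectors and offsets are invented for illustration):

```python
# Toy illustration of zero-shot cross-lingual transfer.
# Assumption: a multilingual encoder places toxic sentences near (1, 1) and
# clean ones near (-1, -1) in a shared space, with a small language-specific
# offset. We train only on "English" vectors and test on "another language".
import math
import random

random.seed(0)

def embed(toxic: bool, lang_shift: float) -> list[float]:
    # Stand-in for a multilingual encoder (hypothetical geometry).
    base = 1.0 if toxic else -1.0
    return [base + lang_shift + random.gauss(0, 0.1),
            base - lang_shift + random.gauss(0, 0.1)]

# "English" training data only (lang_shift = 0).
train = [(embed(t, 0.0), 1 if t else 0) for t in [True, False] * 50]

# Tiny logistic-regression classifier trained with plain gradient descent.
w, b = [0.0, 0.0], 0.0
for _ in range(200):
    for x, y in train:
        p = 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
        g = p - y  # gradient of the log-loss w.r.t. the logit
        w[0] -= 0.1 * g * x[0]
        w[1] -= 0.1 * g * x[1]
        b -= 0.1 * g

def predict(x: list[float]) -> int:
    return int(w[0] * x[0] + w[1] * x[1] + b > 0)

# Zero-shot evaluation: "another language" (lang_shift = 0.3), never seen
# during training. Because the space is shared, the classifier transfers.
other = [(embed(t, 0.3), 1 if t else 0) for t in [True, False] * 50]
acc = sum(predict(x) == y for x, y in other) / len(other)
print(f"zero-shot accuracy on unseen language: {acc:.2f}")
```

In practice the shared space comes from a pre-trained multilingual model (e.g. mBERT or XLM-R) rather than hand-made vectors, but the transfer mechanism is the same: the decision boundary learned on one language remains valid wherever the embedding spaces align.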
Official implementation of "CONCRETE: Improving Cross-lingual Fact Checking with Cross-lingual Retrieval" (COLING'22)
Improving Quality of Multilingual Question Answering and Cross-Lingual Transfer using Multitask Learning, Knowledge Distillation, and Data Augmentation
Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages." Paper accepted at Findings of EMNLP 2024
Code for importance-weighted domain alignment, and the paper “Cross-Lingual Transfer with Class-Weighted Language-Invariant Representations”.
Cross-lingual language models for building search engines over the Holy Quran and Sahih Hadiths
This is a project proposal to implement Yan et al.'s (2020) mBERT-Unaligned for cross-lingual RDs with Japanese, German and Italian untranslatable terms
This repository implements cross-lingual transfer learning experiments for Named Entity Recognition (NER) between Hindi and Nepali, using pre-trained multilingual BERT models to assess how well linguistic knowledge transfers between these languages.
Cascading Adaptors to Leverage English Data to Improve Performance of Question Answering for Low-Resource Languages
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages (ACL 2024)
Author's implementation of the paper https://www.aclweb.org/anthology/2021.ltedi-1.30/
[EMNLP 2022] Discovering Language-neutral Sub-networks in Multilingual Language Models.
This repository contains the code for the experiments related to higher-level semantic tasks and related to the meta-learning from: "From Zero to Hero: On the Limitations of Zero Shot Cross-Lingual Transfer"
Implementation of ICLR 2022 paper "Enhancing Cross-lingual Transfer by Manifold Mixup".
Improving Indonesian text classification using a multilingual language model
This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs" published in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23), July 9-14, 2023.
CogNet: a large-scale, high-quality cognate database covering 338 languages, 1.07M words, and 8.1M cognates