This repository summarizes the final group project for HKUST ELEC4230 (Deep Learning for Natural Language Processing), developed by Chia-Wei Wu and Chia-Hong Hsu. Our main reference is a group project from Stanford's CS224N (Winter 2020) NLP course.
```bash
pip install transformers torch matplotlib
```
This project explores transfer learning for NLP by using adapters to reduce the computational and storage cost of fine-tuning large language models. Instead of fine-tuning the entire model, only the adapter parameters are trained, while the pretrained model parameters remain frozen. We introduce ResAdapters, a novel approach that applies residual-network ideas to adapters, improving their effectiveness without adding extra parameters.
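The sketch below illustrates the general idea of a bottleneck adapter with a residual connection, plus freezing the pretrained backbone so that only adapter weights are trained. The class and parameter names (`ResAdapter`, `hidden_size`, `bottleneck_size`, `freeze_backbone`) are illustrative assumptions, not the project's actual implementation.

```python
import torch
import torch.nn as nn

class ResAdapter(nn.Module):
    """Bottleneck adapter with a residual (skip) connection.

    Hypothetical sketch: the projection sizes and placement inside the
    transformer layer are illustrative, not the report's exact setup.
    """
    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection: the adapter learns a small correction on top
        # of the frozen transformer's hidden states.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

def freeze_backbone(model: nn.Module, adapters: nn.ModuleList) -> None:
    """Freeze every pretrained parameter; train only the adapter weights."""
    for p in model.parameters():
        p.requires_grad = False
    for p in adapters.parameters():
        p.requires_grad = True
```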
Evaluation is conducted on the SQuAD 2.0 dataset, where the task is to read a passage and extract the answer span for a given question, or to predict that the question has no answer in the passage.
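As a quick reference, SQuAD 2.0 can be loaded with the Hugging Face `datasets` library (an extra dependency not listed in the install command above); this is only a convenience sketch, not necessarily how the project loads its data.

```python
from datasets import load_dataset

# Load SQuAD 2.0: each example has a question, a context passage, and
# either an answer span or empty answer lists (unanswerable question).
squad = load_dataset("squad_v2")
example = squad["validation"][0]
print(example["question"])
print(example["answers"])  # {'text': [...], 'answer_start': [...]}
```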
| Model | ALL EM | ALL F1 | NoAns EM | NoAns F1 | HasAns EM | HasAns F1 |
|---|---|---|---|---|---|---|
| **Non Linear** | | | | | | |
| Adapter | 30.6 | 31.6 | 54.5 | 54.5 | 4.6 | 6.6 |
| Adapter ResNet | 43.8 | 46.1 | 68.3 | 68.3 | 17.1 | 21.8 |
| **With Attention** | | | | | | |
| Attention | 33.2 | 40.0 | 56.8 | 56.8 | 7.5 | 21.8 |
| Attention ResNet | 45.0 | 49.0 | 53.5 | 53.5 | 35.7 | 43.9 |
| Attention ResNeXt | 42.6 | 51.4 | 42.8 | 42.8 | 42.3 | 60.0 |
Please see the project report here!