GitHub - ByUnal/AviBERT: Domain specific (Aircraft) Bert models trained on uniquely collected data.

AviBert: on Classifying the news about Aircraft

Overview

This repository focuses on the aircraft domain. We work towards an aircraft-specific news classification model, evaluated on a multi-class development set, using BERT and its lightweight and heavyweight variants. The repository also introduces a pipeline that comprises data collection, data tagging and model training. Since both the data and the target classes were collected specifically for this study, the presented model is, to our knowledge, the first of its kind. The dataset can be examined in more detail, and the models are compared using macro-F1 and accuracy scores.
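To make the two evaluation metrics concrete, here is a minimal pure-Python sketch of accuracy and macro-F1. It is an illustration only, not the evaluation code used in this repository; the function names and the integer labels in the example are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Every class counts equally, regardless of how many samples it has,
    so rare classes influence the score as much as frequent ones.
    """
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical, imbalanced example: the classifier always predicts class 0.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0]
acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
```

In this example the accuracy is 4/6 while the macro-F1 is much lower, because the two classes that are never predicted each contribute an F1 of 0. This is why reporting both scores is useful on a multi-class set.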

Setup

Install the requirements. I've added torch to requirements.txt, but you may prefer to install it yourself to match your CUDA version and resources.

pip install -r requirements.txt

Run the Code

I've completed hyperparameter tuning using Optuna, and main.py has been set up accordingly. You can also train a standalone model by using train_loop().
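For readers unfamiliar with Optuna, the sketch below shows the general shape of a tuning run. It is not the code in main.py: the hyperparameter names, the search ranges and the `toy_validation_score` function are all assumptions made for illustration, and in practice the objective would train a model (for example via train_loop()) and return its validation score.

```python
import math


def toy_validation_score(learning_rate, batch_size, dropout):
    """Hypothetical stand-in for a real train-and-validate cycle.

    Returns a value in (0, 1] that peaks at an arbitrary setting; it only
    exists so that the example runs quickly without training a model.
    """
    penalty = (
        abs(math.log10(learning_rate) + 4.5)
        + abs(dropout - 0.1)
        + abs(batch_size - 16) / 16
    )
    return 1.0 / (1.0 + penalty)


def objective(trial):
    # Search space: names and ranges are assumptions, not the repo's values.
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-4, log=True)
    batch_size = trial.suggest_categorical("batch_size", [8, 16, 32])
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    return toy_validation_score(learning_rate, batch_size, dropout)


def run_study(n_trials=20):
    # Imported here so the functions above can be read and tested without Optuna.
    import optuna

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params, study.best_value
```

Calling `run_study()` returns the best parameter dictionary and its score. Parameters found this way are what one would typically record in a file such as results/params.txt.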

Results

The results we obtained in our experiments are as follows:

You can also see the best parameters for the models after hyperparameter optimization in results/params.txt

Some of the conclusions obtained:

In DistilBert training, the model overfits the training data, reaching an accuracy score of up to 93%, but it generalizes poorly.
The torch.clip_norm function severely degrades the model's success rate, which suggests that additional algorithms are unnecessary for BERT base models.
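As background for the second conclusion, the sketch below illustrates gradient-norm clipping in plain Python. It assumes that the note above refers to clipping by global gradient norm, which PyTorch provides as `torch.nn.utils.clip_grad_norm_`; that reading is an assumption, and the helper name `clip_by_global_norm` is hypothetical.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Rescale all gradients so their combined L2 norm is at most max_norm.

    grads is a list of gradient vectors (one list of floats per parameter).
    Gradients whose global norm is already below max_norm are left unchanged.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm


# A gradient with norm 5.0 is scaled down to a norm of about 1.0.
clipped, norm_before = clip_by_global_norm([[3.0, 4.0]], max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
```

Clipping shrinks every update whenever the gradient norm exceeds the threshold, so a threshold that is too small can slow or stall learning. That is one possible explanation for the drop reported above, not a confirmed cause.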

Acknowledgement

I've prepared a paper on this project that also covers the data collection steps. However, we're running additional novel experiments on this topic, so the paper link and details will be shared as soon as the paper is published.
