Implementation of ALBERT fine-tuning for sentiment analysis of cyberbullying
- Clone the ALBERT library (https://github.com/google-research/albert)
- Download the pre-trained weights (this implementation uses the base weights)
- Prepare the data in the format ALBERT expects (CoLA-style TSV files; see the sketch below)
- Use the following command to run the fine-tuning script provided in the ALBERT library
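A minimal sketch of the data-preparation step. It assumes a hypothetical input file cyberbullying.csv with "text" and "label" columns (labels already the strings "0"/"1", CoLA's label set) and writes CoLA-style TSV files into data/CoLA/, matching the data_dir layout noted under the training command; adapt the column names and split to your dataset.

```python
# Minimal sketch: convert a labelled CSV into CoLA-style TSV files for the
# ALBERT CoLA processor. The input file "cyberbullying.csv" and its "text" /
# "label" columns are hypothetical; labels are assumed to already be the
# strings "0" / "1" (CoLA's label set).
import csv
import os
import random

os.makedirs("data/CoLA", exist_ok=True)

with open("cyberbullying.csv", newline="", encoding="utf-8") as f:
    rows = [(r["text"].replace("\t", " ").replace("\n", " "), r["label"])
            for r in csv.DictReader(f)]

random.seed(0)
random.shuffle(rows)
split = int(0.9 * len(rows))

def write_split(path, subset, test=False):
    with open(path, "w", encoding="utf-8") as f:
        if test:
            # test.tsv: header line, then index and sentence columns
            f.write("index\tsentence\n")
            for i, (text, _) in enumerate(subset):
                f.write(f"{i}\t{text}\n")
        else:
            # train.tsv / dev.tsv: no header; guid, label, placeholder, sentence
            for i, (text, label) in enumerate(subset):
                f.write(f"ex-{i}\t{label}\t*\t{text}\n")

write_split("data/CoLA/train.tsv", rows[:split])
write_split("data/CoLA/dev.tsv", rows[split:])
write_split("data/CoLA/test.tsv", rows[split:], test=True)
```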
python -m albert.run_classifier \
  --data_dir="data/" \
  --output_dir="outputs/" \
  --spm_model_file="albert_base/30k-clean.model" \
  --init_checkpoint="albert_base/model.ckpt-best" \
  --albert_config_file="albert_base/albert_config.json" \
  --do_train \
  --task_name=CoLA \
  --max_seq_length=512 \
  --optimizer=adamw \
  --warmup_step=320 \
  --learning_rate=1e-5 \
  --train_step=5336 \
  --save_checkpoints_steps=100 \
  --vocab_file="albert_base/30k-clean.vocab" \
  --train_batch_size=4
- data_dir -> Directory containing the train.tsv file (note: the .tsv files must sit in a CoLA subdirectory, since the CoLA task format is used for training)
- output_dir -> Directory where checkpoints and other outputs are written.
- spm_model_file -> The SentencePiece model file shipped with the pre-trained weights.
- init_checkpoint -> Initial checkpoint to start fine-tuning from.
- vocab_file -> The ALBERT vocabulary file.
- do_train -> Flag to start training.
- do_eval -> Flag to run evaluation at each checkpoint (a sketch for reading the results follows this list).
- train_batch_size -> Batch size used during training; reduce this value if you run into out-of-memory errors.
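If you fine-tune with --do_eval, the metrics can be inspected afterwards. The sketch below assumes the script writes a BERT-style eval_results.txt summary (one "key = value" per line) into the output directory; the exact filename is an assumption, so check what actually appears in outputs/.

```python
# Minimal sketch for checking evaluation output, assuming a BERT-style
# "eval_results.txt" (key = value per line) in the output directory.
import os

eval_file = os.path.join("outputs", "eval_results.txt")
if os.path.exists(eval_file):
    with open(eval_file) as f:
        for line in f:
            key, _, value = line.strip().partition(" = ")
            print(f"{key}: {value}")
else:
    print("No eval results yet - run with --do_eval to produce them.")
```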
- Use the following command to run predictions with the fine-tuned model (a sketch for reading the output follows)
python -m albert.run_classifier \
  --data_dir="data/" \
  --output_dir="outputs/" \
  --spm_model_file="albert_base/30k-clean.model" \
  --init_checkpoint="outputs/model.ckpt-best" \
  --albert_config_file="albert_base/albert_config.json" \
  --do_predict \
  --task_name=CoLA \
  --max_seq_length=512 \
  --vocab_file="albert_base/30k-clean.vocab"
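A minimal sketch for turning the prediction output into labels. It assumes the classifier writes a BERT-style test_results.tsv (one row of tab-separated class probabilities per test example) into the output directory; the filename, layout, and the label mapping are assumptions, so verify them against the files actually produced in outputs/.

```python
# Minimal sketch: map per-class probabilities from a BERT-style
# "test_results.tsv" back to labels. The label mapping is hypothetical.
import csv

labels = {0: "not_bullying", 1: "bullying"}  # hypothetical label mapping

with open("outputs/test_results.tsv", newline="") as f:
    for i, row in enumerate(csv.reader(f, delimiter="\t")):
        probs = [float(p) for p in row]
        pred = max(range(len(probs)), key=probs.__getitem__)
        print(f"example {i}: {labels.get(pred, pred)} (p={probs[pred]:.3f})")
```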