hatanp/finngpt2

950M parameter Finnish GPT-2 model pre-trained and evaluated as a Master's thesis project

A collection of scripts that I used to pre-train a GPT-2 model for Finnish. The total parameter count is around 1.2B, and the model should fit into 24 GB of VRAM with a batch size of 4 in this configuration. However, I found that I needed a second GPU to drive the display (for example to watch YouTube while training), so I wouldn't recommend this exact configuration on a single RTX 3090 on Windows. The repository also contains an example classification finetuning task.
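
For orientation, here is a rough sketch of what a GPT-2 configuration in this size range looks like with Hugging Face transformers. The layer and embedding sizes below are illustrative guesses chosen to land near the stated parameter count, not the actual hyperparameters used in train.py.

```python
# Illustrative only: hyperparameters are assumptions, not the repo's real settings.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=50257,   # GPT-2's default vocab size; the Finnish BPE vocab may differ
    n_positions=1024,   # context length
    n_embd=1536,        # hidden size
    n_layer=40,         # transformer blocks
    n_head=16,          # attention heads (1536 / 16 = 96 per head)
)
model = GPT2LMHeadModel(config)
print(f"{model.num_parameters() / 1e6:.0f}M parameters")
```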

The repository contains preprocessing, pre-training, and a finetuning example. The pre-training data is preprocessed with the preprocess_* files, tokenization is handled by train_bpe_tokenizer_linux2_fix.py, pre-training is done with train.py, and the files required for finetuning are the *yle.py files.
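
As a reference for the tokenization step, the sketch below shows how a byte-level BPE tokenizer can be trained with the Hugging Face tokenizers library. The file paths, vocabulary size, and special tokens are assumptions for illustration and are not taken from train_bpe_tokenizer_linux2_fix.py.

```python
# A minimal sketch, assuming the Hugging Face "tokenizers" library;
# paths and hyperparameters below are hypothetical.
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()

# Train on plain-text files produced by the preprocessing step.
tokenizer.train(
    files=["data/finnish_corpus.txt"],   # hypothetical output of preprocess_* scripts
    vocab_size=50257,                    # GPT-2's default vocabulary size
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)

# Save vocab.json and merges.txt for use during pre-training.
tokenizer.save_model("tokenizer")
```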
