The code in this repo is meant to be run in the order indicated by the filename prefixes. The individual steps are described below:
1-Preprocessing loads the data from Hugging Face and preprocesses it for the purposes of this thesis. The resulting datasets are stored.
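A minimal sketch of this step, assuming a placeholder dataset name and output path (the actual source corpora and preprocessing live in the notebook):

```python
from datasets import load_dataset

# Placeholder dataset name; the notebook loads the actual source data used in the thesis.
raw = load_dataset("ag_news")

# Example preprocessing: keep the text field in a clean form.
def clean(example):
    example["text"] = example["text"].strip()
    return example

processed = raw.map(clean)
processed.save_to_disk("data/preprocessed")  # output path is an assumption
```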
2-Selection selects the relevant data points for the subsequent pipeline steps from the previously stored data and stores the new datasets as train/validation/test splits.
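A sketch of the selection and splitting, with a placeholder filter and paths standing in for the thesis-specific criteria:

```python
from datasets import load_from_disk, DatasetDict

data = load_from_disk("data/preprocessed")["train"]

# Placeholder selection criterion; the notebook applies the thesis-specific filters.
selected = data.filter(lambda ex: len(ex["text"]) > 0)

# Two-stage split into train / validation / test.
split = selected.train_test_split(test_size=0.2, seed=42)
val_test = split["test"].train_test_split(test_size=0.5, seed=42)

final = DatasetDict({
    "train": split["train"],
    "validation": val_test["train"],
    "test": val_test["test"],
})
final.save_to_disk("data/selected")
```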
3-News_training and 3-Twitter_training train Phi-2 on the two finalized datasets. The models and the training logs are stored.
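A simplified fine-tuning sketch using the standard transformers Trainer; the notebooks may differ (e.g. quantization or adapter settings), and all paths and hyperparameters here are assumptions:

```python
from datasets import load_from_disk
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

data = load_from_disk("data/selected")  # path is an assumption

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = data.map(tokenize, batched=True, remove_columns=data["train"].column_names)

args = TrainingArguments(
    output_dir="models/phi2-finetuned",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    logging_dir="logs",
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("models/phi2-finetuned")
```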
4-Logging visualizes the stored training logs for the models.
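One way to plot the stored logs, assuming a `trainer_state.json` written by the Trainer; the file path and output location are placeholders:

```python
import json
import matplotlib.pyplot as plt

# The Trainer records its logged history in trainer_state.json; the path is an assumption.
with open("models/phi2-finetuned/trainer_state.json") as f:
    state = json.load(f)

steps = [e["step"] for e in state["log_history"] if "loss" in e]
losses = [e["loss"] for e in state["log_history"] if "loss" in e]

plt.plot(steps, losses)
plt.xlabel("step")
plt.ylabel("training loss")
plt.title("Phi-2 fine-tuning loss")
plt.savefig("logs/training_loss.png")
```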
5-GPT_test, 5-Phi_test, and 5-Phi_q_test generate the results with the GPT model and the trained Phi-2 models. The results are stored.
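A generation sketch for the fine-tuned Phi-2 model (the GPT notebook calls an external API instead, which is omitted here); paths, the sample size, and the generation settings are assumptions:

```python
from datasets import load_from_disk
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("models/phi2-finetuned")
tokenizer = AutoTokenizer.from_pretrained("models/phi2-finetuned")

test = load_from_disk("data/selected")["test"]

results = []
for example in test.select(range(10)):  # small sample for illustration
    inputs = tokenizer(example["text"], return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=64)
    results.append(tokenizer.decode(output[0], skip_special_tokens=True))
```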
6-Scoring computes the metrics for the generated responses; the confusion matrices are stored.
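A scoring sketch with scikit-learn, using dummy labels in place of the labels parsed from the stored responses:

```python
import os
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, confusion_matrix, ConfusionMatrixDisplay

# Dummy labels; the notebook extracts predicted labels from the generated responses.
y_true = ["positive", "negative", "positive"]
y_pred = ["positive", "positive", "positive"]

print("accuracy:", accuracy_score(y_true, y_pred))

cm = confusion_matrix(y_true, y_pred, labels=["positive", "negative"])
ConfusionMatrixDisplay(cm, display_labels=["positive", "negative"]).plot()

os.makedirs("results", exist_ok=True)  # output location is an assumption
plt.savefig("results/confusion_matrix.png")
```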
7-Data_upload_to_HF and 7-Model_upload_to_HF upload the final versions of the data and models to Hugging Face.
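An upload sketch; the repository names are placeholders and pushing requires prior authentication (e.g. `huggingface-cli login`):

```python
from datasets import load_from_disk
from transformers import AutoModelForCausalLM, AutoTokenizer

# Push the final datasets; the repo id is a placeholder.
data = load_from_disk("data/selected")
data.push_to_hub("your-username/thesis-dataset")

# Push the fine-tuned model and its tokenizer; the repo id is a placeholder.
model = AutoModelForCausalLM.from_pretrained("models/phi2-finetuned")
tokenizer = AutoTokenizer.from_pretrained("models/phi2-finetuned")
model.push_to_hub("your-username/phi2-finetuned")
tokenizer.push_to_hub("your-username/phi2-finetuned")
```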