Releases: foundation-model-stack/fms-hf-tuning
v0.1.0 - First release
Summary of Changes
- Supported and validated tuning technique: full fine-tuning on single-GPU and multi-GPU setups
- Multi-GPU training via the Hugging Face accelerate library, focused on FSDP
- Experimental tuning techniques:
  - Single-GPU prompt tuning
  - Single-GPU LoRA tuning
- Scripts to allow local inference and evaluation of tuned models
- Build scripts for containerization of library
- Initial trainer controller framework for controlling the trainer loop using user-defined rules and metrics
Pip package: `pip install fms-hf-tuning==0.1.0`
What's Changed
- Init by @raghukiran1224 in #1
- allows disable flash attn and torch dtype param by @Ssukriti in #2
- First refactor train by @Ssukriti in #3
- fix : the way args are passed by @Ssukriti in #10
- fix full param tuning by @lchu-ibm in #14
- fix import of aim_loader by @anhuong in #13
- fix: set model max length to either passed in or tokenizer value by @anhuong in #17
- fix: do not set model max length when loading model by @anhuong in #21
- add EOS token to dataset by @Ssukriti in #15
- Local inference by @alex-jw-brooks in #27
- feat: add validation dataset to train by @anhuong in #26
- feat: support str in target_modules for LoraConfig by @VassilisVassiliadis in #39
- Add formatting tools by @hickeyma in #31
- Enable code formatting by @hickeyma in #40
- Enable daily dependabot updates by @hickeyma in #41
- Add file logger callback & export train loss json file by @alex-jw-brooks in #22
- Merge models by @alex-jw-brooks in #32
- Local inference merged models by @alex-jw-brooks in #43
- feat: track validation loss in logs file by @anhuong in #51
- Add linting capability by @hickeyma in #52
- Add PR/Issue templates by @tedhtchang in #65
- Add sample unit tests by @tedhtchang in #61
- Initial commit for trainer image by @tharapalanivel in #69
- Adding copyright notices by @tharapalanivel in #77
- Enable pylint in the github workflow by @tedhtchang in #63
- Bump aim from 3.17.5 to 3.18.1 by @dependabot in #42
- Add Contributing file by @jbusche in #58
- docs: lora and getting modules list by @anhuong in #46
- Allow SFT_TRAINER_CONFIG_JSON_ENV_VAR to be encoded json string by @kellyaa in #82
- Document lint by @tedhtchang in #84
- Let Huggingface Properly Initialize Arguments, and Fix FSDP-LORA Checkpoint-Saves and Resumption by @fabianlim in #53
- Unit tests by @tharapalanivel in #83
- Update CONTRIBUTING.md by @Ssukriti in #86
- Update input args to max_seq_length and training_data_path by @anhuong in #94
- feat: move to accelerate launch for distributed training by @kmehant in #92
- Update README.md by @Ssukriti in #95
- Modify copyright notice by @tharapalanivel in #96
- Switches dependencies from txt file to toml file by @jbusche in #68
- fix: use attn_implementation="flash_attention_2" by @kmehant in #101
- fix: not passing PEFT argument should default to full parameter finetuning by @kmehant in #100
- feat: update launch training with accelerate for multi-gpu by @anhuong in #98
- Setting default values in training job config by @tharapalanivel in #104
- add refactored build utils into docker image by @anhuong in #108
- feat: combine train and eval loss into one file by @anhuong in #109
- docs: add note on ephemeral storage by @anhuong in #106
- Move accelerate launch args parsing by @tharapalanivel in #107
- Docs improvements by @Ssukriti in #111
- feat: add env var SET_NUM_PROCESSES_TO_NUM_GPUS by @anhuong in #110
- feat: Trainer controller framework by @seshapad in #45
- Copying logs file by @tharapalanivel in #113
- Fix copying over logs by @tharapalanivel in #114
- Add eval script by @alex-jw-brooks in #102
- Lint tests by @tharapalanivel in #112
- Move sklearn to optional, install optionals for linting by @alex-jw-brooks in #117
- Build Wheel Action by @jbusche in #105
- rstrip eos in evaluation by @alex-jw-brooks in #121
- Fix eos token suffix removal by @alex-jw-brooks in #125
- Make use of instruction field optional by @alex-jw-brooks in #123
- Deprecating the requirements.txt for dependencies management by @tedhtchang in #116
- Add unit tests for various edge cases by @alex-jw-brooks in #97
- fix typo in build gha by @jbusche in #138
- Install whl in Dockerfile by @tedhtchang in #126
- feat: add flash attn to inference and eval scripts by @anhuong in #132
- OS update in dockerfile by @jbusche in #127
- fix: ignore the build output and auto-generated files by @HarikrishnanBalagopal in #140
- Propose ADR for Training Acceleration by @fabianlim in #119
- feat: new format for the controller metrics and operations by @HarikrishnanBalagopal in #130
- adr: Format change to the trainer controller configuration by @seshapad in #128
- Generic tracker API and implementation of Aimstack tracker by @dushyantbehl in #89
- fix: Allow makefile to run test independent of fmt/lint by @dushyantbehl in #145
- feat: Trainer state as a trainer controller metric by @seshapad in #150
- Bump aim from 3.18.1 to 3.19.0 by @dependabot in #93
- fix: launch_training.py arguments with new tracker api by @dushyantbehl in #153
- feat: Exposed the evaluation metrics for rules within trainer controller by @seshapad in #146
- Comment out aim in dockerfile by @jbusche in #155
- fix: replace eval with a safer alternative by @HarikrishnanBalagopal in #147
- docs: ADR for moving from `eval` to `simpleeval` for evaluating trainer controller rules by @HarikrishnanBalagopal in #151
- Add exception catching / writing to termination log by @kellyaa in #149
- fix: merging of model for multi-gpu by @anhuong in #158
- add .complete file to output dir when done by @kellyaa in #159
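PR #43 in the list above lets the trainer pick up its entire job configuration from the SFT_TRAINER_CONFIG_JSON_ENV_VAR environment variable as an encoded JSON string. A minimal sketch of how a caller might populate that variable, assuming base64 encoding; the config field names shown are illustrative assumptions, not the library's confirmed schema (see the repo README for the supported keys):

```shell
# Illustrative job config; field names are assumptions, not the official schema.
CONFIG='{"model_name_or_path":"bigscience/bloom-560m","training_data_path":"/data/train.jsonl","output_dir":"/tmp/tuned"}'

# Encode the JSON (base64 is assumed here) and export it for the trainer.
SFT_TRAINER_CONFIG_JSON_ENV_VAR=$(printf '%s' "$CONFIG" | base64 | tr -d '\n')
export SFT_TRAINER_CONFIG_JSON_ENV_VAR

# Round trip: decoding the variable recovers the original JSON string.
DECODED=$(printf '%s' "$SFT_TRAINER_CONFIG_JSON_ENV_VAR" | base64 -d)
echo "$DECODED"
```

Passing the config through a single environment variable is convenient for containerized runs (see the build scripts noted in the release summary), since no config file needs to be mounted into the image.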
v0.1.0-rc.1
What's Changed
- fix: replace eval with a safer alternative by @HarikrishnanBalagopal in #147
- docs: ADR for moving from `eval` to `simpleeval` for evaluating trainer controller rules by @HarikrishnanBalagopal in #151
- Add exception catching / writing to termination log by @kellyaa in #149
- fix: merging of model for multi-gpu by @anhuong in #158
- add .complete file to output dir when done by @kellyaa in #159
Full Changelog: v0.0.2rc2...v0.1.0-rc.1
v0.0.2rc2
What's Changed
- fix typo in build gha by @jbusche in #138
- Install whl in Dockerfile by @tedhtchang in #126
- feat: add flash attn to inference and eval scripts by @anhuong in #132
- OS update in dockerfile by @jbusche in #127
- fix: ignore the build output and auto-generated files by @HarikrishnanBalagopal in #140
- Propose ADR for Training Acceleration by @fabianlim in #119
- feat: new format for the controller metrics and operations by @HarikrishnanBalagopal in #130
- adr: Format change to the trainer controller configuration by @seshapad in #128
- Generic tracker API and implementation of Aimstack tracker by @dushyantbehl in #89
- fix: Allow makefile to run test independent of fmt/lint by @dushyantbehl in #145
- feat: Trainer state as a trainer controller metric by @seshapad in #150
- Bump aim from 3.18.1 to 3.19.0 by @dependabot in #93
- fix: launch_training.py arguments with new tracker api by @dushyantbehl in #153
- feat: Exposed the evaluation metrics for rules within trainer controller by @seshapad in #146
- Comment out aim in dockerfile by @jbusche in #155
New Contributors
- @HarikrishnanBalagopal made their first contribution in #140
- @dushyantbehl made their first contribution in #89
Full Changelog: v0.0.2rc1...v0.0.2rc2
v0.0.2rc1
What's Changed
- Init by @raghukiran1224 in #1
- allows disable flash attn and torch dtype param by @Ssukriti in #2
- First refactor train by @Ssukriti in #3
- fix : the way args are passed by @Ssukriti in #10
- fix full param tuning by @lchu-ibm in #14
- fix import of aim_loader by @anhuong in #13
- fix: set model max length to either passed in or tokenizer value by @anhuong in #17
- fix: do not set model max length when loading model by @anhuong in #21
- add EOS token to dataset by @Ssukriti in #15
- Local inference by @alex-jw-brooks in #27
- feat: add validation dataset to train by @anhuong in #26
- feat: support str in target_modules for LoraConfig by @VassilisVassiliadis in #39
- Add formatting tools by @hickeyma in #31
- Enable code formatting by @hickeyma in #40
- Enable daily dependabot updates by @hickeyma in #41
- Add file logger callback & export train loss json file by @alex-jw-brooks in #22
- Merge models by @alex-jw-brooks in #32
- Local inference merged models by @alex-jw-brooks in #43
- feat: track validation loss in logs file by @anhuong in #51
- Add linting capability by @hickeyma in #52
- Add PR/Issue templates by @tedhtchang in #65
- Add sample unit tests by @tedhtchang in #61
- Initial commit for trainer image by @tharapalanivel in #69
- Adding copyright notices by @tharapalanivel in #77
- Enable pylint in the github workflow by @tedhtchang in #63
- Bump aim from 3.17.5 to 3.18.1 by @dependabot in #42
- Add Contributing file by @jbusche in #58
- docs: lora and getting modules list by @anhuong in #46
- Allow SFT_TRAINER_CONFIG_JSON_ENV_VAR to be encoded json string by @kellyaa in #82
- Document lint by @tedhtchang in #84
- Let Huggingface Properly Initialize Arguments, and Fix FSDP-LORA Checkpoint-Saves and Resumption by @fabianlim in #53
- Unit tests by @tharapalanivel in #83
- Update CONTRIBUTING.md by @Ssukriti in #86
- Update input args to max_seq_length and training_data_path by @anhuong in #94
- feat: move to accelerate launch for distributed training by @kmehant in #92
- Update README.md by @Ssukriti in #95
- Modify copyright notice by @tharapalanivel in #96
- Switches dependencies from txt file to toml file by @jbusche in #68
- fix: use attn_implementation="flash_attention_2" by @kmehant in #101
- fix: not passing PEFT argument should default to full parameter finetuning by @kmehant in #100
- feat: update launch training with accelerate for multi-gpu by @anhuong in #98
- Setting default values in training job config by @tharapalanivel in #104
- add refactored build utils into docker image by @anhuong in #108
- feat: combine train and eval loss into one file by @anhuong in #109
- docs: add note on ephemeral storage by @anhuong in #106
- Move accelerate launch args parsing by @tharapalanivel in #107
- Docs improvements by @Ssukriti in #111
- feat: add env var SET_NUM_PROCESSES_TO_NUM_GPUS by @anhuong in #110
- feat: Trainer controller framework by @seshapad in #45
- Copying logs file by @tharapalanivel in #113
- Fix copying over logs by @tharapalanivel in #114
- Add eval script by @alex-jw-brooks in #102
- Lint tests by @tharapalanivel in #112
- Move sklearn to optional, install optionals for linting by @alex-jw-brooks in #117
- Build Wheel Action by @jbusche in #105
- rstrip eos in evaluation by @alex-jw-brooks in #121
- Fix eos token suffix removal by @alex-jw-brooks in #125
- Make use of instruction field optional by @alex-jw-brooks in #123
- Deprecating the requirements.txt for dependencies management by @tedhtchang in #116
- Add unit tests for various edge cases by @alex-jw-brooks in #97
New Contributors
- @raghukiran1224 made their first contribution in #1
- @Ssukriti made their first contribution in #2
- @lchu-ibm made their first contribution in #14
- @anhuong made their first contribution in #13
- @alex-jw-brooks made their first contribution in #27
- @VassilisVassiliadis made their first contribution in #39
- @hickeyma made their first contribution in #31
- @tedhtchang made their first contribution in #65
- @tharapalanivel made their first contribution in #69
- @dependabot made their first contribution in #42
- @jbusche made their first contribution in #58
- @kellyaa made their first contribution in #82
- @fabianlim made their first contribution in #53
- @kmehant made their first contribution in #92
- @seshapad made their first contribution in #45
Full Changelog: https://github.com/foundation-model-stack/fms-hf-tuning/commits/v.0.0.2rc1