v0.2.1
What's Changed
- Docker fixes: py310, fix CUDA arg in DeepSpeed by @winglian in #115
- add support for gradient accumulation steps by @winglian in #123
- split up LLaMA model loading so the config can be loaded from a base config and the model weights can be loaded from a path by @winglian in #120
- copy the xformers attention implementation from ooba since we removed the dependency on alpaca_lora_4bit by @winglian in #124
- Fix(readme): Add missing torch dependency to the README by @NanoCode012 in #118
- Add accelerate dep by @winglian in #114
- Feat(inference): Swap to GenerationConfig by @NanoCode012 in #119
- add py310 support from base image by @winglian in #127
- add badge info to readme by @winglian in #129
- fix packing so that concatenated sequences reset the attention mask by @winglian in #131
- swap batch size for gradient accumulation steps to decouple it from the number of GPUs by @winglian in #130 (see the config sketch after this list)
- fix batch size calculation by @winglian in #134
- Fix: Update docs for gradient accumulation and add validation tests for batch size by @NanoCode012 in #135
- Feat: Add Lambda Labs instructions by @NanoCode012 in #141
- Feat: Document custom prompts and add missing prompt strategies to the README by @NanoCode012 in #142
- add docker-compose file by @FarisHijazi in #146
- Update README.md for correct image tags by @winglian in #147
- fix device map by @winglian in #148
- clone in docker by @winglian in #149
- new prompters, misc fixes for missing output dir when using FSDP, and changing max seq len by @winglian in #155
- fix camel-ai, add guanaco/oasst mapping for sharegpt by @winglian in #158
- Fix: Update PEFT and GPTQ instructions by @NanoCode012 in #161
- Fix: Move custom prompts out of hidden by @NanoCode012 in #162
- Fix upcoming deprecation of prepare_model_for_int8_training by @NanoCode012 in #143
- Feat: Set matmul tf32=True when tf32 is passed by @NanoCode012 in #163
- Fix: Validate falcon with fsdp by @NanoCode012 in #164
- Axolotl supports Falcon + QLoRA by @utensil in #132
- Fix: Use cfg.seed, defaulting to 42, for the seed by @NanoCode012 in #166
- Fix: Refactor out unmodified save_steps and eval_steps by @NanoCode012 in #167
- Disable Wandb if no wandb project is specified by @bratao in #168
- Feat: Improve Lambda Labs instructions by @NanoCode012 in #170
- Fix Falcon LoRA support by @NanoCode012 in #171
- Feat: Add landmark attention by @NanoCode012 in #169
- Fix backward compat for peft by @NanoCode012 in #176
- Update README.md to reflect current gradient checkpointing support by @PocketDocLabs in #178
- fix for max sequence len across different model types by @winglian in #179
- Add streaming inference & fix stopping at EOS by @Glavin001 in #180
- add support to extend context with xPos RoPE by @winglian in #181
- fix for local variable 'LlamaForCausalLM' referenced before assignment by @winglian in #182
- pass a prompt in from stdin for inference by @winglian in #183
- Update FAQS.md by @akj2018 in #186
- various fixes by @winglian in #189
- more config pruning and migrating by @winglian in #190
- Add save_steps and eval_steps to Readme by @NanoCode012 in #191
- Fix config path after config moved by @NanoCode012 in #194
- Fix training over an existing LoRA by @AngainorDev in #159
- config fixes by @winglian in #193
- misc fixes by @winglian in #192
- Fix landmark attention patch by @NanoCode012 in #177
- peft no longer needs device_map by @winglian in #187
- chore: Fix inference README. by @mhenrichsen in #197
- Update README.md to include a community showcase by @PocketDocLabs in #200
- chore: Refactor inf_kwargs out by @NanoCode012 in #199
- tweak config to work by @winglian in #196
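
Several of the changes above (#130, #134, #135, #163, #166) affect how batch size, TF32, and seeding are expressed in the YAML config. The sketch below is a minimal illustration, assuming these key names match the config schema shipped in this release; treat it as illustrative rather than authoritative:

```yaml
# Per-GPU batch size; with #130 the effective batch size is
# micro_batch_size * gradient_accumulation_steps * number of GPUs,
# so it no longer scales implicitly with the GPU count.
micro_batch_size: 2
gradient_accumulation_steps: 4

# With #163, passing tf32 also enables TF32 matmuls (Ampere or newer GPUs).
tf32: true

# With #166, cfg.seed is used when set; otherwise the seed defaults to 42.
seed: 42
```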
New Contributors
- @FarisHijazi made their first contribution in #146
- @utensil made their first contribution in #132
- @bratao made their first contribution in #168
- @PocketDocLabs made their first contribution in #178
- @Glavin001 made their first contribution in #180
- @akj2018 made their first contribution in #186
- @AngainorDev made their first contribution in #159
- @mhenrichsen made their first contribution in #197
Full Changelog: v0.2.0...v0.2.1