Additional README Changes for PR #57 (#61)
* edits to readme

Signed-off-by: 1000960000 user <[email protected]>

* Apply suggestions from code review

Co-authored-by: Yu Chin Fabian Lim <[email protected]>
Signed-off-by: 1000960000 user <[email protected]>

* more readme changes

Signed-off-by: 1000960000 user <[email protected]>

---------

Signed-off-by: 1000960000 user <[email protected]>
Co-authored-by: Yu Chin Fabian Lim <[email protected]>
achew010 and fabianlim authored Aug 2, 2024
1 parent d510923 commit 0e15e8c
Showing 2 changed files with 8 additions and 6 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -10,6 +10,7 @@ The fms-acceleration framework includes accelerators for Full and Parameter Effi
- Bits-and-Bytes (BNB) quantised LoRA : QLoRA acceleration
- AutoGPTQ quantised LoRA : GPTQ-LoRA acceleration
- Full Fine Tuning acceleration (coming soon)
+- Padding-Free Attention

Our tests show a significant increase in training token throughput using this fms-acceleration framework.

@@ -29,9 +30,10 @@ For example:

Plugin | Description | Depends | License | Status
--|--|--|--|--
-[framework](./plugins/framework/README.md) | This acceleration framework for integration with huggingface trainers | | | Beta
-[accelerated-peft](./plugins/accelerated-peft/README.md) | For PEFT-training, e.g., 4bit QLoRA. | Huggingface<br>AutoGPTQ | Apache 2.0<br>MIT | Beta
+[framework](./plugins/framework/README.md) | This acceleration framework for integration with huggingface trainers | | | Alpha
+[accelerated-peft](./plugins/accelerated-peft/README.md) | For PEFT-training, e.g., 4bit QLoRA. | Huggingface<br>AutoGPTQ | Apache 2.0<br>MIT | Alpha
[fused-ops-and-kernels](./plugins/fused-ops-and-kernels/README.md) | Fused LoRA and triton kernels (e.g., fast cross-entropy, rms, rope) | -- | Apache 2.0 [(contains extracted code)](./plugins/fused-ops-and-kernels/README.md#code-extracted-from-unsloth) | Beta
+[instruct-lab](./plugins/instruct-lab/README.md) | Padding-Free Flash Attention Computation | flash-attn | Apache 2.0 | Beta
MOE-training-acceleration | [MegaBlocks](https://github.com/databricks/megablocks) inspired triton kernels and accelerations for Mixture-of-Expert models | | Apache 2.0 | Coming Soon

## Usage with FMS HF Tuning
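The rows added above introduce padding-free attention (the new "Padding-Free Attention" bullet and the instruct-lab plugin). As a rough, hypothetical illustration of the idea, and not the plugin's actual code, the sketch below packs variable-length sequences into one flat token stream and builds the cumulative-length offsets (`cu_seqlens`) that flash-attn's variable-length kernel consumes; the sequence lengths and head dimensions are made up for the example.

```python
# A minimal sketch (not the plugin's code) of padding-free attention: instead of
# padding every sequence in a batch to the same length, concatenate them into one
# flat token stream and record where each sequence starts via cu_seqlens offsets.
import torch

def pack_for_padding_free(seqs):
    """Concatenate variable-length sequences and build cu_seqlens boundaries."""
    lengths = torch.tensor([s.shape[0] for s in seqs], dtype=torch.int32)
    # cu_seqlens[i] is the offset where sequence i starts in the packed stream.
    cu_seqlens = torch.nn.functional.pad(lengths.cumsum(0, dtype=torch.int32), (1, 0))
    packed = torch.cat(seqs, dim=0)  # (total_tokens, n_heads, head_dim)
    return packed, cu_seqlens, int(lengths.max())

# Hypothetical shapes: 3 sequences of different lengths, 8 heads, head_dim 64.
seqs = [torch.randn(n, 8, 64) for n in (5, 11, 3)]
q, cu_seqlens, max_len = pack_for_padding_free(seqs)

# On a CUDA machine with flash-attn installed, the packed tensors (cast to
# fp16/bf16 and moved to the GPU) would be passed to the varlen kernel, e.g.:
#   from flash_attn import flash_attn_varlen_func
#   out = flash_attn_varlen_func(q, k, v, cu_seqlens, cu_seqlens, max_len, max_len, causal=True)
print(q.shape, cu_seqlens)  # torch.Size([19, 8, 64]), tensor([ 0,  5, 16, 19], dtype=torch.int32)
```

Because no padding tokens are ever materialised, attention is computed only over real tokens, which is where the throughput gain comes from.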
8 changes: 4 additions & 4 deletions plugins/instruct-lab/README.md
@@ -9,12 +9,12 @@ This library contains plugins to accelerate finetuning with the following optimi

Plugin | Description | Depends | Loading | Augmentation | Callbacks
--|--|--|--|--|--
-[padding_free](./src/fms_acceleration_ilab/framework_plugin_padding_free.py) | Padding-Free Flash Attention Computation | flash_attn | ✅ | ✅
+[padding_free](./src/fms_acceleration_ilab/framework_plugin_padding_free.py) | Padding-Free Flash Attention Computation | flash_attn | | ✅ | ✅


-## Native Transformers Support from V4.44.0
-Transformers natively supports padding-free from v4.44.0. The padding-free plugin will use the transformers library if compatible,
-otherwise if `transformers < V4.44.0` the plugin will use an internal implementation instead.
+## Native Transformers Support from v4.44.0
+Transformers natively supports padding-free from v4.44.0 ([see here](https://github.com/huggingface/transformers/pull/31629)). The padding-free plugin will use the transformers library if compatible;
+otherwise, if `transformers < v4.44.0`, the plugin will fall back to an internal implementation.

## Known Issues

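The paragraph added above under "Native Transformers Support from v4.44.0" describes a compatibility gate: use the native transformers padding-free path when the installed version is new enough, otherwise fall back to an internal implementation. Below is a minimal sketch of that kind of version check; the function name and the returned labels are hypothetical stand-ins, not the plugin's real API.

```python
# A rough sketch of a version-gated fallback, assuming only that `transformers`
# (and its `packaging` dependency) is installed. The labels returned here are
# illustrative placeholders for the two code paths described in the README.
from packaging import version
import transformers

NATIVE_PADDING_FREE_VERSION = version.parse("4.44.0")

def pick_padding_free_path():
    installed = version.parse(transformers.__version__)
    if installed >= NATIVE_PADDING_FREE_VERSION:
        # transformers >= 4.44.0 ships native padding-free support
        # (see https://github.com/huggingface/transformers/pull/31629).
        return "native-transformers"
    # Older transformers: fall back to an internal implementation.
    return "internal-implementation"

print(pick_padding_free_path())
```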
