
transformers support generation, trainer, tutorial, etc. #748

Open
wants to merge 176 commits into base: master

Conversation

@zhanghuiyao (Collaborator) commented Nov 8, 2024

What does this PR do?

Adds the following features:

  1. generation
  2. trainer
  3. some tutorials, README, and docs
  4. Llama 3 8B inference/generation/finetuning

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the
    documentation guidelines
  • Did you build and run the code without any errors?
  • Did you report the running environment (NPU type / MindSpore version) and performance in the docs? (Best to record this for data loading, model inference, and training tasks.)
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@xxx

python finetune_in_native_mindspore.py \
    --model_path meta-llama/Meta-Llama-3-8B \
    --dataset_path Yelp/yelp_review_full
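The invocation above implies a small CLI; a minimal argparse sketch of that interface (the flag defaults are assumptions — the real internals of finetune_in_native_mindspore.py are not shown in this PR):

```python
import argparse

def build_parser():
    # Flags mirroring the command shown above; defaults are assumed,
    # not taken from the actual finetune_in_native_mindspore.py script.
    parser = argparse.ArgumentParser(prog="finetune_in_native_mindspore.py")
    parser.add_argument("--model_path", default="meta-llama/Meta-Llama-3-8B",
                        help="Hugging Face model id or local checkpoint path")
    parser.add_argument("--dataset_path", default="Yelp/yelp_review_full",
                        help="Hugging Face dataset id or local dataset path")
    return parser
```

Either flag can then be overridden on the command line, as in the example command.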
Collaborator:

delete

Collaborator Author:

ok

nn.AvgPool1d,
nn.AvgPool2d,
nn.AvgPool3d,
nn.CrossEntropyLoss,
Collaborator:

This hard-coded list of fp32 layers may not fit all models.
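One way to address this concern would be to let individual models extend the set of layer types kept in fp32 instead of sharing one hard-coded list; a framework-agnostic sketch (the function and parameter names are hypothetical, not from this PR, and the stand-in classes model nn.AvgPool1d/2d/3d and nn.CrossEntropyLoss):

```python
# Stand-in layer classes; in the real code these would be the
# MindSpore nn.AvgPool1d/2d/3d and nn.CrossEntropyLoss cells.
class AvgPool1d: ...
class AvgPool2d: ...
class AvgPool3d: ...
class CrossEntropyLoss: ...
class Dense: ...  # a layer assumed safe to run in fp16/bf16

# Default layer types to keep in fp32 under mixed precision.
DEFAULT_FP32_TYPES = (AvgPool1d, AvgPool2d, AvgPool3d, CrossEntropyLoss)

def keep_in_fp32(layer, extra_types=(), fp32_types=DEFAULT_FP32_TYPES):
    """Return True if `layer` should stay in fp32.

    `extra_types` lets a specific model extend the default list
    instead of relying on one hard-coded set for all models.
    """
    return isinstance(layer, tuple(fp32_types) + tuple(extra_types))
```

A model that needs, say, its normalization layers in fp32 could then pass those classes via `extra_types` rather than editing the shared list.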

return out


class FlashAttention2(nn.Cell):
Collaborator:

It may be worth noting that this is just a wrapper, not the real FlashAttention 2.

Collaborator Author:

I heard before that it really is FA2.

Collaborator:

What about using mint AdamW?

Collaborator Author:

We will switch to mint uniformly later.

Collaborator:

Is this necessary for PyNative mode?

Collaborator @SamitHuang commented Nov 15, 2024:

Can you list the supported features compared to torch? For example, beam_search for generation and 8-bit quantization for memory reduction are quite commonly used, but seem to be missing from this PR.

Collaborator Author:

Currently only the most basic sample method is supported for generation, provided for MLLMs to use; quantization is not supported at all.
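For reference, the basic "sample" decoding mentioned here reduces to softmax-then-multinomial at each step of an autoregressive loop; a self-contained sketch in plain Python (the function names and toy `step_fn` are illustrative, not the PR's actual API):

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Sample one token id from raw logits (basic 'sample' decoding)."""
    rng = rng or random.Random()
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF multinomial draw.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

def generate(step_fn, prompt_ids, max_new_tokens, eos_id=None,
             temperature=1.0, seed=0):
    """Autoregressive loop: call step_fn(ids) -> next-token logits, sample,
    append, and stop on eos_id or after max_new_tokens."""
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        tok = sample_next_token(step_fn(ids), temperature, rng)
        ids.append(tok)
        if tok == eos_id:
            break
    return ids
```

Beam search, by contrast, would keep several candidate sequences alive per step and rank them by cumulative log-probability, which is the part this PR defers to a later version.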

Collaborator Author:

The full interface, including beam_search for generation, will be provided in the subsequent version 4.46.2.

Collaborator:

It would be better to state which MindSpore version and mode (graph/PyNative) were mainly tested. If both graph and PyNative mode are supported, do both guarantee good accuracy, or are they just runnable?

Collaborator Author:

The newly added features were validated on MindSpore 2.3.1, but some existing interfaces may not be supported; complete validation is required before a confirmed supported version can be announced publicly.

Collaborator Author:

For inference/generation it achieves good accuracy; training is currently only in a runnable state.

3 participants