Documentation for adapter fine-tuning #1545

Merged: 2 commits into k2-fsa:master on Mar 14, 2024

Conversation

marcoyang1998
Collaborator

Add documentation for #1512.

marcoyang1998 merged commit f28c05f into k2-fsa:master on Mar 14, 2024
108 checks passed
Comment on lines +148 to +154
.. code-block::

   For dev, WER of different settings are:
   greedy_search 15.44 best for dev

   For test, WER of different settings are:
   greedy_search 15.42 best for test


Hi!
Do you have accuracy metrics for the case when you fine-tune the whole ASR model on the same data?

@marcoyang1998
Collaborator Author

Hi, fine-tuning the whole model gives us 13.31/13.39. You may want to have a look at #1484 for more reference numbers.

@1215thebqtic

Hi,

I conducted experiments with my own data and found that the training time for adapter fine-tuning and full fine-tuning is almost the same. I printed the weight values at different epochs of adapter fine-tuning, and only the adapters' weights change. Is this normal? Thank you!

@JinZr
Collaborator

JinZr commented May 28, 2024 via email

Yes, this is normal. You need to look into the implementation and technical details of adapter-based fine-tuning to gain a deeper understanding. Best regards, Jin
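
For context, adapter-based fine-tuning keeps the pre-trained backbone frozen and only updates small bottleneck adapter modules, so it is expected that only the adapters' weights change between epochs. A minimal PyTorch-style sketch of the idea (module names and sizes here are illustrative, not icefall's actual code):

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck module with a residual connection (sizes are illustrative)."""
    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The adapter only learns a small residual correction on top of the frozen layer.
        return x + self.up(torch.relu(self.down(x)))

class BackboneWithAdapters(nn.Module):
    """A stand-in for the pre-trained encoder, with one adapter per layer."""
    def __init__(self, dim: int = 256, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_layers))
        self.adapters = nn.ModuleList(Adapter(dim) for _ in range(num_layers))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer, adapter in zip(self.layers, self.adapters):
            x = adapter(torch.relu(layer(x)))
        return x

model = BackboneWithAdapters()

# Freeze the backbone; only adapter weights stay trainable, so only they
# change between epochs.
for name, p in model.named_parameters():
    p.requires_grad = "adapters" in name

optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)
```

Note that the forward and backward passes still run through the whole (frozen) backbone, so per-step compute stays close to full fine-tuning; the savings are mainly in optimizer state, gradient memory, and the size of the weights that need to be saved.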

@1215thebqtic

The doc says adapter fine-tuning is much faster than full fine-tuning, so I'm confused.
(screenshot: adapter4)
https://k2-fsa.github.io/icefall/recipes/Finetune/adapter/finetune_adapter.html#fine-tune-with-adapter

@marcoyang1998
Collaborator Author

marcoyang1998 commented May 28, 2024

Could you show me the average time for finishing one epoch for both of your experiments?

In my experiment, adapter-based training took ~25 min per epoch, and full fine-tuning took ~30 min per epoch.

@1215thebqtic

Adapter fine-tuning takes 60 min/epoch and full fine-tuning takes 46 min/epoch, so adapter fine-tuning costs more time. I changed base_lr=0.00005, lr_epochs=100, lr_batches=100000 for full fine-tuning, and used the default values for adapter fine-tuning.

@marcoyang1998
Collaborator Author

Adapter fine-tuning costs more time.

That's unusual; how big is your model? BTW, your base_lr seems very small (even for full fine-tuning), which might lead to poor performance.

@1215thebqtic

For the model with adapters: number of model parameters: 169,281,897; a total of 1,234,624 trainable parameters (0.729% of the whole model).
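
For reference, the trainable fraction reported above can be reproduced by counting the parameters whose requires_grad flag is set; a small PyTorch-style helper (hypothetical, not icefall's exact logging code):

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> tuple[int, int]:
    """Return (total, trainable) parameter counts for a model."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

# With the numbers reported above:
# total = 169_281_897, trainable = 1_234_624
# 1_234_624 / 169_281_897 ≈ 0.729%, i.e. only the adapters are being updated.
```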
