Share of embeddings #4

antonio-mastropaolo · 2021-03-19T11:32:33Z

Hi all,

I was wondering what the benefits are of sharing the word embedding and projection weights when training a BLM model.
Would you suggest using it as the default hyper-param when training the BLM model, or are we better off fine-tuning it?
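
To make the question concrete, here is a minimal sketch of what sharing the two weight matrices amounts to. It is a toy illustration in plain Python, not this repository's implementation: `TinyLM`, `share_weights`, and `make_matrix` are hypothetical names, and the "hidden state" is simply the input embedding.

```python
import random


def make_matrix(rows, cols, seed=0):
    """Build a rows x cols matrix of small random floats (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLM:
    """Toy model: an embedding lookup followed by a projection to vocabulary logits."""

    def __init__(self, vocab_size, dim, share_weights):
        self.embedding = make_matrix(vocab_size, dim, seed=0)
        if share_weights:
            # Sharing: the projection is the very same matrix object as the embedding.
            self.projection = self.embedding
        else:
            # No sharing: the projection is a separate matrix of the same shape.
            self.projection = make_matrix(vocab_size, dim, seed=1)

    def logits(self, token_id):
        # Assumption: the embedding row stands in for the model's hidden state.
        hidden = self.embedding[token_id]
        return [sum(h * w for h, w in zip(hidden, row)) for row in self.projection]

    def num_parameters(self):
        # Count each distinct matrix once, so a shared matrix is not double-counted.
        matrices = {id(m): m for m in (self.embedding, self.projection)}
        return sum(len(m) * len(m[0]) for m in matrices.values())


tied = TinyLM(vocab_size=1000, dim=16, share_weights=True)
untied = TinyLM(vocab_size=1000, dim=16, share_weights=False)
print(tied.num_parameters(), untied.num_parameters())  # 16000 32000
```

In this sketch the only guaranteed difference is that the shared variant stores one vocabulary-sized matrix instead of two; whether that also helps model quality is exactly what the question asks.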

Thank you all :)

antonio-mastropaolo changed the title ~~Embedding sharing~~ Share of embeddings Mar 19, 2021
