
How much RAM needed to run model? #2

Open

texturejc opened this issue Apr 6, 2022 · 1 comment

Comments

texturejc commented Apr 6, 2022

Thanks for creating this implementation. I've tried running it in a Google Colab Pro notebook, but the session keeps crashing due to maxing out the RAM. Do you have any sense of how much RAM is needed to run the model? Thanks!


lopho commented Apr 6, 2022

~40 GB of VRAM to load the model, ~45 GB to run inference at the full 2048-token context length. Expect about twice that in CPU RAM (~81 GB), because the model is currently instantiated on the CPU at full precision and only then converted to fp16 and uploaded to VRAM. Not sure if that's intended behaviour; it looks like it's supposed to create meta tensors instead, but that isn't actually working.
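
A minimal sketch of the kind of loading path that would avoid the full-precision CPU copy, assuming the checkpoint can be loaded through Hugging Face transformers (this repo may not use transformers at all, and the checkpoint name below is a placeholder): `low_cpu_mem_usage=True` builds the model skeleton as meta tensors first, and `torch_dtype=torch.float16` loads the weights directly in fp16, so no fp32 copy is materialized in CPU RAM.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical workaround, not this repo's actual loading code:
# initialize the model on the meta device and load weights directly
# in fp16, so no full-precision copy ever lands in CPU RAM.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neox-20b",   # placeholder checkpoint name
    torch_dtype=torch.float16,   # load weights as fp16, skipping fp32 -> fp16 conversion
    low_cpu_mem_usage=True,      # build the skeleton as meta tensors first
)
model = model.to("cuda")         # upload the fp16 weights (~40 GB here) to VRAM
```

With a path like this, peak CPU RAM stays near the fp16 checkpoint size (~40 GB) instead of the ~81 GB seen when the model is first instantiated at full precision.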
