Replies: 27 comments
-
might be something to do with how web browsers handle RAM management
-
It could be WASM, because I tried some llama.cpp builds (including the current one), but my maximum was around 300 MB. Would be nice if you could help me with building it, or maybe some WebGPU solution
-
Hmmm, might be a limitation of WASM itself... I could help
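For context on the limit being hinted at here: wasm32 uses 32-bit pointers, so linear memory tops out at 4 GiB no matter the browser. A minimal sketch of the arithmetic:

```javascript
// wasm32 linear memory: 32-bit addressing, 64 KiB pages.
const PAGE_SIZE = 64 * 1024; // bytes per WebAssembly page
const MAX_PAGES = 65536;     // 2^16 pages is the wasm32 hard limit
const maxBytes  = PAGE_SIZE * MAX_PAGES;

console.log(maxBytes / 1024 ** 3); // 4 (GiB) — nothing larger can be mapped
```

Browsers may also refuse to grow a `WebAssembly.Memory` well before this ceiling, which could explain the ~300 MB maximum mentioned above.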
-
quick thought:
-
Are you referring to the llama-cpp-wasm? What do you mean by split it up?
Yeah, I'm already working on it (check the
How? All the build files included (and simplified) in Yuna are from this repo: https://github.com/tangledgroup/llama-cpp-wasm. It's not using WebGPU, so I have no idea. (There's also a
-
I'm using Firefox, since Chromium doesn't work
-
And I just realized the base model is 4.5 GB 🤦‍♂️
-
I could shrink it to ~3.2 GB to fit under the 4 GB WASM limit!
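A quick sanity check of the numbers in this exchange, using a hypothetical `fitsInWasm` helper (not part of llama-cpp-wasm) and an assumed ~512 MiB of headroom for the KV cache and runtime allocations:

```javascript
// Sketch (hypothetical helper): can a GGUF file even be mapped
// into wasm32 linear memory, leaving room for cache/runtime?
const WASM_LIMIT = 4 * 1024 ** 3;   // 4 GiB wasm32 ceiling
const HEADROOM   = 512 * 1024 ** 2; // assumed ~512 MiB overhead

function fitsInWasm(modelBytes) {
  return modelBytes + HEADROOM <= WASM_LIMIT;
}

const gib = (n) => n * 1024 ** 3;
console.log(fitsInWasm(gib(4.5))); // false — the 4.5 GB base model
console.log(fitsInWasm(gib(3.2))); // true  — the ~3.2 GB quantized build
```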
-
it might work then... I'll try it
-
Sure! Let me know if you figure it out. I'll try to modify stuff on the HF today, and you can probably get a more quantized model before tomorrow morning 👍🏻
-
sure... I'll try getting one
-
Updated model link: https://huggingface.co/yukiarimo/yuna-ai-v1
-
I'll get back to you once I get home... I'm at school rn and they blocked huggingface
-
Sure thing! Lol, are ChatGPT or Perplexity also blocked at your school?
-
yep... classified as AI
-
I've been home for a while, so let me get the model... I'll test it
-
Sure, you can grab any of them (preferably Q5) from the HF repo above. By the way, I'm also starting training V2 in a few days, so stay tuned (150k+ tokens)!
-
I'm using the light model... but I do want to see if the heavy version works too
-
Are you doing a light model in WASM? Where? Which model?
-
the model is "yuna-ai-v1-q3_k_m.gguf"
-
Is it working in WASM? How exactly did you try?
-
I tried enabling the setting... and then nothing happened (probably because I was using the Pi to try to chat)
-
I'll try my phone today
-
Sure. And don't forget to check the console for logs! (because I was too lazy to implement popup errors)
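Since failures only show up in the console, a small wrapper like this makes them easier to spot. `loadModel` is a hypothetical stand-in for whatever llama-cpp-wasm entry point Yuna actually calls:

```javascript
// Sketch: log the load attempt and surface any error in the console,
// since there are no popup errors. `loadModel` is hypothetical here.
async function loadWithLogging(loadModel, url) {
  try {
    console.log(`loading model: ${url}`);
    return await loadModel(url);
  } catch (err) {
    console.error("model load failed:", err); // check devtools console
    throw err; // re-throw so callers can still react
  }
}
```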
-
Everyone! I think I'll convert this issue into a discussion!
-
Took a small break. I'm going to try my phone again (my phone was dead and I forgot)
-