Does this support the original Meta LLaMA2? #335

realhaik · 2023-08-27T20:22:41Z

I see in the docs that you use the Hugging Face Llama 2 model.
Unfortunately, the HF model is broken and produces bad results.
I am looking for a way to improve the parallel performance of the original Meta Llama 2 model, the one you download from Meta's servers.
Is there a way to use this software for that?

Currently, when I run the model using the llama , I get no parallel speedup at all. Even when I send requests in parallel, my 4090 stays below 30% load.

Replies: 0 comments
