Hello everyone,

I've been working on a script for forensic analysis of messages, and I've observed some intriguing discrepancies in the model's performance when run on a CPU versus a GPU. Specifically, the model tends to generate more accurate and reliable responses when executed on a GPU rather than a CPU.

Has anyone else experienced similar issues? I'm curious about the technical reasons behind such differences, and whether they relate to the specific LLM architecture, to data handling, or to certain operations being computed differently on GPUs.

Replies: 2 comments

-
That is indeed intriguing! Thanks!
-
@joaos96 If you train the model on a specific GPU and run it on a CPU, you can get different results because of the way each architecture handles floating-point precision.
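As a minimal sketch of the effect (assuming PyTorch, which the thread doesn't actually name, and a CUDA-capable machine; the tensor size is arbitrary): the same reduction can produce slightly different results on the two backends, because each one may order and parallelize the additions differently, and floating-point addition is not associative.

```python
import torch

# Hypothetical illustration, not from this thread: sum the same
# float32 tensor on CPU and GPU and compare. The two backends may
# accumulate rounding error differently because they group and
# order the additions differently.
torch.manual_seed(0)
x = torch.randn(1_000_000, dtype=torch.float32)

cpu_sum = x.sum()
if torch.cuda.is_available():
    gpu_sum = x.cuda().sum().cpu()
    print(f"CPU sum:  {cpu_sum.item():.10f}")
    print(f"GPU sum:  {gpu_sum.item():.10f}")
    print(f"abs diff: {(cpu_sum - gpu_sum).abs().item():.2e}")
```

The per-operation difference is usually tiny, but an LLM forward pass chains an enormous number of such operations, and with greedy decoding even a last-bit difference in the logits can flip a token and send the two generations down different paths.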