
[Help needed] CPU usage does not decrease after a request is completed #24

Open

3deep5me opened this issue Oct 1, 2023 · 1 comment
3deep5me commented Oct 1, 2023

Does anyone else have the problem that the CPU load does not decrease after a chat request completes?

I'm using CodeLlama-34B-Instruct-GGUF and the ChatGPT-Next-Web-UI.

With other bindings, e.g. ialacol, I do not have this problem.

The logs look normal:

Defaulted container "test-local-ai" out of: test-local-ai, download-model (init)
@@@@@
Skipping rebuild
@@@@@
If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true
If you are still experiencing issues with the build, try setting CMAKE_ARGS and disable the instructions set as needed:
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF"
see the documentation at: https://localai.io/basics/build/index.html
Note: See also https://github.com/go-skynet/LocalAI/issues/288
@@@@@
CPU info:
model name      : AMD EPYC-Milan Processor
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core invpcid_single ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr wbnoinvd arat umip pku ospke rdpid fsrm
CPU:    AVX    found OK
CPU:    AVX2   found OK
CPU: no AVX512 found
@@@@@
2:33AM INF Starting LocalAI using 24 threads, with models path: /models
2:33AM INF LocalAI version: v1.30.0 (274ace289823a8bacb7b4987b5c961b62d5eee99)

 ┌───────────────────────────────────────────────────┐
 │                   Fiber v2.49.2                   │
 │               http://127.0.0.1:8080               │
 │       (bound on host 0.0.0.0 and port 8080)       │
 │                                                   │
 │ Handlers ............ 70  Processes ........... 1 │
 │ Prefork ....... Disabled  PID ................ 14 │
 └───────────────────────────────────────────────────┘

rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:39597: connect: connection refused"
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:39639: connect: connection refused"
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:40599: connect: connection refused"
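The startup banner above suggests rebuilding with REBUILD=true and CMAKE_ARGS if the pre-compiled backend misbehaves. A minimal sketch of how that could look for a container deployment; the image name, tag, and model path are assumptions and should be matched to your own setup:

```shell
# Rebuild LocalAI inside the container with selected instruction sets
# disabled, as the startup log suggests. Image name/tag and the model
# volume path are assumptions; adjust them to your deployment.
docker run -p 8080:8080 \
  -e REBUILD=true \
  -e CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" \
  -v "$PWD/models:/models" \
  quay.io/go-skynet/local-ai:v1.30.0
```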
jamiemoller (Contributor) commented

@3deep5me did you end up resolving this issue?
I found I had this error when running with an incompatible CUDA version (I was accidentally running the CUDA 11 container with CUDA 12 on the host).
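To rule out the kind of mismatch described above, one can compare the CUDA toolkit version inside the container with the driver version on the host. A rough sketch; whether nvcc and nvidia-smi are present depends on your image and host:

```shell
# Report the CUDA toolkit version (run inside the container) and the
# driver version (run on the host). Either tool may be absent, so fall
# back to a message instead of failing.
nvcc --version 2>/dev/null | grep release || echo "nvcc not found"
nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null \
  || echo "nvidia-smi not found"
```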
