feat: add LLAMA_CPP_N_THREADS env (#742)
* feat: add LLAMA_CPP_N_THREADS and LLAMA_CPP_N_THREADS_BATCH envs

* apply format

* improve: use LLAMA_CPP_N_THREADS for both n_threads and n_threads_batch

* Update crates/llama-cpp-bindings/src/engine.cc

---------

Co-authored-by: Meng Zhang <[email protected]>
erfanium and wsxiaoys authored Nov 9, 2023
1 parent 4bbcdfa commit 138b745
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion crates/llama-cpp-bindings/src/engine.cc
@@ -316,8 +316,12 @@ std::unique_ptr<TextInferenceEngine> create_engine(bool use_gpu, rust::Str model
  llama_context_params ctx_params = llama_context_default_params();
  ctx_params.n_ctx = N_CTX * parallelism;
  ctx_params.n_batch = N_BATCH;
  if (const char* n_thread_str = std::getenv("LLAMA_CPP_N_THREADS")) {
    int n_threads = std::stoi(n_thread_str);
    ctx_params.n_threads = n_threads;
    ctx_params.n_threads_batch = n_threads;
  }
  llama_context* ctx = llama_new_context_with_model(model, ctx_params);

  return std::make_unique<TextInferenceEngineImpl>(
      owned<llama_model>(model, llama_free_model),
      owned<llama_context>(ctx, llama_free),
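The added lines read `LLAMA_CPP_N_THREADS` with `std::getenv` and parse it with `std::stoi`, which throws on a malformed value. A minimal sketch of a more defensive variant (the helper name `env_int_or` and its fallback behavior are assumptions, not part of the commit):

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse a positive integer from an environment
// variable, falling back to a default when the variable is unset,
// non-numeric, or non-positive, instead of letting std::stoi throw.
int env_int_or(const char* name, int fallback) {
  const char* s = std::getenv(name);
  if (!s) {
    return fallback;
  }
  try {
    int v = std::stoi(s);
    return v > 0 ? v : fallback;
  } catch (const std::exception&) {
    // std::invalid_argument or std::out_of_range from std::stoi.
    return fallback;
  }
}
```

With a helper like this, the hunk above could set both `n_threads` and `n_threads_batch` from `env_int_or("LLAMA_CPP_N_THREADS", ctx_params.n_threads)`, keeping llama.cpp's default when the variable is absent or invalid.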
