Incorporates Nastasha's review
benironside committed Jul 5, 2024
1 parent 1ff79b1 commit 61ba4a2
Showing 1 changed file with 7 additions and 8 deletions.
15 changes: 7 additions & 8 deletions docs/serverless/assistant/connect-to-byo-llm.mdx
@@ -90,14 +90,7 @@ First, install [LM Studio](https://lmstudio.ai/). LM Studio supports the OpenAI

One current limitation of LM Studio is that when it is installed on a server, you must launch the application from its GUI before you can launch it from the CLI, for example by using Chrome RDP with an [X Window System](https://cloud.google.com/architecture/chrome-desktop-remote-on-compute-engine). After you've opened the application for the first time using the GUI, you can start it from the CLI with `sudo lms server start`.

Once you've launched LM Studio, select a model:

<DocImage url="images/lms-model-select.png" alt="The LM Studio model selection interface"/>


<DocCallOut title="Important">
For security reasons, before downloading a model, verify that it is from a trusted source. It can be helpful to review community feedback on the model (for example using a site like Hugging Face).
</DocCallOut>
Once you've launched LM Studio:

1. Go to LM Studio's Search window.
1. Search for an LLM (for example, `Mixtral-8x7B-instruct`).
@@ -108,6 +101,12 @@ For security reasons, before downloading a model, verify that it is from a trust
* Red for "Likely too large for this machine", which typically will not work.
1. Download one or more models.

<DocCallOut title="Important">
For security reasons, before downloading a model, verify that it is from a trusted source. It can be helpful to review community feedback on the model (for example using a site like Hugging Face).
</DocCallOut>

<DocImage url="images/lms-model-select.png" alt="The LM Studio model selection interface"/>

In this example we used [`TheBloke/Mixtral-8x7B-Instruct-v0.1.Q3_K_M.gguf`](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF). It has 46.7B total parameters, a 32,000 token context window, and uses GGUF [quantization](https://huggingface.co/docs/transformers/main/en/quantization/overview). For more information about model names and formats, refer to the following table.

| Model Name | Parameter Size | Tokens/Context Window | Quantization Format |
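Once a model is downloaded and LM Studio's local server is running, it can be queried through the server's OpenAI-compatible API. The sketch below is a minimal illustration, assuming the server's default address of `http://localhost:1234/v1` and using the downloaded Mixtral model's repository name as a placeholder model identifier; adjust both to match your setup. The request is only sent when `SEND_REQUEST` is set to `True`.

```python
import json
import urllib.request

# Assumption: LM Studio's local server is listening on its default port.
BASE_URL = "http://localhost:1234/v1"
SEND_REQUEST = False  # set to True once the LM Studio server is running

def build_chat_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style chat completion call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return f"{BASE_URL}/chat/completions", json.dumps(payload).encode("utf-8")

if SEND_REQUEST:
    # Placeholder model identifier; use the name shown in LM Studio.
    url, body = build_chat_request(
        "TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF",
        "In one sentence, what is a context window?",
    )
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())
        print(reply["choices"][0]["message"]["content"])
```

Because the endpoint follows the OpenAI chat completions shape, any OpenAI-compatible client library should also work by pointing its base URL at the local server.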
