elastic · benironside · Dec 17, 2024 · Dec 17, 2024 · Dec 17, 2024
@@ -10,7 +10,7 @@ This page provides instructions for setting up a connector to a large language m
 
 This example uses a single server hosted in GCP to run the following components:
 
-* LM Studio with the https://mistral.ai/technology/#models[Mixtral-8x7b] model
+* LM Studio with the https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407[Mistral-Nemo-Instruct-2407] model
 * A reverse proxy using Nginx to authenticate to Elastic Cloud
 
 image::images/lms-studio-arch-diagram.png[Architecture diagram for this guide]
@@ -20,7 +20,7 @@ NOTE: For testing, you can use alternatives to Nginx such as https://learn.micro
 [discrete]
 == Configure your reverse proxy
 
-NOTE: If your Elastic instance is on the same host as LM Studio, you can skip this step.
+NOTE: If your Elastic instance is on the same host as LM Studio, you can skip this step. Also, check out our https://www.elastic.co/blog/herding-llama-3-1-with-elastic-and-lm-studio[blog post] that walks through the whole process of setting up a single-host implementation.
 
 You need to set up a reverse proxy to enable communication between LM Studio and Elastic. For more complete instructions, refer to a guide such as https://www.digitalocean.com/community/tutorials/how-to-configure-nginx-as-a-reverse-proxy-on-ubuntu-22-04[this one].
 
@@ -74,7 +74,14 @@ server {
 }
 --------------------------------------------------
 
-IMPORTANT: If using the example configuration file above, you must replace several values: Replace `<secret token>` with your actual token, and keep it safe since you'll need it to set up the {elastic-sec} connector. Replace `<yourdomainname.com>` with your actual domain name. Update the `proxy_pass` value at the bottom of the configuration if you decide to change the port number in LM Studio to something other than 1234.
+[IMPORTANT]
+====
+If using the example configuration file above, you must replace several values:
+
+* Replace `<secret token>` with your actual token, and keep it safe since you'll need it to set up the {elastic-sec} connector.
+* Replace `<yourdomainname.com>` with your actual domain name.
+* Update the `proxy_pass` value at the bottom of the configuration if you decide to change the port number in LM Studio to something other than 1234.
+====
 
 [discrete]
 === (Optional) Set up performance monitoring for your reverse proxy
@@ -85,23 +92,20 @@ You can use Elastic's {integrations-docs}/nginx[Nginx integration] to monitor pe
 
 First, install https://lmstudio.ai/[LM Studio]. LM Studio supports the OpenAI SDK, which makes it compatible with Elastic's OpenAI connector, allowing you to connect to any model available in the LM Studio marketplace.
 
-One current limitation of LM Studio is that when it is installed on a server, you must launch the application using its GUI before doing so using the CLI. For example, by using Chrome RDP with an https://cloud.google.com/architecture/chrome-desktop-remote-on-compute-engine[X Window System]. After you've opened the application the first time using the GUI, you can start it by using `sudo lms server start` in the CLI. 
+You must launch the application using its GUI before doing so using the CLI. For example, use Chrome RDP with an https://cloud.google.com/architecture/chrome-desktop-remote-on-compute-engine[X Window System]. After you've opened the application the first time using the GUI, you can start it by using `sudo lms server start` in the CLI. 
 
 Once you've launched LM Studio: 
 
 1. Go to LM Studio's Search window.
-2. Search for an LLM (for example, `Mixtral-8x7B-instruct`). Your chosen model must include `instruct` in its name in order to work with Elastic.
-3. Filter your search for "Compatibility Guess" to optimize results for your hardware. Results will be color coded:
-    * Green means "Full GPU offload possible", which yields the best results.
-    * Blue means "Partial GPU offload possible", which may work.
-    * Red for "Likely too large for this machine", which typically will not work.
+2. Search for an LLM (for example, `Mistral-Nemo-Instruct-2407`). Your chosen model must include `instruct` in its name in order to work with Elastic.
+3. After you find a model, view download options and select a recommended version (green). For best performance, select one with the thumbs-up icon that indicates good performance on your hardware. 
 4. Download one or more models.
 
 IMPORTANT: For security reasons, before downloading a model, verify that it is from a trusted source. It can be helpful to review community feedback on the model (for example using a site like Hugging Face).  
 
 image::images/lms-model-select.png[The LM Studio model selection interface]
 
-In this example we used https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF[`TheBloke/Mixtral-8x7B-Instruct-v0.1.Q3_K_M.gguf`]. It has 46.7B total parameters, a 32,000 token context window, and uses GGUF https://huggingface.co/docs/transformers/main/en/quantization/overview[quanitization]. For more information about model names and format information, refer to the following table.
+In this example we used https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407[`mistralai/Mistral-Nemo-Instruct-2407`]. It has 12B total parameters, a 128,000 token context window, and uses GGUF https://huggingface.co/docs/transformers/main/en/quantization/overview[quanitization]. For more information about model names and format information, refer to the following table.
 
 [cols="1,1,1,1", options="header"]
 |===
@@ -124,18 +128,18 @@ After downloading a model, load it in LM Studio using the GUI or LM Studio's htt
 [discrete]
 === Option 1: load a model using the CLI (Recommended)
 
-It is a best practice to download models from the marketplace using the GUI, and then load or unload them using the CLI. The GUI allows you to search for models, whereas the CLI only allows you to import specific paths, but the CLI provides a good interface for loading and unloading.
+It is a best practice to download models from the marketplace using the GUI, and then load or unload them using the CLI. The GUI allows you to search for models, whereas the CLI allows you to use `lms get` to search for models. The CLI provides a good interface for loading and unloading.
 
-Use the following commands in your CLI:
+Once you've downloaded a model, use the following commands in your CLI:
 
 1. Verify LM Studio is installed: `lms`
 2. Check LM Studio's status: `lms status`
 3. List all downloaded models: `lms ls`
-4. Load a model: `lms load`
+4. Load a model: `lms load`. 
 
 image::images/lms-cli-welcome.png[The CLI interface during execution of initial LM Studio commands]
 
-After the model loads, you should see a `Model loaded successfully` message in the CLI.
+After the model loads, you should see a `Model loaded successfully` message in the CLI. 
 
 image::images/lms-studio-model-loaded-msg.png[The CLI message that appears after a model loads]
 
@@ -156,8 +160,8 @@ Refer to the following video to see how to load a model using LM Studio's GUI. Y
 <img
   style="width: 100%; margin: auto; display: block;"
   class="vidyard-player-embed"
-  src="https://play.vidyard.com/FMx2wxGQhquWPVhGQgjkyM.jpg"
-  data-uuid="FMx2wxGQhquWPVhGQgjkyM"
+  src="https://play.vidyard.com/c4AxH8d9tWMnwNp5J6bcfX.jpg"
+  data-uuid="c4AxH8d9tWMnwNp5J6bcfX"
   data-v="4"
   data-type="inline"
 />