Add ML tutorials #7180

kolchfa-aws · 2024-05-16T17:30:23Z

Closes #6673

Checklist

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developer Certificate of Origin.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Fanit Kolchina <[email protected]>

vagimeli

@kolchfa-aws Doc review complete. Minimal copyedits. Tutorial is crisp and clear. Well done!

_ml-commons-plugin/tutorials/build-chatbot.md

_ml-commons-plugin/tutorials/conversational-search-cohere.md

_ml-commons-plugin/tutorials/reranking-cohere.md

vagimeli · 2024-05-31T21:37:45Z

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

+nav_order: 10
+---
+
+# Semantic search using byte quantized vectors


Should the titles match, or is it shortened for navigation menu?

Shortened for left nav :)

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

_ml-commons-plugin/tutorials/index.md

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

_ml-commons-plugin/tutorials/generate-embeddings.md

Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>

natebower

@kolchfa-aws Great job on this 😄. Please see my comments and changes and let me know if you have any questions. Thanks!

_ml-commons-plugin/agents-tools/index.md

_ml-commons-plugin/tutorials/build-chatbot.md

natebower · 2024-06-03T13:40:58Z

_ml-commons-plugin/tutorials/build-chatbot.md

+
+## Prerequisite
+
+Log in to the OpenSearch Dashboards homepage, select **Add sample data**, and add **Sample eCommerce orders** data.


Should "the" precede Sample eCommerce orders?

natebower · 2024-06-03T13:43:06Z

_ml-commons-plugin/tutorials/build-chatbot.md

+
+## Step 1: Configure a knowledge base
+
+Follow Prerequisite and Step 1 of the [RAG with a conversational flow agent tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/tutorials/rag-conversational-agent/) to configure the `test_population_data` knowledge base index, which contains US city population data.


Is "Prerequisite and Step 1" the name of a section, or should it read something like "Meet the prerequisite and follow step 1 of the..."?

natebower · 2024-06-03T13:46:19Z

_ml-commons-plugin/tutorials/build-chatbot.md

+- `llm`: Defines the LLM configuration:
+   - `"max_iteration": 5`:  The agent runs the LLM a maximum of five times.
+   - `"response_filter": "$.completion"`: Needed to retrieve the LLM answer from the Bedrock Claude model response.
+   - `"message_history_limit": 5`: The agent retrieves a maximum of the five most recent history messages and adds them to the LLM context. Set this parameter to `0` to omit message history in the context.


Is "history" necessary between "recent" and "messages"? It reads a bit awkwardly.
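
For readers of this thread, the behavior that the `message_history_limit` bullet describes can be sketched in Python. This is only an illustration of the wording under review, not the agent's actual implementation; the helper name `build_llm_context` and the message shape are hypothetical.

```python
from typing import Dict, List


def build_llm_context(
    history: List[Dict[str, str]], message_history_limit: int
) -> List[Dict[str, str]]:
    """Return the messages that would be added to the LLM context.

    Hypothetical helper: keeps at most `message_history_limit` of the most
    recent messages. A limit of 0 omits message history entirely.
    """
    if message_history_limit <= 0:
        return []
    # Negative slicing keeps the tail of the list, i.e. the newest messages.
    return history[-message_history_limit:]


# Seven earlier exchanges (made-up data), oldest first.
history = [{"question": f"q{i}", "response": f"a{i}"} for i in range(1, 8)]

recent = build_llm_context(history, 5)
print([m["question"] for m in recent])
print(build_llm_context(history, 0))
```

With a limit of `5`, only the five newest of the seven messages are kept; with `0`, the context carries no history.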

natebower · 2024-06-03T15:20:44Z

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

+```
+{% include copy-curl.html %}
+
+For compatibility with the Neural Search plugin, the `data_type` (output in the `inference_results.output.data_type` field of the response) must be set to `FLOAT32` in the post-processing function, even though the actual embedding type will be `INT8`.


Suggested change

For compatibility with the Neural Search plugin, the `data_type` (output in the `inference_results.output.data_type` field of the response) must be set to `FLOAT32` in the post-processing function, even though the actual embedding type will be `INT8`.

To ensure compatibility with the Neural Search plugin, the `data_type` (output in the `inference_results.output.data_type` field of the response) must be set to `FLOAT32` in the post-processing function, even though the actual embedding type will be `INT8`.
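
As a rough illustration of the sentence above, the following Python sketch wraps byte-quantized values in a response-like structure that reports `FLOAT32` as its `data_type`. It is a sketch under assumptions, not the tutorial's post-processing function: the helper name `to_neural_search_output` and the `name`, `shape`, and `data` keys are assumed for illustration; only the `data_type` label and the `INT8` value range come from the text.

```python
from typing import Dict, List

# Signed 8-bit integer range, which is what INT8 embeddings use.
INT8_MIN, INT8_MAX = -128, 127


def to_neural_search_output(embedding: List[int]) -> Dict[str, object]:
    """Wrap a byte-quantized embedding in a response-like dict (hypothetical)."""
    for value in embedding:
        if not isinstance(value, int) or not INT8_MIN <= value <= INT8_MAX:
            raise ValueError(f"value {value!r} is not a valid INT8 component")
    return {
        "name": "sentence_embedding",  # assumed key and value, for illustration
        "data_type": "FLOAT32",        # label reported for compatibility
        "shape": [len(embedding)],     # assumed key
        "data": embedding,             # the values themselves stay INT8
    }


output = to_neural_search_output([12, -128, 127])
print(output["data_type"], output["data"])
```

The label and the payload deliberately disagree here: the label says `FLOAT32`, while every component still fits in a single signed byte.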

natebower · 2024-06-03T15:21:56Z

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

+```
+{% include copy-curl.html %}
+
+Note the model ID in the response; you'll use it in the following steps.


Minor consistency nit: At this point in the tutorials, we sometimes say "next steps" and sometimes say "following steps".

natebower · 2024-06-03T15:22:27Z

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

+```
+{% include copy-curl.html %}
+
+Next, create a k-NN index and set the `data_type` on the `passage_embedding` field to `byte` so it can hold byte-quantized vectors:


Suggested change

Next, create a k-NN index and set the `data_type` on the `passage_embedding` field to `byte` so it can hold byte-quantized vectors:

Next, create a k-NN index and set the `data_type` for the `passage_embedding` field to `byte` so that it can hold byte-quantized vectors:
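
For context, a request body of the kind this sentence introduces might look like the Python sketch below. Only the `passage_embedding` field name and `"data_type": "byte"` come from the text under review; the dimension, the `passage_text` field, the helper name `byte_vector_index_body`, and the `method` settings are assumptions chosen for illustration and may differ from the tutorial.

```python
import json
from typing import Any, Dict


def byte_vector_index_body(dimension: int) -> Dict[str, Any]:
    """Build an index body whose vector field stores byte vectors (sketch)."""
    return {
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                "passage_text": {"type": "text"},  # assumed companion field
                "passage_embedding": {
                    "type": "knn_vector",
                    "dimension": dimension,        # assumed; must match the model
                    "data_type": "byte",           # lets the field hold byte-quantized vectors
                    "method": {                    # assumed method configuration
                        "name": "hnsw",
                        "space_type": "l2",
                        "engine": "lucene",
                    },
                },
            }
        },
    }


body = byte_vector_index_body(1024)
print(json.dumps(body, indent=2))
```

The printed JSON is what would be sent as the body of the index creation request.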

natebower · 2024-06-03T15:22:51Z

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

+```
+{% include copy-curl.html %}
+
+Next, create a k-NN index and set the `data_type` on the `passage_embedding` field to `byte` so it can hold byte-quantized vectors:


"on" => "for"?

natebower · 2024-06-03T15:24:06Z

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

+
+## Step 3: Configure semantic search
+
+Create a connector to an embedding model that has the `search_query` input type:


"that has" => "containing" or "with"?

_ml-commons-plugin/tutorials/build-chatbot.md

_ml-commons-plugin/tutorials/conversational-search-cohere.md

_ml-commons-plugin/tutorials/rag-chatbot.md

_ml-commons-plugin/tutorials/rag-conversational-agent.md

kolchfa-aws · 2024-06-03T17:30:10Z

_ml-commons-plugin/tutorials/rag-conversational-agent.md

+
+Note the agent ID; you'll use it in the next step. 
+
+## Step 4: Execute the agent


Suggested change

## Step 4: Execute the agent

## Step 4: Run the agent

kolchfa-aws · 2024-06-03T17:31:12Z

_ml-commons-plugin/tutorials/rag-conversational-agent.md

+```
+{% include copy-curl.html %}
+
+The response contains the LLM answer:


Suggested change

The response contains the LLM answer:

The response contains the answer generated by the LLM:

kolchfa-aws · 2024-06-03T17:31:50Z

_ml-commons-plugin/tutorials/rag-conversational-agent.md

+```
+{% include copy-curl.html %}
+
+The response contains the LLM answer:


Suggested change

The response contains the LLM answer:

The response contains the answer generated by the LLM:

kolchfa-aws · 2024-06-03T17:32:21Z

_ml-commons-plugin/tutorials/rag-conversational-agent.md

+```
+{% include copy-curl.html %}
+
+The agent will run the tools sequentially in the new order defined in `selected_tools`. 


Suggested change

The agent will run the tools sequentially in the new order defined in `selected_tools`.

The agent will run the tools one by one in the new order defined in `selected_tools`.
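
The "one by one in the new order" wording can be pictured with a small Python sketch. It is purely illustrative and does not reflect how the agent is implemented; `run_tools_in_order` and the two toy tools are hypothetical.

```python
from typing import Callable, Dict, List


def run_tools_in_order(
    tools: Dict[str, Callable[[str], str]],
    selected_tools: List[str],
    question: str,
) -> List[str]:
    """Run each selected tool in turn, following the order of `selected_tools`."""
    outputs = []
    for name in selected_tools:
        # Each tool finishes before the next one starts.
        outputs.append(tools[name](question))
    return outputs


# Two toy tools (hypothetical) that only tag their input.
tools = {
    "population_tool": lambda q: f"population:{q}",
    "stock_tool": lambda q: f"stock:{q}",
}

print(run_tools_in_order(tools, ["stock_tool", "population_tool"], "demo"))
```

Changing the order of the names in `selected_tools` changes the order in which the outputs are produced, regardless of how the tools were registered.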

kolchfa-aws · 2024-06-03T17:36:06Z

_ml-commons-plugin/tutorials/rag-conversational-agent.md

+
+### Natural language query
+
+The `PPLTool` can translate a natural language query (NLQ) to [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index/) and execute the generated PPL query.


Suggested change

The `PPLTool` can translate a natural language query (NLQ) to [PPL]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index/) and execute the generated PPL query.

The `PPLTool` can translate a natural language query (NLQ) to [Piped Processing Language (PPL)]({{site.url}}{{site.baseurl}}/search-plugins/sql/ppl/index/) and execute the generated PPL query.

kolchfa-aws · 2024-06-03T17:36:36Z

_ml-commons-plugin/tutorials/rag-conversational-agent.md

+```
+{% include copy-curl.html %}
+
+### Step 2: Execute the agent with an NLQ


Suggested change

### Step 2: Execute the agent with an NLQ

### Step 2: Run the agent with an NLQ

kolchfa-aws · 2024-06-03T17:36:57Z

_ml-commons-plugin/tutorials/rag-conversational-agent.md

+```
+{% include copy-curl.html %}
+
+The response contains the LLM answer:


Suggested change

The response contains the LLM answer:

The response contains the answer generated by the LLM:

kolchfa-aws · 2024-06-03T17:42:07Z

_ml-commons-plugin/tutorials/semantic-search-byte-vectors.md

+
+## Step 3: Configure semantic search
+
+Create a connector to an embedding model that has the `search_query` input type:


Suggested change

Create a connector to an embedding model that has the `search_query` input type:

Create a connector to an embedding model with the `search_query` input type:

kolchfa-aws · 2024-06-03T17:45:11Z

_ml-commons-plugin/tutorials/build-chatbot.md

+- `llm`: Defines the LLM configuration:
+   - `"max_iteration": 5`:  The agent runs the LLM a maximum of five times.
+   - `"response_filter": "$.completion"`: Needed to retrieve the LLM answer from the Bedrock Claude model response.
+   - `"message_history_limit": 5`: The agent retrieves a maximum of the five most recent history messages and adds them to the LLM context. Set this parameter to `0` to omit message history in the context.


Suggested change

- `"message_history_limit": 5`: The agent retrieves a maximum of the five most recent history messages and adds them to the LLM context. Set this parameter to `0` to omit message history in the context.

- `"message_history_limit": 5`: The agent retrieves a maximum of the five most recent historical messages and adds them to the LLM context. Set this parameter to `0` to omit message history in the context.

Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>

Signed-off-by: Fanit Kolchina <[email protected]>

* Add ML tutorials Signed-off-by: Fanit Kolchina <[email protected]>
* Writing Signed-off-by: Fanit Kolchina <[email protected]>
* Conversational search Signed-off-by: Fanit Kolchina <[email protected]>
* Add rag chatbot Signed-off-by: Fanit Kolchina <[email protected]>
* Writing Signed-off-by: Fanit Kolchina <[email protected]>
* Add RAG chatbot and convo agent Signed-off-by: Fanit Kolchina <[email protected]>
* Add reranking cohere tutorial Signed-off-by: Fanit Kolchina <[email protected]>
* Add semantic search tutorial Signed-off-by: Fanit Kolchina <[email protected]>
* Add generating embeddings Signed-off-by: Fanit Kolchina <[email protected]>
* Add generate embeddings to index Signed-off-by: Fanit Kolchina <[email protected]>
* Rewriting Signed-off-by: Fanit Kolchina <[email protected]>
* Rewriting Signed-off-by: Fanit Kolchina <[email protected]>
* Rewriting Signed-off-by: Fanit Kolchina <[email protected]>
* Reword Signed-off-by: Fanit Kolchina <[email protected]>
* Apply suggestions from code review Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
* Apply suggestions from code review Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
* Consistency Signed-off-by: Fanit Kolchina <[email protected]>
* Add tutorials section to index page Signed-off-by: Fanit Kolchina <[email protected]>

---------

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
(cherry picked from commit d01e74f)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Add ML tutorials

5922b6e

Signed-off-by: Fanit Kolchina <[email protected]>

kolchfa-aws self-assigned this May 16, 2024

kolchfa-aws added 8 commits May 22, 2024 14:04

Writing

9536e8e

Signed-off-by: Fanit Kolchina <[email protected]>

Conversational search

9001b87

Signed-off-by: Fanit Kolchina <[email protected]>

Add rag chatbot

05a71c8

Signed-off-by: Fanit Kolchina <[email protected]>

Writing

229b35a

Signed-off-by: Fanit Kolchina <[email protected]>

Add RAG chatbot and convo agent

31f85eb

Signed-off-by: Fanit Kolchina <[email protected]>

Add reranking cohere tutorial

459bc16

Signed-off-by: Fanit Kolchina <[email protected]>

Add semantic search tutorial

faebda9

Signed-off-by: Fanit Kolchina <[email protected]>

Add generating embeddings

66f8384

Signed-off-by: Fanit Kolchina <[email protected]>

kolchfa-aws marked this pull request as ready for review May 29, 2024 22:02

kolchfa-aws requested review from hdhalter, Naarcha-AWS, vagimeli, AMoo-Miki, natebower, dlvenable, stephen-crawford and epugh as code owners May 29, 2024 22:02

kolchfa-aws and others added 6 commits May 29, 2024 18:04

Add generate embeddings to index

1b7bc6c

Signed-off-by: Fanit Kolchina <[email protected]>

Rewriting

325f5b5

Signed-off-by: Fanit Kolchina <[email protected]>

Rewriting

7787180

Signed-off-by: Fanit Kolchina <[email protected]>

Rewriting

f0b42c9

Signed-off-by: Fanit Kolchina <[email protected]>

Merge branch 'main' into ml-tutorials

9c76ed1

Reword

13415bf

Signed-off-by: Fanit Kolchina <[email protected]>

vagimeli approved these changes May 31, 2024
