Apply suggestions from code review
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
kolchfa-aws and natebower authored Jun 21, 2024
1 parent ee1693f commit 468050c
Showing 2 changed files with 11 additions and 11 deletions.
4 changes: 2 additions & 2 deletions _ml-commons-plugin/api/model-apis/register-model.md
@@ -263,10 +263,10 @@ Field | Data type | Description
`type` | String | The guardrail type. Valid values are [`local_regex`](#example-request-regex-and-stopword-validation) and [`model`](#example-request-guardrail-model-validation). Using `local_regex`, you can specify a regular expression or stop words. Using `model`, you can specify a guardrail model. For more information, see [Guardrails]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/guardrails/).
`input_guardrail`| Object | The guardrail for the model input.
`output_guardrail`| Object | The guardrail for the model output.
-`stop_words`| Object | The list of indexes containing stopwords used for the model input/output validation. If the model prompt/response contains a stopword contained in any of the indexes, the predict request on this model is rejected.
+`stop_words`| Object | The list of indexes containing stopwords used for model input/output validation. If the model prompt/response contains a stopword contained in any of the indexes, then the predict request on the model is rejected.
`index_name`| Object | The name of the index storing the stopwords.
`source_fields`| Object | The name of the field storing the stopwords.
-`regex`| Object | A regular expression used for input/output validation. If the model prompt/response matches the regular expression, the predict request on this model is rejected.
+`regex`| Object | A regular expression used for input/output validation. If the model prompt/response matches the regular expression, then the predict request on the model is rejected.
`model_id`| String | The guardrail model used to validate user input and LLM output.
`response_filter`| String | The dot path of the field containing the guardrail model response.
`response_validation_regex`| String | The regular expression used to validate the guardrail model response.
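As an illustrative sketch, the fields above combine into a `guardrails` object such as the following for `local_regex` validation (the index name, source field, and regular expression here are assumptions for the example, not values from this commit):

```json
"guardrails": {
  "type": "local_regex",
  "input_guardrail": {
    "stop_words": [
      {
        "index_name": "stop_words_input",
        "source_fields": ["title"]
      }
    ],
    "regex": [".*offensive word.*"]
  },
  "output_guardrail": {
    "stop_words": [
      {
        "index_name": "stop_words_output",
        "source_fields": ["title"]
      }
    ],
    "regex": [".*offensive word.*"]
  }
}
```

A predict request is rejected if the prompt or response contains a term stored in any of the listed indexes or matches any of the regular expressions.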
18 changes: 9 additions & 9 deletions _ml-commons-plugin/remote-models/guardrails.md
@@ -306,12 +306,12 @@ Guardrails are also triggered when a prompt matches the supplied regular expression.
{: .label .label-purple }

For more advanced validation, you can configure a guardrail model---a separate LLM whose purpose is to validate the user input and the LLM output. In this example, you'll configure two models:
-- Chat model: An Anthropic Claude model hosted on Amazon Bedrock, whose purpose is to hold a conversation with a user
-- Guardrail model: An OpenAI model, whose purpose is to validate the user input and Anthropic Claude output.
+- Chat model: An Anthropic Claude model hosted on Amazon Bedrock whose purpose is to hold a conversation with a user.
+- Guardrail model: An OpenAI model whose purpose is to validate the user input and Anthropic Claude output.

### Step 1: Create a connector for the guardrail model

-First, create a connector to the OpenAI guardrail model. Note that the OpenAI prompt instructs the model to respond only with the words `accept` or `reject`, depending on whether the input/output is acceptable. Additionally, the request contains the `response_filter` parameter that specifies the field in which the guardrail model will provide the validation result:
+First, create a connector to the OpenAI guardrail model. Note that the OpenAI prompt instructs the model to respond only with the words `accept` or `reject`, depending on whether the input/output is acceptable. Additionally, the request contains the `response_filter` parameter, which specifies the field in which the guardrail model will provide the validation result:

```json
POST /_plugins/_ml/connectors/_create
@@ -346,7 +346,7 @@ POST /_plugins/_ml/connectors/_create
```
{% include copy-curl.html %}
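For context on `response_filter`: if the connector calls a chat-completions-style API, the guardrail verdict is nested inside the response body, and the filter is a dot path that extracts it. The response shape below and the path `$.choices[0].message.content` are assumptions for illustration; match them to your connector's actual response:

```json
{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "accept"
      },
      "finish_reason": "stop"
    }
  ]
}
```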

-The response contains the connector ID that you'll use in the next steps:
+The response contains the connector ID used in the next steps:

```json
{
@@ -367,7 +367,7 @@ POST /_plugins/_ml/model_groups/_register
```
{% include copy-curl.html %}

-The response contains the model group ID that you'll use to register a model to this model group:
+The response contains the model group ID used to register a model to this model group:

```json
{
```

@@ -430,7 +430,7 @@ When the operation is complete, the state changes to `COMPLETED`:

### Step 4 (Optional): Test the guardrail model

-You can test the guardrail model user input validation by sending requests that do an do not contain offensive words.
+You can test the guardrail model user input validation by sending requests that do and do not contain offensive words.
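Such tests go through the standard Predict API. A minimal sketch of the request shape (the model ID is a placeholder, and the `question` parameter name is an assumption that must match the connector's request body template):

```json
POST /_plugins/_ml/models/<guardrail_model_id>/_predict
{
  "parameters": {
    "question": "hello"
  }
}
```

If the input trips a guardrail, the predict request is rejected instead of being forwarded to the model.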

First, send a request that does not contain offensive words:

@@ -538,7 +538,7 @@

```json
POST /_plugins/_ml/connectors/_create
```
{% include copy-curl.html %}

-The response contains the connector ID that you'll use in the next steps:
+The response contains the connector ID used in the next steps:

```json
{
```

@@ -548,7 +548,7 @@

### Step 6: Register and deploy the chat model with guardrails

-To register and deploy the Anthropic Claude chat model, send the following request. Note that the `guardrails` object contains a `response_validation_regex` parameter, which specifies to only treat the input/output as valid if the guardrail model responds with a variant of the word `accept`:
+To register and deploy the Anthropic Claude chat model, send the following request. Note that the `guardrails` object contains a `response_validation_regex` parameter that specifies to only treat the input/output as valid if the guardrail model responds with a variant of the word `accept`:

```json
POST /_plugins/_ml/models/_register?deploy=true
```

@@ -585,7 +585,7 @@ OpenSearch returns the task ID of the register operation and the model ID of the
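For reference, the `guardrails` portion of a register request that uses a guardrail model follows the field reference in `register-model.md`. A sketch under stated assumptions (the model ID, response filter path, and regular expression here are illustrative, not values from this commit):

```json
"guardrails": {
  "type": "model",
  "input_guardrail": {
    "model_id": "<guardrail_model_id>",
    "response_filter": "$.choices[0].message.content",
    "response_validation_regex": "^\\s*[Aa]ccept\\s*$"
  },
  "output_guardrail": {
    "model_id": "<guardrail_model_id>",
    "response_filter": "$.choices[0].message.content",
    "response_validation_regex": "^\\s*[Aa]ccept\\s*$"
  }
}
```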

### Step 7 (Optional): Test the chat model with guardrails

-You can test the Anthropic Claude chat model with guardrails by sending predict requests that do an do not contain offensive words.
+You can test the Anthropic Claude chat model with guardrails by sending predict requests that do and do not contain offensive words.

First, send a request that does not contain offensive words:

