Commit

Merge branch 'main' into 7507-collapse-search-results
leanneeliatra authored Jul 16, 2024
2 parents e9f73e0 + a7a7155 commit 4a3effd
Showing 18 changed files with 573 additions and 45 deletions.
1 change: 0 additions & 1 deletion .github/vale/styles/Vocab/OpenSearch/Products/accept.txt
Original file line number Diff line number Diff line change
@@ -76,7 +76,6 @@ Painless
Peer Forwarder
Performance Analyzer
Piped Processing Language
Point in Time
Powershell
Python
PyTorch
43 changes: 43 additions & 0 deletions .github/workflows/pr_checklist.yml
@@ -0,0 +1,43 @@
name: PR Checklist

on:
  pull_request:
    types: [opened]

permissions:
  pull-requests: write

jobs:
  add-checklist:
    runs-on: ubuntu-latest

    steps:
      - name: Comment PR with checklist
        uses: peter-evans/create-or-update-comment@v3
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          issue-number: ${{ github.event.pull_request.number }}
          body: |
            Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.
            Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a [maintainer](https://github.com/opensearch-project/documentation-website/blob/main/MAINTAINERS.md).
            **When you're ready for doc review, tag the assignee of this PR**. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.
      - name: Auto assign PR to repo owner
        uses: actions/github-script@v6
        with:
          script: |
            let assignee = context.payload.pull_request.user.login;
            const prOwners = ['Naarcha-AWS', 'kolchfa-aws', 'vagimeli', 'natebower'];
            if (!prOwners.includes(assignee)) {
              assignee = 'hdhalter'
            }
            github.rest.issues.addAssignees({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              assignees: [assignee]
            });
1 change: 1 addition & 0 deletions .gitignore
@@ -4,4 +4,5 @@ _site
.DS_Store
Gemfile.lock
.idea
*.iml
.jekyll-cache
6 changes: 3 additions & 3 deletions CONTRIBUTING.md
@@ -100,10 +100,10 @@ Follow these steps to set up your local copy of the repository:

#### Troubleshooting

If you encounter an error while trying to build the documentation website, find the error in the following troubleshooting list:
Try the following troubleshooting steps if you encounter an error when trying to build the documentation website:

- When running `rvm install 3.2` if you receive a `Error running '__rvm_make -j10'`, resolve this by running `rvm install 3.2.0 -C --with-openssl-dir=/opt/homebrew/opt/[email protected]` instead of `rvm install 3.2`.
- If receive a `bundle install`: `An error occurred while installing posix-spawn (0.3.15), and Bundler cannot continue.` error when trying to run `bundle install`, resolve this by running `gem install posix-spawn -v 0.3.15 -- --with-cflags=\"-Wno-incompatible-function-pointer-types\"`. Then, run `bundle install`.
- If you see the `Error running '__rvm_make -j10'` error when running `rvm install 3.2`, you can resolve it by running `rvm install 3.2.0 -C --with-openssl-dir=/opt/homebrew/opt/[email protected]` instead of `rvm install 3.2`.
- If you see the `bundle install`: `An error occurred while installing posix-spawn (0.3.15), and Bundler cannot continue.` error when trying to run `bundle install`, you can resolve it by running `gem install posix-spawn -v 0.3.15 -- --with-cflags=\"-Wno-incompatible-function-pointer-types\"` and then `bundle install`.



19 changes: 17 additions & 2 deletions _automating-configurations/api/create-workflow.md
@@ -20,9 +20,9 @@ You can include placeholder expressions in the value of workflow step fields. Fo

Once a workflow is created, provide its `workflow_id` to other APIs.

The `POST` method creates a new workflow. The `PUT` method updates an existing workflow.
The `POST` method creates a new workflow. The `PUT` method updates an existing workflow. You can specify the `update_fields` parameter to update specific fields.

You can only update a workflow if it has not yet been provisioned.
You can only update a complete workflow if it has not yet been provisioned.
{: .note}

## Path and HTTP methods
@@ -58,11 +58,26 @@ POST /_plugins/_flow_framework/workflow?validation=none
```
{% include copy-curl.html %}

You cannot update a full workflow once it has been provisioned, but you can update fields other than the `workflows` field, such as `name` and `description`:

```json
PUT /_plugins/_flow_framework/workflow/<workflow_id>?update_fields=true
{
  "name": "new-template-name",
  "description": "A new description for the existing template"
}
```
{% include copy-curl.html %}

You cannot specify both the `provision` and `update_fields` parameters at the same time.
{: .note}

The following table lists the available query parameters. All query parameters are optional. User-provided parameters are only allowed if the `provision` parameter is set to `true`.

| Parameter | Data type | Description |
| :--- | :--- | :--- |
| `provision` | Boolean | Whether to provision the workflow as part of the request. Default is `false`. |
| `update_fields` | Boolean | Whether to update only the fields included in the request body. Default is `false`. |
| `validation` | String | Whether to validate the workflow. Valid values are `all` (validate the template) and `none` (do not validate the template). Default is `all`. |
| User-provided substitution expressions | String | Parameters matching substitution expressions in the template. Only allowed if `provision` is set to `true`. Optional. If `provision` is set to `false`, you can pass these parameters in the [Provision Workflow API query parameters]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/#query-parameters). |
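
The rule that `provision` and `update_fields` cannot be combined can also be checked on the client before a request is sent. The following Python sketch is one possible way to do that. The `build_workflow_update_path` helper and its arguments are hypothetical and not part of any OpenSearch client. The helper only assembles the request path and does not contact a cluster:

```python
from urllib.parse import quote, urlencode


def build_workflow_update_path(workflow_id, provision=False, update_fields=False):
    """Build the path for a PUT request that updates an existing workflow.

    Hypothetical helper: it mirrors the documented rule that `provision`
    and `update_fields` cannot be specified at the same time.
    """
    if provision and update_fields:
        raise ValueError("`provision` and `update_fields` cannot both be set")

    # Only send parameters that differ from their documented default (false).
    params = {}
    if provision:
        params["provision"] = "true"
    if update_fields:
        params["update_fields"] = "true"

    path = "/_plugins/_flow_framework/workflow/" + quote(workflow_id, safe="")
    return path + "?" + urlencode(params) if params else path


# Example: update only the fields present in the request body.
print(build_workflow_update_path("my-workflow-id", update_fields=True))
```

The returned path can then be passed to whichever HTTP client you use, along with a request body containing only the fields to update.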

25 changes: 12 additions & 13 deletions _ingest-pipelines/processors/split.md
@@ -26,19 +26,18 @@ The following is the syntax for the `split` processor:

The following table lists the required and optional parameters for the `split` processor.

Parameter | Required/Optional | Description |
|-----------|-----------|-----------|
`field` | Required | The field containing the string to be split.
`separator` | Required | The delimiter used to split the string. This can be a regular expression pattern.
`preserve_field` | Optional | If set to `true`, preserves empty trailing fields (for example, `''`) in the resulting array. If set to `false`, empty trailing fields are removed from the resulting array. Default is `false`.
`target_field` | Optional | The field where the array of substrings is stored. If not specified, then the field is updated in-place.
`ignore_missing` | Optional | Specifies whether the processor should ignore documents that do not contain the specified
field. If set to `true`, then the processor ignores missing values in the field and leaves the `target_field` unchanged. Default is `false`.
`description` | Optional | A brief description of the processor.
`if` | Optional | A condition for running the processor.
`ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters an error. If set to `true`, then failures are ignored. Default is `false`.
`on_failure` | Optional | A list of processors to run if the processor fails.
`tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type.
Parameter | Required/Optional | Description
:--- | :--- | :---
`field` | Required | The field containing the string to be split.
`separator` | Required | The delimiter used to split the string. This can be a regular expression pattern.
`preserve_field` | Optional | If set to `true`, preserves empty trailing fields (for example, `''`) in the resulting array. If set to `false`, empty trailing fields are removed from the resulting array. Default is `false`.
`target_field` | Optional | The field where the array of substrings is stored. If not specified, then the field is updated in-place.
`ignore_missing` | Optional | Specifies whether the processor should ignore documents that do not contain the specified field. If set to `true`, then the processor ignores missing values in the field and leaves the `target_field` unchanged. Default is `false`.
`description` | Optional | A brief description of the processor.
`if` | Optional | A condition for running the processor.
`ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters an error. If set to `true`, then failures are ignored. Default is `false`.
`on_failure` | Optional | A list of processors to run if the processor fails.
`tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type.
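
To illustrate how the `separator` pattern and the handling of empty trailing fields interact, the following Python sketch approximates the splitting behavior described in the table. It is a rough emulation and not the processor's implementation. The `split_field` function and its `keep_trailing_empty` argument are hypothetical names, with `keep_trailing_empty` standing in for the behavior described in the `preserve_field` row. The sketch assumes that the separator pattern contains no capture groups:

```python
import re


def split_field(value, separator, keep_trailing_empty=False):
    """Split `value` on the regular expression `separator`.

    Hypothetical emulation: empty strings at the end of the result are
    dropped unless `keep_trailing_empty` is True.
    """
    parts = re.split(separator, value)
    if not keep_trailing_empty:
        # Remove only the empty strings at the end of the list.
        while parts and parts[-1] == "":
            parts.pop()
    return parts


print(split_field("a,b,,", ","))                            # ['a', 'b']
print(split_field("a,b,,", ",", keep_trailing_empty=True))  # ['a', 'b', '', '']
print(split_field("1-2_3", "[-_]"))                         # ['1', '2', '3']
```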

## Using the processor

6 changes: 1 addition & 5 deletions _query-dsl/compound/hybrid.md
@@ -12,11 +12,7 @@ You can use a hybrid query to combine relevance scores from multiple queries int

## Example

Before using a `hybrid` query, you must set up a machine learning (ML) model, ingest documents, and configure a search pipeline with a [`normalization-processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/).

To learn how to set up an ML model, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).

Once you set up an ML model, learn how to use the `hybrid` query by following the steps in [Using hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/#using-hybrid-search).
Learn how to use the `hybrid` query by following the steps in [Using hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/#using-hybrid-search).

For a comprehensive example, follow the [Neural search tutorial]({{site.url}}{{site.baseurl}}/ml-commons-plugin/semantic-search#tutorial).

177 changes: 177 additions & 0 deletions _query-dsl/geo-and-xy/geopolygon.md
@@ -0,0 +1,177 @@
---
layout: default
title: Geopolygon
parent: Geographic and xy queries
grand_parent: Query DSL
nav_order: 30
---

# Geopolygon query

A geopolygon query returns documents containing geopoints that are within the specified polygon. A document containing multiple geopoints matches the query if at least one geopoint matches the query.

A polygon is specified by a list of vertices in coordinate form. Unlike specifying a polygon for a geoshape field, the polygon does not have to be closed (specifying the same point as both the first and last vertex is unnecessary). Though points do not have to follow either clockwise or counterclockwise order, it is recommended that you list them in one of these orders. This will ensure that the correct polygon is captured.

The searched document field must be mapped as `geo_point`.
{: .note}

## Example

Create a mapping with the `point` field mapped as `geo_point`:

```json
PUT /testindex1
{
  "mappings": {
    "properties": {
      "point": {
        "type": "geo_point"
      }
    }
  }
}
```
{% include copy-curl.html %}

Index a geopoint, specifying its latitude and longitude:

```json
PUT testindex1/_doc/1
{
  "point": {
    "lat": 73.71,
    "lon": 41.32
  }
}
```
{% include copy-curl.html %}

Search for documents whose `point` objects are within the specified `geo_polygon`:

```json
GET /testindex1/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_polygon": {
          "point": {
            "points": [
              { "lat": 74.5627, "lon": 41.8645 },
              { "lat": 73.7562, "lon": 42.6526 },
              { "lat": 73.3245, "lon": 41.6189 },
              { "lat": 74.0060, "lon": 40.7128 }
            ]
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

The polygon specified in the preceding request is the quadrilateral depicted in the following image. The matching document is within this quadrilateral. The coordinates of the quadrilateral vertices are specified in `(latitude, longitude)` format.

![Search for points within the specified quadrilateral]({{site.url}}{{site.baseurl}}/images/geopolygon-query.png)

The response contains the matching document:

```json
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "testindex1",
        "_id": "1",
        "_score": 1,
        "_source": {
          "point": {
            "lat": 73.71,
            "lon": 41.32
          }
        }
      }
    ]
  }
}
```

In the preceding search request, you specified the polygon vertices in clockwise order:

```json
"geo_polygon": {
  "point": {
    "points": [
      { "lat": 74.5627, "lon": 41.8645 },
      { "lat": 73.7562, "lon": 42.6526 },
      { "lat": 73.3245, "lon": 41.6189 },
      { "lat": 74.0060, "lon": 40.7128 }
    ]
  }
}
```

Alternatively, you can specify the vertices in counterclockwise order:

```json
"geo_polygon": {
  "point": {
    "points": [
      { "lat": 74.5627, "lon": 41.8645 },
      { "lat": 74.0060, "lon": 40.7128 },
      { "lat": 73.3245, "lon": 41.6189 },
      { "lat": 73.7562, "lon": 42.6526 }
    ]
  }
}
```

The resulting query response contains the same matching document.

However, if you specify the vertices in the following order:

```json
"geo_polygon": {
  "point": {
    "points": [
      { "lat": 74.5627, "lon": 41.8645 },
      { "lat": 74.0060, "lon": 40.7128 },
      { "lat": 73.7562, "lon": 42.6526 },
      { "lat": 73.3245, "lon": 41.6189 }
    ]
  }
}
```

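With this vertex order, the edges of the polygon cross each other, so the indexed point falls outside of the enclosed area and the response returns no results.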
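
One way to see why the vertex order matters is to run a planar even-odd (ray casting) point-in-polygon test on the three vertex lists. The following Python sketch is an illustration only. It treats latitude and longitude as flat coordinates, which is a simplifying assumption, and it is not how OpenSearch evaluates `geo_polygon` queries. The `point_in_polygon` helper is a hypothetical name:

```python
def point_in_polygon(lat, lon, vertices):
    """Even-odd ray casting test on flat (lat, lon) coordinates.

    `vertices` is a list of (lat, lon) tuples. The polygon is closed
    implicitly by connecting the last vertex back to the first.
    """
    inside = False
    n = len(vertices)
    for i in range(n):
        lat_i, lon_i = vertices[i]
        lat_j, lon_j = vertices[(i + 1) % n]
        # Consider only edges that straddle the point's latitude.
        if (lat_i > lat) != (lat_j > lat):
            # Longitude at which the edge crosses that latitude.
            lon_cross = lon_i + (lat - lat_i) * (lon_j - lon_i) / (lat_j - lat_i)
            if lon < lon_cross:
                inside = not inside
    return inside


clockwise = [(74.5627, 41.8645), (73.7562, 42.6526), (73.3245, 41.6189), (74.0060, 40.7128)]
counterclockwise = [(74.5627, 41.8645), (74.0060, 40.7128), (73.3245, 41.6189), (73.7562, 42.6526)]
crossing = [(74.5627, 41.8645), (74.0060, 40.7128), (73.7562, 42.6526), (73.3245, 41.6189)]

print(point_in_polygon(73.71, 41.32, clockwise))         # True
print(point_in_polygon(73.71, 41.32, counterclockwise))  # True
print(point_in_polygon(73.71, 41.32, crossing))          # False
```

In the third list, two edges cross, so the vertices describe a bow-tie shape instead of the intended quadrilateral, and the indexed point lies outside of it.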

## Request fields

Geopolygon queries accept the following fields.

Field | Data type | Description
:--- | :--- | :---
`_name` | String | The name of the filter. Optional.
`validation_method` | String | The validation method. Valid values are `IGNORE_MALFORMED` (accept geopoints with invalid coordinates), `COERCE` (try to coerce coordinates to valid values), and `STRICT` (return an error when coordinates are invalid). Optional. Default is `STRICT`.
`ignore_unmapped` | Boolean | Specifies whether to ignore an unmapped field. If set to `true`, then the query does not return any documents that contain an unmapped field. If set to `false`, then an exception is thrown when the field is unmapped. Optional. Default is `false`.

## Accepted formats

You can specify the geopoint coordinates when indexing a document and searching for documents in any [format]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/geo-point#formats) accepted by the geopoint field type.
2 changes: 1 addition & 1 deletion _query-dsl/geo-and-xy/index.md
@@ -30,7 +30,7 @@ OpenSearch provides the following geographic query types:

- [**Geo-bounding box queries**]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/geo-and-xy/geo-bounding-box/): Return documents with geopoint field values that are within a bounding box.
- [**Geodistance queries**]({{site.url}}{{site.baseurl}}/query-dsl/geo-and-xy/geodistance/): Return documents with geopoints that are within a specified distance from the provided geopoint.
- **Geopolygon queries**: Return documents with geopoints that are within a polygon.
- [**Geopolygon queries**]({{site.url}}{{site.baseurl}}/query-dsl/geo-and-xy/geopolygon/): Return documents containing geopoints that are within a polygon.
- **Geoshape queries**: Return documents that contain:
- Geoshapes and geopoints that have one of four spatial relations to the provided shape: `INTERSECTS`, `DISJOINT`, `WITHIN`, or `CONTAINS`.
- Geopoints that intersect the provided shape.
2 changes: 1 addition & 1 deletion _search-plugins/hybrid-search.md
@@ -12,7 +12,7 @@ Introduced 2.11
Hybrid search combines keyword and neural search to improve search relevance. To implement hybrid search, you need to set up a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. The search pipeline you'll configure intercepts search results at an intermediate stage and applies the [`normalization_processor`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/normalization-processor/) to them. The `normalization_processor` normalizes and combines the document scores from multiple query clauses, rescoring the documents according to the chosen normalization and combination techniques.

**PREREQUISITE**<br>
Before using hybrid search, you must set up a text embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
To follow this example, you must set up a text embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). If you have already generated text embeddings, ingest the embeddings into an index and skip to [Step 4](#step-4-configure-a-search-pipeline).
{: .note}

## Using hybrid search