Skip to content

Commit

Permalink
Changing hybrid search section, adding info about aggregations
Browse files Browse the repository at this point in the history
Signed-off-by: Martin Gaievski <[email protected]>
  • Loading branch information
martin-gaievski committed Mar 13, 2024
1 parent 3416537 commit 480f60a
Showing 1 changed file with 196 additions and 0 deletions.
196 changes: 196 additions & 0 deletions _search-plugins/hybrid-search.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ To use hybrid search, follow these steps:
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Configure a search pipeline](#step-4-configure-a-search-pipeline).
1. [Search the index using hybrid search](#step-5-search-the-index-using-hybrid-search).
1. [Enhance results of hybrid search](#step-6-enhance-results-of-hybrid-search).

## Step 1: Create an ingest pipeline

Expand Down Expand Up @@ -216,3 +217,198 @@ The response contains the matching document:
}
}
```

## Step 6: Enhance results of hybrid search

You cam combine hybrid query clause with other OpenSearch query features to enhance search expirience. Following existing clauses can be used together with hybrid query:

Check failure on line 223 in _search-plugins/hybrid-search.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: expirience. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Raw Output: {"message": "[OpenSearch.Spelling] Error: expirience. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_search-plugins/hybrid-search.md", "range": {"start": {"line": 223, "column": 92}}}, "severity": "ERROR"}
- Aggregations

### Aggregations
Aggreations allow you to use OpenSearch as analytics engine. For more detailed information about aggregations, see [`Aggregations`]({{site.url}}{{site.baseurl}}/aggregations/).

Check failure on line 227 in _search-plugins/hybrid-search.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: Aggreations. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Raw Output: {"message": "[OpenSearch.Spelling] Error: Aggreations. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_search-plugins/hybrid-search.md", "range": {"start": {"line": 227, "column": 1}}}, "severity": "ERROR"}

Hybrid query can be combined with any aggregation that is supported by OpenSearch. Aggregation performed on sub-set of documents that is returned by hybrid query, except for [`global`]({{site.url}}{{site.baseurl}}/aggregations/bucket/global)

Check failure on line 229 in _search-plugins/hybrid-search.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.LinksEndSlash] Add a trailing slash to the link '({{site.url}}{{site.baseurl}}/aggregations/bucket/global)'. Raw Output: {"message": "[OpenSearch.LinksEndSlash] Add a trailing slash to the link '({{site.url}}{{site.baseurl}}/aggregations/bucket/global)'.", "location": {"path": "_search-plugins/hybrid-search.md", "range": {"start": {"line": 229, "column": 185}}}, "severity": "ERROR"}
aggreagtion that is done for all documents.

Check failure on line 230 in _search-plugins/hybrid-search.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: aggreagtion. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Raw Output: {"message": "[OpenSearch.Spelling] Error: aggreagtion. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_search-plugins/hybrid-search.md", "range": {"start": {"line": 230, "column": 1}}}, "severity": "ERROR"}

To illustrate how aggreagtions can be used with hybrid query first let's create index. Aggreagtions typically used with a special field types, like `keyword` or `integer`. Following example will create index with fields of these types:

Check failure on line 232 in _search-plugins/hybrid-search.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: aggreagtions. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Raw Output: {"message": "[OpenSearch.Spelling] Error: aggreagtions. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_search-plugins/hybrid-search.md", "range": {"start": {"line": 232, "column": 19}}}, "severity": "ERROR"}

Check failure on line 232 in _search-plugins/hybrid-search.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: Aggreagtions. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Raw Output: {"message": "[OpenSearch.Spelling] Error: Aggreagtions. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_search-plugins/hybrid-search.md", "range": {"start": {"line": 232, "column": 88}}}, "severity": "ERROR"}

```json
PUT /my-nlp-index
{
"settings": {
"number_of_shards": 2
},
"mappings": {
"properties": {
"doc_index": {
"type": "integer"
},
"doc_keyword": {
"type": "keyword"
},
"category": {
"type": "keyword"
}
}
}
}
```

Following requests will ingest 6 documents to our new index:

```json
POST /_bulk
{ "index": { "_index": "my-nlp-index" } }
{ "category": "permission", "doc_keyword": "workable", "doc_index": 4976, "doc_price": 100}
{ "index": { "_index": "my-nlp-index" } }
{ "category": "sister", "doc_keyword": "angry", "doc_index": 2231, "doc_price": 200 }
{ "index": { "_index": "my-nlp-index" } }
{ "category": "hair", "doc_keyword": "likeable", "doc_price": 25 }
{ "index": { "_index": "my-nlp-index" } }
{ "category": "editor", "doc_index": 9871, "doc_price": 30 }
{ "index": { "_index": "my-nlp-index" } }
{ "category": "statement", "doc_keyword": "entire", "doc_index": 8242, "doc_price": 350 }
{ "index": { "_index": "my-nlp-index" } }
{ "category": "statement", "doc_keyword": "idea", "doc_index": 5212, "doc_price": 200 }
{ "index": { "_index": "index-test" } }
{ "category": "editor", "doc_keyword": "bubble", "doc_index": 1298, "doc_price": 130 }
{ "index": { "_index": "index-test" } }
{ "category": "editor", "doc_keyword": "bubble", "doc_index": 521, "doc_price": 75 }
```


The following example request combines hybrid query with min aggregation:

```json
GET /my-nlp-index/_search?search_pipeline=nlp-search-pipeline
{
"query": {
"hybrid": {
"queries": [
{
"term": {
"category": "permission"
}
},
{
"bool": {
"should": [
{
"term": {
"category": "editor"
}
},
{
"term": {
"category": "statement"
}
}
]
}
}
]
}
},
"aggs": {
"total_price": {
"sum": {
"field": "doc_price"
}
},
"keywords": {
"terms": {
"field": "doc_keyword",
"size": 10
}
}
}
}
```

The response contains the matching document and results of aggregations:
```json
{
"took": 9,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 2,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 4,
"relation": "eq"
},
"max_score": 0.5,
"hits": [
{
"_index": "my-nlp-index",
"_id": "mHRPNY4BlN82W_Ar9UMY",
"_score": 0.5,
"_source": {
"doc_price": 100,
"doc_index": 4976,
"doc_keyword": "workable",
"category": "permission"
}
},
{
"_index": "my-nlp-index",
"_id": "m3RPNY4BlN82W_Ar9UMY",
"_score": 0.5,
"_source": {
"doc_price": 30,
"doc_index": 9871,
"category": "editor"
}
},
{
"_index": "my-nlp-index",
"_id": "nXRPNY4BlN82W_Ar9UMY",
"_score": 0.5,
"_source": {
"doc_price": 200,
"doc_index": 5212,
"doc_keyword": "idea",
"category": "statement"
}
},
{
"_index": "my-nlp-index",
"_id": "nHRPNY4BlN82W_Ar9UMY",
"_score": 0.5,
"_source": {
"doc_price": 350,
"doc_index": 8242,
"doc_keyword": "entire",
"category": "statement"
}
}
]
},
"aggregations": {
"total_price": {
"value": 680.0
},
"doc_keywords": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "entire",
"doc_count": 1
},
{
"key": "idea",
"doc_count": 1
},
{
"key": "workable",
"doc_count": 1
}
]
}
}
}
```

0 comments on commit 480f60a

Please sign in to comment.