From c7bcb9c671ec435671a100a2aa0c501baaabca43 Mon Sep 17 00:00:00 2001 From: "opensearch-trigger-bot[bot]" <98922864+opensearch-trigger-bot[bot]@users.noreply.github.com> Date: Fri, 16 Aug 2024 16:02:27 +0000 Subject: [PATCH] [Backport 2.0] Propagate DQL changes to 2.10 (#8027) --- _dashboards/discover/dql.md | 374 ++++++++++++++++++++++++++++-------- 1 file changed, 289 insertions(+), 85 deletions(-) diff --git a/_dashboards/discover/dql.md b/_dashboards/discover/dql.md index 92fd5b19a5..c73044a8fe 100644 --- a/_dashboards/discover/dql.md +++ b/_dashboards/discover/dql.md @@ -14,152 +14,356 @@ Dashboards Query Language (DQL) is a simple text-based query language for filter Search term using DQL toolbar in Dashboard -Before you can search data in Dashboards, you must index it. In OpenSearch, the basic unit of data is a JSON document. Within an index, OpenSearch identifies each document using a unique ID. To learn more about indexing in OpenSearch, see [Index data]({{site.url}}{{site.baseurl}}/opensearch/index-data). -{: .note purple} +DQL and query string query (Lucene) language are the two search bar language options in Discover and Dashboards. +{: .tip} -## Searching with terms queries +## Setup -The most basic query specifies the search term, for example: +To follow this tutorial in OpenSearch Dashboards, expand the following setup steps. +
+ + Setup + + {: .text-delta} + +Use the following steps to prepare sample data for querying. + +**Step 1: Set up mappings for the index** + +On the main menu, select **Management** > **Dev Tools** to open Dev Tools. Send the following request to create index mappings: + +```json +PUT testindex +{ + "mappings" : { + "properties" : { + "date" : { + "type" : "date", + "format" : "yyyy-MM-dd" + } + } + } +} ``` -host:www.example.com +{% include copy-curl.html %} + +**Step 2: Ingest the documents into the index** + +In **Dev Tools**, ingest the following documents into the index: + +```json +PUT /testindex/_doc/1 +{ + "title": "The wind rises", + "description": "A biographical film", + "media_type": "film", + "date": "2013-07-20", + "page_views": 100 +} ``` +{% include copy-curl.html %} -To access an object's nested field, list the complete path to the field separated by periods. For example, use the following path to retrieve the `lat` field in the `coordinates` object: +```json +PUT /testindex/_doc/2 +{ + "title": "Gone with the wind", + "description": "A well-known 1939 American epic historical film", + "media_type": "film", + "date": "1939-09-09", + "page_views": 200 +} +``` +{% include copy-curl.html %} +```json +PUT /testindex/_doc/3 +{ + "title": "Chicago: the historical windy city", + "media_type": "article", + "date": "2023-07-29", + "page_views": 300 +} ``` -coordinates.lat:43.7102 +{% include copy-curl.html %} + +```json +PUT /testindex/_doc/4 +{ + "article title": "Wind turbines", + "media_type": "article", + "format": "2*3" +} ``` +{% include copy-curl.html %} + +**Step 3: Create an index pattern** + +Follow these steps to create an index pattern for your index: + +1. On the main menu, select **Management** > **Dashboards Management**. +1. Select **Index patterns** and then **Create index pattern**. +1. In **Index pattern name**, enter `testindex*`. Select **Next step**. +1. In **Time field**, select `I don't want to use the time filter`. +1. 
Select **Create index pattern**. + + +**Step 4: Navigate to Discover and select the index pattern** + +On the main menu, select **Discover**. In the upper-left corner, select `testindex*` from the **Index patterns** dropdown list. The main panel displays the documents in the index, and you can now try out the DQL queries described on this page. + +The [Object fields](#object-fields) and [Nested fields](#nested-fields) sections provide links for additional setup needed to try queries in those sections. +{: .note} +
-DQL supports leading and trailing wildcards, so you can search for any terms that match your pattern, for example: +## Search for terms +By default, DQL searches in the field set as the default field on the index. If the default field is not set, DQL searches all fields. For example, the following query searches for documents containing the words `rises` or `wind` in any of their fields: + +```python +rises wind ``` -host.keyword:*.example.com/* +{% include copy.html %} + +The preceding query matches documents in which any search term appears regardless of the order. By default, DQL combines search terms with an `or`. To learn how to create Boolean expressions containing search terms, see [Boolean operators](#boolean-operators). + +To search for a phrase (an ordered sequence of words), surround your text with quotation marks. For example, the following query searches for the exact text "wind rises": + +```python +"wind rises" ``` +{% include copy.html %} + +Hyphens are reserved characters in Lucene, so if your search term contains hyphens, DQL might prompt you to switch to Lucene syntax. To avoid this, surround your search term with quotation marks in a phrase search or omit the hyphen in a regular search. +{: .tip} + +## Reserved characters -To check whether a field exists or has any data, use a wildcard to see whether Dashboards returns any results,for example: +The following is a list of reserved characters in DQL: +`\`, `(`, `)`, `:`, `<`, `>`, `"`, `*` + +Use a backslash (`\`) to escape reserved characters. For example, to search for an expression `2*3`, specify the query as `2\*3`: + +```plaintext +2\*3 ``` -host.keyword:* +{% include copy.html %} + +## Search in a field + +To search for text in a particular field, specify the field name before the colon: + +```python +title: rises wind ``` +{% include copy.html %} + +The analyzer for the field you're searching parses the query text into tokens and matches documents in which any of the tokens appear. 
-## Searching with Boolean queries +DQL ignores white space characters, so `title:rises wind` and `title: rises wind` are the same. +{: .tip} -To mix and match or combine multiple queries for more refined results, you can use the Boolean operators `and`, `or`, and `not`. DQL is not case sensitive, so `AND` and `and` are the same, for example: +Use wildcards to refer to field names containing spaces. For example, `article*title` matches the `article title` field. +{: .tip} +## Field names + +Specify the field name before the colon. The following table contains example queries with field names. + +Query | Criterion for a document to match | Matching documents from the `testindex` index +:--- | :--- | :--- +`title: wind` | The `title` field contains the word `wind`. | 1, 2 +`title: (wind OR windy)` | The `title` field contains the word `wind` or the word `windy`. | 1, 2, 3 +`title: "wind rises"` | The `title` field contains the phrase `wind rises`. | 1 +`title.keyword: The wind rises` | The `title.keyword` field exactly matches `The wind rises`. | 1 +`title*: wind` | Any field that starts with `title` (for example, `title` and `title.keyword`) contains the word `wind` | 1, 2 +`article*title: wind` | The field that starts with `article` and ends with `title` contains the word `wind`. Matches the field `article title`. | 4 +`description:*` | Documents in which the field `description` exists. | 1, 2 + +## Wildcards + +DQL supports wildcards (`*` only) in both search terms and field names, for example: + +```python +t*le: *wind and rise* ``` -host.keyword:www.example.com and response.keyword:200 +{% include copy.html %} + +## Ranges + +DQL supports numeric inequalities using the `>`, `<`, `>=`, and `<=` operators, for example: + +```python +page_views > 100 and page_views <= 300 ``` +{% include copy.html %} -You also can use multiple Boolean operators in one query, for example: +You can use the range operators on dates. 
For example, the following query searches for documents containing dates within the 2013--2023 range, inclusive: +```python +date >= "2013-01-01" and date < "2024-01-01" ``` -geo.dest:US or response.keyword:200 and host.keyword:www.example.com +{% include copy.html %} + +You can query for "not equal to" by using `not` and the field name, for example: + +```python +not page_views: 100 ``` +{% include copy.html %} -Remember that Boolean operators follow the logical precedence order of `not`, `and`, and `or`, so if you have an expression like the one in the preceding example, `response.keyword:200 and host.keyword:www.example.com` is evaluated first. +Note that the preceding query returns documents in which either the `page_views` field does not contain `100` or the field is not present. To filter by those documents that contain the field `page_views`, use the following query: -To avoid confusion, use parentheses to dictate the order in which you want to evaluate operands. If you want to evaluate `geo.dest:US or response.keyword:200` first, you can use an expression like the following: +```python +page_views:* and not page_views: 100 +``` +{% include copy.html %} + +## Boolean operators +DQL supports the `and`, `or`, and `not` Boolean operators. DQL is not case sensitive, so `AND` and `and` are the same. For example, the following query is a conjunction of two Boolean clauses: + +```python +title: wind and description: epic ``` -(geo.dest:US or response.keyword:200) and host.keyword:www.example.com +{% include copy.html %} + +Boolean operators follow the logical precedence order of `not`, `and`, and `or`, so in the following example, `title: wind and description: epic` is evaluated first: + +```python +media_type: article or title: wind and description: epic ``` +{% include copy.html %} -## Querying dates and ranges +To dictate the order of evaluation, group Boolean clauses in parentheses. 
For example, in the following query, the parenthesized expression is evaluated first: -DQL supports numeric inequalities, for example, `bytes >= 15 and memory < 15`. +```python +(media_type: article or title: wind) and description: epic +``` +{% include copy.html %} -You can use the same method to find a date before or after the date specified in the query. `>` indicates a search for a date after the specified date, and `<` returns dates before the specified date, for example, `@timestamp > "2020-12-14T09:35:33`. +A field prefix applies only to the token that immediately follows the colon. For example, the following query searches for documents in which the `title` field contains `windy` or documents containing the word `historical` in any of their fields: -## Querying nested fields +```python +title: windy or historical +``` +{% include copy.html %} -Searching a document with [nested fields]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/nested/) requires you to specify the full path of the field to be retrieved. In the following example document, the `superheroes` field has nested objects: +To search for documents in which the `title` field contains `windy` or `historical`, group the terms in parentheses: -```json -{ - "superheroes":[ - { - "hero-name": "Superman", - "real-identity": "Clark Kent", - "age": 28 - }, - { - "hero-name": "Batman", - "real-identity": "Bruce Wayne", - "age": 26 - }, - { - "hero-name": "Flash", - "real-identity": "Barry Allen", - "age": 28 - }, - { - "hero-name": "Robin", - "real-identity": "Dick Grayson", - "age": 15 - } - ] -} +```python +title: (windy or historical) ``` {% include copy.html %} -To retrieve documents that match a specific field using DQL, specify the field, for example: +The preceding query is equivalent to `title: windy or title: historical`. + +To negate a query, use the `not` operator. 
For example, the following query searches for documents that contain the word `wind` in the `title` field, are not of the `media_type` `article`, and do not contain `epic` in the `description` field: + +```python +title: wind and not (media_type: article or description: epic) +``` +{% include copy.html %} + +Queries can contain multiple grouping levels, for example: + +```python +title: ((wind or windy) and not rises) +``` +{% include copy.html %} + +## Object fields + +To refer to an object's inner field, list the dot path of the field. + +To index a document containing an object, follow the steps in the [object field type example]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/object/#example). To search the `name` field of the `patient` object, use the following syntax: +```python +patient.name: john ``` -superheroes: {hero-name: Superman} +{% include copy.html %} + +## Nested fields + +To refer to a nested object, list the JSON path of the field. + +To index a document containing a nested object, follow the steps in the [nested field type example]({{site.url}}{{site.baseurl}}/opensearch/supported-field-types/nested/). + +To search the `name` field of the `patients` object, use the following syntax: + +```python +patients: {name: john} ``` {% include copy.html %} -To retrieve documents that match multiple fields, specify all the fields, for example: +To retrieve documents that match multiple fields, specify all the fields. 
For example, consider an additional `status` field in the following document: +```json +{ + "status": "Discharged", + "patients": [ + {"name" : "John Doe", "age" : 56, "smoker" : true}, + {"name" : "Mary Major", "age" : 85, "smoker" : false} + ] +} ``` -superheroes: {hero-name: Superman} and superheroes: {hero-name: Batman} + +To search for a discharged patient whose name is John, specify the `name` and the `status` in the query: + +```python +patients: {name: john} and status: discharged ``` {% include copy.html %} You can combine multiple Boolean and range queries to create a more refined query, for example: -``` -superheroes: {hero-name: Superman and age < 50} +```python +patients: {name: john and smoker: true and age < 57} ``` {% include copy.html %} -## Querying doubly nested objects +## Doubly nested fields -If a document has doubly nested objects (objects nested inside other objects), retrieve a field value by specifying the full path to the field. In the following example document, the `superheroes` object is nested inside the `justice-league` object: +Consider a document with a doubly nested field. 
In this document, both the `patients` and `names` fields are of type `nested`: ```json { -"justice-league": [ -{ -"superheroes":[ -{ -"hero-name": "Superman", -"real-identity": "Clark Kent", -"age": 28 -}, -{ -"hero-name": "Batman", -"real-identity": "Bruce Wayne", -"age": 26 -}, -{ -"hero-name": "Flash", -"real-identity": "Barry Allen", -"age": 28 -}, -{ -"hero-name": "Robin", -"real-identity": "Dick Grayson", -"age": 15 -} -] -} -] + "patients": [ + { + "names": [ + { "name": "John Doe", "age": 56, "smoker": true }, + { "name": "Mary Major", "age": 85, "smoker": false} + ] + } + ] } ``` + +To search the `name` field of the `names` objects nested in `patients`, use the following syntax: + +```python +patients: {names: {name: john}} +``` {% include copy.html %} -The following image shows the query result using the example notation `justice-league.superheroes: {hero-name:Superman}`. +In contrast, consider a document in which the `patients` field is of type `object` but the `names` field is of type `nested`: -DQL query result +```json +{ + "patients": + { + "names": [ + { "name": "John Doe", "age": 56, "smoker": true }, + { "name": "Mary Major", "age": 85, "smoker": false} + ] + } +} +``` + +To search the `name` field of the `names` objects in this document, use the following syntax: + +```python +patients.names: {name: john} +``` +{% include copy.html %} \ No newline at end of file