Allow DQL search or publish parsing library to generate query from DQL #9498

Open
Galardolind opened this issue Aug 23, 2023 · 6 comments
Labels
enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc

@Galardolind

Is your feature request related to a problem? Please describe.

Searching with the OpenSearch API requires reading a lot of documentation and hunting for samples, which often ends in copy-pasting a query without really understanding why it is formatted that way.
The documentation exists, but it is split across dozens of pages because of how complex queries can be. As a result, even simple queries are unnecessarily hard to write.

OpenSearch already has DQL, which would greatly simplify these requests.

Example of a simple DQL query, followed by the complex JSON it produces:

system.last_assessment>"2022-08-21T23:15:13.570Z" and version>2 and system.status:P2
{
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "should": [
                    {
                      "range": {
                        "system.last_assessment": {
                          "gt": "2022-08-21T23:15:13.570Z",
                          "time_zone": "Australia/Sydney"
                        }
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              },
              {
                "bool": {
                  "filter": [
                    {
                      "bool": {
                        "should": [{ "range": { "version": { "gt": 2 } } }],
                        "minimum_should_match": 1
                      }
                    },
                    {
                      "bool": {
                        "should": [{ "match": { "system.status": "P2" } }],
                        "minimum_should_match": 1
                      }
                    }
                  ]
                }
              }
            ]
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}
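
To make the gap concrete, here is a minimal sketch of what a client-side translator could look like for the small subset of DQL used above. It is not the OpenSearch Dashboards parser: the names `simpleDqlToDsl`, `clauseToDsl`, and `parseValue` are hypothetical, it only understands `field:value` and comparison clauses joined by `and`, and it emits a flatter `bool`/`filter` structure than the JSON shown above (no `time_zone`, no nested `should` wrappers).

```typescript
// Hypothetical helpers for illustration only; the real DQL grammar in
// OpenSearch Dashboards supports far more (or, not, grouping, wildcards, ...).
type Clause = Record<string, unknown>;

// DQL comparison operators mapped to range-query keys.
const RANGE_OPS: Record<string, string> = { ">": "gt", ">=": "gte", "<": "lt", "<=": "lte" };

function parseValue(raw: string): string | number {
  const trimmed = raw.trim();
  // Quoted values stay strings; bare numeric values become numbers.
  if (trimmed.length >= 2 && trimmed.startsWith('"') && trimmed.endsWith('"')) {
    return trimmed.slice(1, -1);
  }
  const n = Number(trimmed);
  return trimmed !== "" && !Number.isNaN(n) ? n : trimmed;
}

function clauseToDsl(clause: string): Clause {
  const m = /^\s*([\w.]+)\s*(>=|<=|>|<|:)\s*(.+?)\s*$/.exec(clause);
  if (m === null) {
    throw new Error(`Unsupported clause: ${clause}`);
  }
  const [, field, op, raw] = m;
  const value = parseValue(raw);
  if (op === ":") {
    return { match: { [field]: value } };
  }
  return { range: { [field]: { [RANGE_OPS[op]]: value } } };
}

function simpleDqlToDsl(dql: string): Clause {
  // Naive split: a quoted value containing " and " would break this.
  const clauses = dql.split(/\s+and\s+/i).map(clauseToDsl);
  return { query: { bool: { filter: clauses } } };
}

const dsl = simpleDqlToDsl(
  'system.last_assessment>"2022-08-21T23:15:13.570Z" and version>2 and system.status:P2'
);
console.log(JSON.stringify(dsl, null, 2));
```

Even this toy version needs its own tokenizing and value typing, which is why reusing the existing parser would be preferable to every application writing its own.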

Describe the solution you'd like

Export the existing DQL parser as an npm library so that DQL can be integrated into any frontend application.

Or

Add a new full-text query type, query_dsl, that takes a DQL (or similar) query and does the parsing and search in the backend. This would simplify future integrations with OpenSearch, since DQL takes far less effort to learn than the current JSON query syntax.

Example:

GET movies/_search
{
  "query": {
    "query_dsl": {
      "query": "actors:*Gadot and release_date>\"2023-01-01T00:00:00.000Z\" and imdb_score>5"
    }
  }
}

Describe alternatives you've considered

query_string covers part of it but does not allow any comparators.

Additional context

@Galardolind Galardolind added enhancement Enhancement or improvement to existing feature or request untriaged labels Aug 23, 2023
@joshpalis joshpalis added the Search Search query, autocomplete ...etc label Aug 23, 2023
@msfroh
Collaborator

msfroh commented Aug 30, 2023

Just wanted to call out that query_string does have syntax for range queries, though it's a little clunkier.

Your example query would be:

GET movies/_search
{
  "query": {
    "query_string": {
      "query": "actors:*Gadot AND release_date:{\"2023-01-01T00:00:00.000Z\" TO *} AND imdb_score:{5 TO *}"
    }
  }
}
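
For anyone who wants comparators today, the range syntax above can be generated mechanically. The following is a minimal sketch, assuming the usual Lucene conventions that curly braces are exclusive bounds, square brackets are inclusive bounds, and `*` leaves a side unbounded. The helper name `toQueryStringRange` is hypothetical, and the sketch does not escape special characters in field names or values.

```typescript
type Comparator = ">" | ">=" | "<" | "<=";

// Renders `field <op> value` in query_string range syntax.
function toQueryStringRange(field: string, op: Comparator, value: string | number): string {
  // String values are quoted so characters such as ':' in timestamps
  // are not interpreted as query syntax. No further escaping is done.
  const v = typeof value === "number" ? String(value) : `"${value}"`;
  switch (op) {
    case ">":
      return `${field}:{${v} TO *}`;
    case ">=":
      return `${field}:[${v} TO *]`;
    case "<":
      return `${field}:{* TO ${v}}`;
    case "<=":
      return `${field}:[* TO ${v}]`;
  }
}

// Rebuild the movies example from its three clauses.
const queryString = [
  "actors:*Gadot",
  toQueryStringRange("release_date", ">", "2023-01-01T00:00:00.000Z"),
  toQueryStringRange("imdb_score", ">", 5),
].join(" AND ");

const searchBody = { query: { query_string: { query: queryString } } };
console.log(JSON.stringify(searchBody, null, 2));
```

This only papers over the clunkiness on the client side; it does not give the contextual errors or autocomplete that a real DQL parser provides.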

In general, though, a DQL query builder would be pretty nice.

@msfroh
Collaborator

msfroh commented Aug 30, 2023

In order to add this to OpenSearch itself, we would need a way to support the .peg grammar in Java in order to convert to OpenSearch QueryBuilder objects.

@austintlee suggested maybe separating the formal grammar from the Peggy implementation.

Basically:

formal grammar -> Peggy grammar (with suggestions) -> JavaScript parser

but also

formal grammar -> (some Java-friendly parser generator grammar, like ANTLR) -> Java parser

@Galardolind - would you be interested/willing to bring this issue up for discussion at OpenSearch Dashboards office hours? See https://www.meetup.com/opensearch/events/294620421/ for details.

@msfroh msfroh removed the untriaged label Aug 30, 2023
@Galardolind
Author

> Just wanted to call out that query_string does have syntax for range queries, though it's a little clunkier.

It would be great to have that in the query_string documentation. I also agree that DQL would simplify things much further, to the point where there is almost nothing to learn, since it reads close to natural language.

> @Galardolind - would you be interested/willing to bring this issue up for discussion at OpenSearch Dashboards office hours? See https://www.meetup.com/opensearch/events/294620421/ for details.

Signed up, thanks for the suggestion 👍

@ashwin-pc
Member

ashwin-pc commented Sep 7, 2023

A summary from the Dashboards office hours about this topic:

The motivation for this request is that DQL is a simple language to search with, so it is desirable to have it accessible directly through an API rather than only through OSD's UI. There are a few ways to do this:

  1. Include it as part of the core API for OS so that both OSD and other applications can use it.
  2. Add a new OS plugin that exposes a DQL API so that other applications can use it. The downside is that DQL is part of the minimal distribution of OSD, so unless the plugin can be guaranteed in the min distribution of OS too, we can't switch OSD over to the new API.
  3. Move DQL to a new npm module. The downside is that it's another package to maintain and keep up to date separately.
  4. Use PPL instead. It already has an API that does not need OSD to work.

Personally, I think the easiest solution is to check whether PPL satisfies the requirements and, if not, to identify the gaps. I also like the idea of moving DQL to core, since it makes the backend easier to access with a simplified language that is already supported in Dashboards.
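
As a rough illustration of option 4, the sketch below builds a PPL request for a numeric filter like the one in the movies example. It assumes the PPL plugin is installed and reachable at the `_plugins/_ppl` endpoint, and that a `source=<index> | where <condition>` query is enough for the use case. The helper name `buildPplRequest` and the `PplRequest` shape are hypothetical, conditions are passed through unescaped, and nothing is sent over the network.

```typescript
// Hypothetical request description; only the endpoint path and the
// { query: "..." } body shape are taken from the PPL plugin.
interface PplRequest {
  method: "POST";
  path: string;
  body: { query: string };
}

function buildPplRequest(index: string, conditions: string[]): PplRequest {
  // Conditions are trusted PPL fragments; no escaping or validation is done.
  const where = conditions.length > 0 ? ` | where ${conditions.join(" and ")}` : "";
  return {
    method: "POST",
    path: "/_plugins/_ppl",
    body: { query: `source=${index}${where}` },
  };
}

const pplRequest = buildPplRequest("movies", ["imdb_score > 5"]);
console.log(pplRequest.method, pplRequest.path, JSON.stringify(pplRequest.body));
```

Whether wildcard matches and date comparisons from the DQL examples translate as cleanly is exactly the kind of gap that would need checking.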

Would love to hear opinions from some of the maintainers of this repo about the solutions here. @nknize @dblock @andrross and even folks working on PPL @anirudha

@msfroh
Collaborator

msfroh commented Sep 9, 2023

In the Search Relevance community meeting, @lukas-vlcek made a really good point that Peggy (what DQL uses for its grammar) excels at offering contextual hints on errors and autocomplete suggestions.

If that part, plus the part that converts from DQL into query DSL, were made available to application developers to use in their frontend (essentially option 3 from @ashwin-pc's comment), it could provide a good "power-user" experience.

Adding DQL support to the OpenSearch backend (either as a core feature, a module, or a plugin) might be nice, but we probably wouldn't want to remove support from OSD, because of those nice Peggy features. At that point, we would need to maintain both the Peggy JS implementation in OSD and a Java implementation in the backend, and make sure their grammars are kept in sync (hence @austintlee's suggestion above of deriving both from some canonical grammar).

@tkaur-ds

Hi, are there any plans to implement this in the near future?

@getsaurabh02 getsaurabh02 moved this from 🆕 New to Later (6 months plus) in Search Project Board Aug 15, 2024
Projects
Status: Later (6 months plus)
Development

No branches or pull requests

5 participants