[FEATURE] add Rare PPL command #461

Closed
Tracked by #408
noCharger opened this issue Jul 24, 2024 · 2 comments
Labels: enhancement (New feature or request), Lang:PPL (Pipe Processing Language support)

Comments

@noCharger
Collaborator

noCharger commented Jul 24, 2024

Add PPL rare command described here

Do you have any additional context?

@YANG-DB
Member

YANG-DB commented Aug 6, 2024

@noCharger please add the original PPL command here as context

YANG-DB added the Lang:PPL (Pipe Processing Language support) label Aug 6, 2024
YANG-DB moved this to unknown in PPL Commands Aug 6, 2024
YANG-DB removed the untriaged label Aug 6, 2024
YANG-DB moved this from unknown to Todo in PPL Commands Aug 6, 2024
YANG-DB changed the title from "- Rare" to "[FEATURE] add Rare PPL command" Aug 6, 2024
YANG-DB added the enhancement (New feature or request) label Aug 6, 2024
@LantaoJin
Member

LantaoJin commented Aug 8, 2024

Design Doc

Similar to #536.

Syntax

rare <field-list> [by <by-field-list>]

rare returns at most 10 of the least frequent distinct values of the specified field list, grouped by the group-by fields.
It is equivalent to

source = t | stats count(1) as count by <field-list>+<by-field-list> | sort count | fields <field-list> | head 10

So the Spark Logical Plan should be

Limit 10
+- Sort [count ASC NULLS FIRST] // Note, here is ASC and FIRST
   +- Aggregate [<field-list>+<by-field-list>], [count(1) AS count]
      +- ...
         +- UnresolvedRelation
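
To make the intended semantics concrete, here is a minimal plain-Python sketch of the same pipeline (count per group, sort ascending by count, project the field list, keep the first 10). It is only an illustration, not the Spark implementation: the function name `rare`, its parameters, and the sample rows are hypothetical, and the tie order among equal counts is an artifact of Python's stable sort rather than something the design specifies.

```python
from collections import Counter


def rare(rows, fields, by=(), limit=10):
    """Return up to `limit` least frequent combinations, projected to `fields`.

    Hypothetical helper that mirrors:
    stats count(1) as count by <field-list>+<by-field-list>
      | sort count | fields <field-list> | head 10
    """
    keys = list(fields) + list(by)
    # Aggregate: count rows per (field-list + by-field-list) combination.
    counts = Counter(tuple(row.get(k) for k in keys) for row in rows)
    # Sort ascending by count; sorted() is stable, so ties keep first-seen order.
    ordered = sorted(counts.items(), key=lambda item: item[1])
    # Project to the field list only, then apply the limit.
    return [dict(zip(fields, key[: len(fields)])) for key, _ in ordered[:limit]]


# Illustrative data only.
rows = [
    {"country": "US", "age": 30},
    {"country": "US", "age": 30},
    {"country": "CA", "age": 25},
    {"country": "US", "age": 40},
    {"country": "US", "age": 40},
    {"country": "US", "age": 40},
]
result = rare(rows, ["age"], by=["country"])
```

With this sample, the rarest combination (age 25 in CA, seen once) comes first and the most common one (age 40 in US, seen three times) comes last, which matches the ASC ordering in the plan above.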

YANG-DB self-assigned this Aug 8, 2024
YANG-DB moved this from Todo to In Progress in PPL Commands Aug 13, 2024
YANG-DB added a commit to YANG-DB/opensearch-spark that referenced this issue Aug 14, 2024
top [N] <field-list> [by-clause]

N: number of results to return. Default: 10
field-list: mandatory. comma-delimited list of field names.
by-clause: optional. one or more fields to group the results by.
-------------------------------------------------------------------------------------------

rare <field-list> [by-clause]

field-list: mandatory. comma-delimited list of field names.
by-clause: optional. one or more fields to group the results by.
-------------------------------------------------------------------------------------------
commands:
 - opensearch-project#461
 - opensearch-project#536
Signed-off-by: YANGDB <[email protected]>
YANG-DB added a commit to YANG-DB/opensearch-spark that referenced this issue Aug 14, 2024
YANG-DB added a commit to YANG-DB/opensearch-spark that referenced this issue Aug 14, 2024
YANG-DB added a commit to YANG-DB/opensearch-spark that referenced this issue Aug 15, 2024
YANG-DB added a commit to YANG-DB/opensearch-spark that referenced this issue Aug 15, 2024
YANG-DB added a commit to YANG-DB/opensearch-spark that referenced this issue Aug 15, 2024
YANG-DB moved this from In Progress to InReview in PPL Commands Aug 16, 2024
YANG-DB added a commit that referenced this issue Aug 20, 2024
* Adding support for Rare & Top PPL

top [N] <field-list> [by-clause]

N: number of results to return. Default: 10
field-list: mandatory. comma-delimited list of field names.
by-clause: optional. one or more fields to group the results by.
-------------------------------------------------------------------------------------------

rare <field-list> [by-clause]

field-list: mandatory. comma-delimited list of field names.
by-clause: optional. one or more fields to group the results by.
-------------------------------------------------------------------------------------------
commands:
 - #461
 - #536
Signed-off-by: YANGDB <[email protected]>

* update scala fmt style
Signed-off-by: YANGDB <[email protected]>

Signed-off-by: YANGDB <[email protected]>

* add additional support for `rare` & `top` commands options

Signed-off-by: YANGDB <[email protected]>

* add additional support for `rare` & `top` commands options including top N ...

Signed-off-by: YANGDB <[email protected]>

* update scalafmtAll style format

Signed-off-by: YANGDB <[email protected]>

* remove unrelated agg test
Signed-off-by: YANGDB <[email protected]>

Signed-off-by: YANGDB <[email protected]>

---------

Signed-off-by: YANGDB <[email protected]>
opensearch-trigger-bot bot pushed a commit that referenced this issue Aug 20, 2024
(cherry picked from commit 4af03c2)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
opensearch-trigger-bot bot pushed a commit that referenced this issue Aug 20, 2024
(cherry picked from commit 4af03c2)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
YANG-DB moved this from InReview to Done in PPL Commands Aug 20, 2024
YANG-DB closed this as completed by moving to Done in PPL Commands Aug 20, 2024
vmmusings pushed a commit that referenced this issue Aug 21, 2024
(cherry picked from commit 4af03c2)

Signed-off-by: YANGDB <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
vmmusings pushed a commit that referenced this issue Aug 21, 2024
(cherry picked from commit 4af03c2)

Signed-off-by: YANGDB <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Projects
Status: Done
Development

No branches or pull requests

3 participants