From 54161d8576160cbe787237b5eced56f9c6cdd584 Mon Sep 17 00:00:00 2001
From: YANGDB
Date: Mon, 7 Oct 2024 08:54:08 -0700
Subject: [PATCH] update docs & links & add planned commands docs with process suggestion (#745)

* update docs broken links

Signed-off-by: YANGDB

* add planned commands folder

Signed-off-by: YANGDB

* add ppl proposal docs for future ppl language improvements

Signed-off-by: YANGDB

* update OpenSearch PPL Command Development Process

Signed-off-by: YANGDB

---------

Signed-off-by: YANGDB
---
 .github/ISSUE_TEMPLATE/ppl_command_request.md | 25 +++++
 OpenSearch-PPL-Command-Process.md             | 91 +++++++++++++++++++
 docs/ppl-lang/PPL-Example-Commands.md         | 39 +++++---
 docs/ppl-lang/README.md                       |  4 +
 .../ppl-lang/planning/ppl-fillnull-command.md | 50 ++++++++++
 5 files changed, 196 insertions(+), 13 deletions(-)
 create mode 100644 .github/ISSUE_TEMPLATE/ppl_command_request.md
 create mode 100644 OpenSearch-PPL-Command-Process.md
 create mode 100644 docs/ppl-lang/planning/ppl-fillnull-command.md

diff --git a/.github/ISSUE_TEMPLATE/ppl_command_request.md b/.github/ISSUE_TEMPLATE/ppl_command_request.md
new file mode 100644
index 000000000..5daf66319
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/ppl_command_request.md
@@ -0,0 +1,25 @@
+---
+name: 🎆 PPL Command request
+about: Request a new PPL command or syntax change
+title: '[PPL-Lang]'
+labels: 'enhancement, untriaged'
+assignees: ''
+---
+**Is your feature request related to a problem?**
+A clear and concise description of what the PPL command/syntax change is about and why it is needed, e.g. _I'm always frustrated when [...]_
+
+**What solution would you like?**
+A clear and concise description of what you want to happen.
+ - Add an example of the new / updated syntax
+ - [Optional] Add a suggested [ANTLR](https://www.antlr.org/) grammar
+
+**Add Proposal Document**
+
+Under the [docs/planning](../../docs/ppl-lang/planning) folder, add a dedicated page for your suggested command or syntax change.
+
+_**Example Proposal Document**_
+
+See the [ppl-fillnull-command.md](../../docs/ppl-lang/planning/ppl-fillnull-command.md) example.
+
+**Do you have any additional context?**
+Add any other context or screenshots about the feature request here.
\ No newline at end of file
diff --git a/OpenSearch-PPL-Command-Process.md b/OpenSearch-PPL-Command-Process.md
new file mode 100644
index 000000000..8dbaf39f6
--- /dev/null
+++ b/OpenSearch-PPL-Command-Process.md
@@ -0,0 +1,91 @@
+# OpenSearch PPL Command Development Process
+This document outlines the formal process for proposing and implementing new PPL commands or syntax changes in OpenSearch.
+
+## Phase 1: Proposal
+
+### 1.1 Create GitHub Issue
+
+Start by creating a new GitHub issue using the following [template](.github/ISSUE_TEMPLATE/ppl_command_request.md):
+```
+name: PPL Command request
+about: Request a new PPL command or syntax change
+title: '[PPL-Lang]'
+labels: 'enhancement, untriaged'
+assignees: ''
+---
+
+**Is your feature request related to a problem?**
+A clear and concise description of what the PPL command/syntax change is about and why it is needed, e.g. _I'm always frustrated when [...]_
+
+**What solution would you like?**
+A clear and concise description of what you want to happen.
+- Add an example of the new / updated syntax
+- [Optional] Add a suggested [ANTLR](https://www.antlr.org/) grammar
+
+**Add Proposal Document**
+Under the [docs/planning](../../docs/ppl-lang/planning) folder, add a dedicated page for your suggested command or syntax change.
+
+See the [ppl-fillnull-command.md](../../docs/ppl-lang/planning/ppl-fillnull-command.md) example.
+
+**Do you have any additional context?**
+Add any other context or screenshots about the feature request here.
+```
+
+### 1.2 Create Planning Document PR
+Create a Pull Request that adds a new markdown file under the `docs/ppl-lang/planning` folder. This document should include:
+
+1) Command overview and motivation
+2) Detailed syntax specification
+3) Example usage scenarios
+4) Implementation considerations
+5) Potential limitations or edge cases
+
+## Phase 2: Review and Approval
+
+1) Community members and maintainers review the proposal
+2) Feedback is incorporated into the planning document / PR comments
+3) Proposal is either accepted, rejected, or sent back for revision
+
+## Phase 3: Experimental Implementation
+Once approved, the command enters the experimental phase:
+
+1) Create implementation PR with:
+   - Code changes
+   - Comprehensive test suite
+   - Documentation updates
+
+2) Code is clearly marked as experimental using appropriate annotations
+3) Documentation indicates experimental status
+4) Experimental features are disabled by default in production
+
+## Phase 4: Maturation
+During the experimental phase:
+
+1) Gather user feedback
+2) Address issues and edge cases
+3) Refine implementation and documentation
+4) Regular review of usage and stability
+
+## Phase 5: Formal Integration
+When the command has matured:
+
+1) Create PR to remove experimental status
+2) Update all documentation to reflect stable status
+3) Ensure backward compatibility
+4) Merge into main PPL command set
+
+---
+
+## Best Practices
+
+* Follow existing PPL command patterns and conventions
+* Ensure comprehensive test coverage
+* Provide clear, detailed documentation with examples
+* Consider performance implications
+* Maintain backward compatibility when possible
+
+## Timeline Expectations
+
+* Proposal Review: 1-2 weeks
+* Experimental Phase: 1-3 months
+* Maturation to Formal Integration: Based on community feedback and stability
diff --git a/docs/ppl-lang/PPL-Example-Commands.md b/docs/ppl-lang/PPL-Example-Commands.md
index fbe5f6ace..f68865f75 100644
--- a/docs/ppl-lang/PPL-Example-Commands.md
+++ b/docs/ppl-lang/PPL-Example-Commands.md
@@ -12,7 +12,7 @@
 - `explain simple | describe table`
 #### **Fields**
-[See additional command details](ppl-fields-command)
+[See additional command details](ppl-fields-command.md)
 - `source = table`
 - `source = table | fields a,b,c`
 - `source = table | fields + a,b,c`
@@ -69,7 +69,7 @@ _- **Limitation: new field added by eval command with a function cannot be dropp
 #### **Eval**:
-[See additional command details](ppl-eval-command)
+[See additional command details](ppl-eval-command.md)
 Assumptions: `a`, `b`, `c` are existing fields in `table`
 - `source = table | eval f = 1 | fields a,b,c,f`
@@ -118,7 +118,7 @@ source = table | where ispresent(a) |
 - `source = table | eval a = signum(a) | where a < 0`
 #### **Aggregations**
-[See additional command details](ppl-stats-command)
+[See additional command details](ppl-stats-command.md)
 - `source = table | stats avg(a) `
 - `source = table | where a < 50 | stats avg(c) `
@@ -145,7 +145,7 @@ source = table | where ispresent(a) |
 #### **Dedup**
-[See additional command details](ppl-dedup-command)
+[See additional command details](ppl-dedup-command.md)
 - `source = table | dedup a | fields a,b,c`
 - `source = table | dedup a,b | fields a,b,c`
@@ -162,20 +162,20 @@ source = table | where ispresent(a) |
 - `source = table | dedup 1 a consecutive=true| fields a,b,c` (Consecutive deduplication is unsupported)
 #### **Rare**
-[See additional command details](ppl-rare-command)
+[See additional command details](ppl-rare-command.md)
 - `source=accounts | rare gender`
 - `source=accounts | rare age by gender`
 #### **Top**
-[See additional command details](ppl-top-command)
+[See additional command details](ppl-top-command.md)
 - `source=accounts | top gender`
 - `source=accounts | top 1 gender`
 - `source=accounts | top 1 age by gender`
 #### **Parse**
-[See additional command details](ppl-parse-command)
+[See additional command details](ppl-parse-command.md)
 - `source=accounts | parse email '.+@(?<host>.+)' | fields email, host `
 - `source=accounts | parse email '.+@(?<host>.+)' | top 1 host `
@@ -186,7 +186,7 @@ source = table | where ispresent(a) |
 - Limitation: [see limitations](ppl-parse-command.md#limitations)
 #### **Grok**
-[See additional command details](ppl-grok-command)
+[See additional command details](ppl-grok-command.md)
 - `source=accounts | grok email '.+@%{HOSTNAME:host}' | top 1 host`
 - `source=accounts | grok email '.+@%{HOSTNAME:host}' | stats count() by host`
@@ -200,7 +200,7 @@ source = table | where ispresent(a) |
 - [see limitations](ppl-parse-command.md#limitations)
 #### **Patterns**
-[See additional command details](ppl-patterns-command)
+[See additional command details](ppl-patterns-command.md)
 - `source=accounts | patterns email | fields email, patterns_field `
 - `source=accounts | patterns email | where age > 45 | sort - age | fields email, patterns_field`
@@ -209,14 +209,14 @@ source = table | where ispresent(a) |
 - Limitation: [see limitations](ppl-parse-command.md#limitations)
 #### **Rename**
-[See additional command details](ppl-rename-command)
+[See additional command details](ppl-rename-command.md)
 - `source=accounts | rename email as user_email | fields id, user_email`
 - `source=accounts | rename id as user_id, email as user_email | fields user_id, user_email`
 #### **Join**
-[See additional command details](ppl-join-command)
+[See additional command details](ppl-join-command.md)
 - `source = table1 | inner join left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c`
 - `source = table1 | left join left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c`
@@ -230,7 +230,7 @@ _- **Limitation: sub-searches is unsupported in join right side now**_
 #### **Lookup**
-[See additional command details](ppl-lookup-command)
+[See additional command details](ppl-lookup-command.md)
 - `source = table1 | lookup table2 id`
 - `source = table1 | lookup table2 id, name`
@@ -245,7 +245,7 @@ _- **Limitation: "REPLACE" or "APPEND" clause must contain "AS"**_
 #### **InSubquery**
-[See additional command details](ppl-inSubquery-command)
+[See additional command details](ppl-subquery-command.md)
 - `source = outer | where a in [ source = inner | fields b ]`
 - `source = outer | where (a) in [ source = inner | fields b ]`
@@ -353,3 +353,16 @@ source = supplier
 > ppl-correlation-command is an experimental command - it may be removed in future versions
+---
+### Planned Commands:
+
+#### **fillnull**
+
+```sql
+ - `source=accounts | fillnull fields status_code=101`
+ - `source=accounts | fillnull fields request_path='/not_found', timestamp='*'`
+ - `source=accounts | fillnull using field1=101`
+ - `source=accounts | fillnull using field1=concat(field2, field3), field4=2*pi()*field5`
+ - `source=accounts | fillnull using field1=concat(field2, field3), field4=2*pi()*field5, field6 = 'N/A'`
+```
+[See additional command details](planning/ppl-fillnull-command.md)
diff --git a/docs/ppl-lang/README.md b/docs/ppl-lang/README.md
index efbeafe91..16ff636f7 100644
--- a/docs/ppl-lang/README.md
+++ b/docs/ppl-lang/README.md
@@ -89,6 +89,10 @@ For additional examples see the next [documentation](PPL-Example-Commands.md).
 See samples of [PPL queries](PPL-Example-Commands.md)
 ---
+### Planned PPL Commands
+ - [`FillNull`](planning/ppl-fillnull-command.md)
+
+---
 ### PPL Project Roadmap
 [PPL Github Project Roadmap](https://github.com/orgs/opensearch-project/projects/214)
\ No newline at end of file
diff --git a/docs/ppl-lang/planning/ppl-fillnull-command.md b/docs/ppl-lang/planning/ppl-fillnull-command.md
new file mode 100644
index 000000000..a9897fb4d
--- /dev/null
+++ b/docs/ppl-lang/planning/ppl-fillnull-command.md
@@ -0,0 +1,50 @@
+## fillnull syntax proposal
+
+1. **Proposed syntax changes: `null` replacement with the same value in various fields**
+   - `... | fillnull with 0 in field1`
+   - `... | fillnull with 'N/A' in field1, field2, field3`
+   - `... | fillnull with 2*pi() + field1 in field2`
+   - `... | fillnull with concat(field1, field2) in field3, field4`
+   - `... | fillnull with 'N/A'`
+     - incorrect syntax
+   - `... | fillnull with 'N/A' in`
+     - validation error related to missing columns
+
+2. **Proposed syntax changes: `null` replacement with different values in various fields**
+* Currently implemented; does not conform to the previous syntax proposal (`fields` addition)
+   - `... | fillnull fields status_code=101`
+   - `... | fillnull fields request_path='/not_found', timestamp='*'`
+* New syntax proposal
+   - `... | fillnull using field1=101`
+   - `... | fillnull using field1=concat(field2, field3), field4=2*pi()*field5`
+   - `... | fillnull using field1=concat(field2, field3), field4=2*pi()*field5, field6 = 'N/A'`
+   - `... | fillnull using`
+     - validation error related to missing columns
+
+### New syntax definition in ANTLR
+
+```ANTLR
+
+fillnullCommand
+   : FILLNULL (fillNullWithTheSameValue
+   | fillNullWithFieldVariousValues)
+   ;
+
+ fillNullWithTheSameValue
+   : WITH nullReplacement IN nullableField (COMMA nullableField)*
+   ;
+
+ fillNullWithFieldVariousValues
+   : USING nullableField EQUAL nullReplacement (COMMA nullableField EQUAL nullReplacement)*
+   ;
+
+
+ nullableField
+   : fieldExpression
+   ;
+
+ nullReplacement
+   : expression
+   ;
+
+```
\ No newline at end of file
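
Reviewer note: the two proposed `fillnull` forms can be illustrated with a small sketch of their intended row-level behavior. This is an illustration only, not the implementation in this patch. It assumes rows are plain dictionaries, models replacements as literal values (expressions such as `concat(field2, field3)` are not evaluated), and the function names `fillnull_with` and `fillnull_using` are hypothetical. The `ValueError` loosely mirrors the "validation error related to missing columns" cases listed in the proposal.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def fillnull_with(rows: Iterable[Row], replacement: Any, fields: List[str]) -> List[Row]:
    """Sketch of `fillnull with <replacement> in f1, f2, ...` (hypothetical helper).

    The same replacement value is applied to every listed field that is null.
    """
    if not fields:
        # Mirrors `... | fillnull with 'N/A' in` (no columns given).
        raise ValueError("fillnull with ... in requires at least one field")
    targets = set(fields)
    return [
        {k: (replacement if k in targets and v is None else v) for k, v in row.items()}
        for row in rows
    ]


def fillnull_using(rows: Iterable[Row], replacements: Dict[str, Any]) -> List[Row]:
    """Sketch of `fillnull using f1=v1, f2=v2, ...` (hypothetical helper).

    Each field gets its own replacement value when it is null.
    """
    if not replacements:
        # Mirrors `... | fillnull using` (no columns given).
        raise ValueError("fillnull using requires at least one field=value pair")
    return [
        {k: (replacements[k] if k in replacements and v is None else v) for k, v in row.items()}
        for row in rows
    ]


rows = [
    {"status_code": None, "request_path": "/a"},
    {"status_code": 200, "request_path": None},
]
same_value = fillnull_with(rows, "N/A", ["status_code", "request_path"])
per_field = fillnull_using(rows, {"status_code": 101})
```

Non-null values are left untouched in both forms; only the source of the replacement value differs.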