update docs & links & add planned commands docs with process suggestion #745

Merged
merged 4 commits into from
Oct 7, 2024
25 changes: 25 additions & 0 deletions .github/ISSUE_TEMPLATE/ppl_command_request.md
@@ -0,0 +1,25 @@
---
name: 🎆 PPL Command request
about: Request a new PPL command or syntax change
title: '[PPL-Lang]'
labels: 'enhancement, untriaged'
assignees: ''
---
**Is your feature request related to a problem?**
A clear and concise description of what the PPL command/syntax change is about and why it is needed, e.g. _I'm always frustrated when [...]_

**What solution would you like?**
A clear and concise description of what you want to happen.
- Add an example of the new / updated syntax
- [Optional] Add a suggested [ANTLR](https://www.antlr.org/) grammar

**Add Proposal Document**

Under the [docs/ppl-lang/planning](../../docs/ppl-lang/planning) folder, add a dedicated page for your suggested command or syntax change

_**Example Proposal Document**_

See [ppl-fillnull-command.md](../../docs/ppl-lang/planning/ppl-fillnull-command.md) example

**Do you have any additional context?**
Add any other context or screenshots about the feature request here.
91 changes: 91 additions & 0 deletions OpenSearch-PPL-Command-Process.md
@@ -0,0 +1,91 @@
# OpenSearch PPL Command Development Process
This document outlines the formal process for proposing and implementing new PPL commands or syntax changes in OpenSearch.

## Phase 1: Proposal

### 1.1 Create GitHub Issue

Start by creating a new GitHub issue using the following [template](.github/ISSUE_TEMPLATE/ppl_command_request.md):
```
---
name: PPL Command request
about: Request a new PPL command or syntax change
title: '[PPL-Lang]'
labels: 'enhancement, untriaged'
assignees: ''
---

**Is your feature request related to a problem?**
A clear and concise description of what the PPL command/syntax change is about and why it is needed, e.g. _I'm always frustrated when [...]_

**What solution would you like?**
A clear and concise description of what you want to happen.
- Add an example of the new / updated syntax
- [Optional] Add a suggested [ANTLR](https://www.antlr.org/) grammar

**Add Proposal Document**
Under the [docs/ppl-lang/planning](../../docs/ppl-lang/planning) folder, add a dedicated page for your suggested command or syntax change

See [ppl-fillnull-command.md](../../docs/ppl-lang/planning/ppl-fillnull-command.md) example

**Do you have any additional context?**
Add any other context or screenshots about the feature request here.
```

### 1.2 Create Planning Document PR
> **Review comment (Member):** Are we going to have versioning? Should we consider backward compatibility with older Spark versions? How will this process align with the PPL OpenSearch engine effort?

Create a Pull Request that adds a new markdown file under the `docs/ppl-lang/planning` folder. This document should include:

1) Command overview and motivation
2) Detailed syntax specification
3) Example usage scenarios
4) Implementation considerations
5) Potential limitations or edge cases
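
The five items above lend themselves to a mechanical check before review. The following is an illustrative sketch only and is not part of the repository: `missing_topics` and the keyword table are hypothetical, and because the process does not prescribe exact heading titles, the keywords are assumptions.

```python
import re

# Hypothetical keywords, one entry per required item in the list above.
REQUIRED_TOPICS = {
    "overview": ("overview", "motivation"),
    "syntax": ("syntax",),
    "examples": ("example", "usage"),
    "implementation": ("implementation",),
    "limitations": ("limitation", "edge case"),
}


def missing_topics(markdown_text):
    """Return the required topics that no markdown heading mentions."""
    headings = [
        match.group(1).lower()
        for match in re.finditer(r"^#{1,6}[ \t]+(.*)$", markdown_text, flags=re.MULTILINE)
    ]
    missing = []
    for topic, keywords in REQUIRED_TOPICS.items():
        # A topic counts as covered when any heading contains any of its keywords.
        if not any(keyword in heading for heading in headings for keyword in keywords):
            missing.append(topic)
    return missing
```

A reviewer could run `missing_topics(open(path).read())` on a proposed planning page and ask the author to fill in whatever it returns. Matching on headings alone is a deliberately coarse choice; it cannot judge whether a section is actually complete.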

## Phase 2: Review and Approval

1) Community members and maintainers review the proposal
2) Feedback is incorporated into the planning document / PR comments
3) Proposal is either accepted, rejected, or sent back for revision

## Phase 3: Experimental Implementation
Once approved, the command enters the experimental phase:

1) Create implementation PR with:
- Code changes
- Comprehensive test suite
- Documentation updates

2) Code is clearly marked as experimental using appropriate annotations
3) Documentation indicates experimental status
4) Experimental features are disabled by default in production
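
The process does not say which annotations or switches mark a command as experimental. As a language-neutral illustration of points 2 and 4, the sketch below gates a command handler behind an explicit opt-in. Every name in it (`experimental`, `enable_experimental`, `run_fillnull`) is hypothetical and does not correspond to an existing API in the project.

```python
import functools


class ExperimentalFeatureDisabled(RuntimeError):
    """Raised when an experimental command is used without opting in."""


# Hypothetical opt-in registry: nothing is enabled by default.
_ENABLED = set()


def enable_experimental(name):
    """Explicitly opt in to one experimental command."""
    _ENABLED.add(name)


def experimental(name):
    """Mark a command handler as experimental and refuse to run it unless enabled."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if name not in _ENABLED:
                raise ExperimentalFeatureDisabled(
                    f"command '{name}' is experimental and disabled by default"
                )
            return func(*args, **kwargs)
        wrapper.is_experimental = True  # lets documentation tooling detect the status
        return wrapper
    return decorator


@experimental("fillnull")
def run_fillnull(rows):
    # Placeholder body: a real handler would transform the rows.
    return rows
```

Removing the decorator in a later PR would correspond to Phase 5, where the experimental status is dropped.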

## Phase 4: Maturation
During the experimental phase:

1) Gather user feedback
2) Address issues and edge cases
3) Refine implementation and documentation
4) Regular review of usage and stability

## Phase 5: Formal Integration
When the command has matured:

> **Review comment (Member):** How will we measure whether a command has matured? A command might stay in the experimental phase for months without any user feedback.

1) Create PR to remove experimental status
2) Update all documentation to reflect stable status
3) Ensure backward compatibility
4) Merge into main PPL command set

---

## Best Practices

* Follow existing PPL command patterns and conventions
* Ensure comprehensive test coverage
* Provide clear, detailed documentation with examples
* Consider performance implications
* Maintain backward compatibility when possible

## Timeline Expectations

* Proposal Review: 1-2 weeks
* Experimental Phase: 1-3 months
* Maturation to Formal Integration: Based on community feedback and stability
39 changes: 26 additions & 13 deletions docs/ppl-lang/PPL-Example-Commands.md
@@ -12,7 +12,7 @@
- `explain simple | describe table`

#### **Fields**
-[See additional command details](ppl-fields-command)
+[See additional command details](ppl-fields-command.md)
- `source = table`
- `source = table | fields a,b,c`
- `source = table | fields + a,b,c`
@@ -69,7 +69,7 @@ _- **Limitation: new field added by eval command with a function cannot be dropp


#### **Eval**:
-[See additional command details](ppl-eval-command)
+[See additional command details](ppl-eval-command.md)

Assumptions: `a`, `b`, `c` are existing fields in `table`
- `source = table | eval f = 1 | fields a,b,c,f`
@@ -118,7 +118,7 @@ source = table | where ispresent(a) |
- `source = table | eval a = signum(a) | where a < 0`

#### **Aggregations**
-[See additional command details](ppl-stats-command)
+[See additional command details](ppl-stats-command.md)

- `source = table | stats avg(a) `
- `source = table | where a < 50 | stats avg(c) `
@@ -145,7 +145,7 @@ source = table | where ispresent(a) |

#### **Dedup**

-[See additional command details](ppl-dedup-command)
+[See additional command details](ppl-dedup-command.md)

- `source = table | dedup a | fields a,b,c`
- `source = table | dedup a,b | fields a,b,c`
@@ -162,20 +162,20 @@ source = table | where ispresent(a) |
- `source = table | dedup 1 a consecutive=true| fields a,b,c` (Consecutive deduplication is unsupported)

#### **Rare**
-[See additional command details](ppl-rare-command)
+[See additional command details](ppl-rare-command.md)

- `source=accounts | rare gender`
- `source=accounts | rare age by gender`

#### **Top**
-[See additional command details](ppl-top-command)
+[See additional command details](ppl-top-command.md)

- `source=accounts | top gender`
- `source=accounts | top 1 gender`
- `source=accounts | top 1 age by gender`

#### **Parse**
-[See additional command details](ppl-parse-command)
+[See additional command details](ppl-parse-command.md)

- `source=accounts | parse email '.+@(?<host>.+)' | fields email, host `
- `source=accounts | parse email '.+@(?<host>.+)' | top 1 host `
@@ -186,7 +186,7 @@ source = table | where ispresent(a) |
- Limitation: [see limitations](ppl-parse-command.md#limitations)

#### **Grok**
-[See additional command details](ppl-grok-command)
+[See additional command details](ppl-grok-command.md)

- `source=accounts | grok email '.+@%{HOSTNAME:host}' | top 1 host`
- `source=accounts | grok email '.+@%{HOSTNAME:host}' | stats count() by host`
@@ -200,7 +200,7 @@ source = table | where ispresent(a) |
- [see limitations](ppl-parse-command.md#limitations)

#### **Patterns**
-[See additional command details](ppl-patterns-command)
+[See additional command details](ppl-patterns-command.md)

- `source=accounts | patterns email | fields email, patterns_field `
- `source=accounts | patterns email | where age > 45 | sort - age | fields email, patterns_field`
@@ -209,14 +209,14 @@ source = table | where ispresent(a) |
- Limitation: [see limitations](ppl-parse-command.md#limitations)

#### **Rename**
-[See additional command details](ppl-rename-command)
+[See additional command details](ppl-rename-command.md)

- `source=accounts | rename email as user_email | fields id, user_email`
- `source=accounts | rename id as user_id, email as user_email | fields user_id, user_email`


#### **Join**
-[See additional command details](ppl-join-command)
+[See additional command details](ppl-join-command.md)

- `source = table1 | inner join left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c`
- `source = table1 | left join left = l right = r on l.a = r.a table2 | fields l.a, r.a, b, c`
@@ -230,7 +230,7 @@ _- **Limitation: sub-searches is unsupported in join right side now**_


#### **Lookup**
-[See additional command details](ppl-lookup-command)
+[See additional command details](ppl-lookup-command.md)

- `source = table1 | lookup table2 id`
- `source = table1 | lookup table2 id, name`
@@ -245,7 +245,7 @@ _- **Limitation: "REPLACE" or "APPEND" clause must contain "AS"**_


#### **InSubquery**
-[See additional command details](ppl-inSubquery-command)
+[See additional command details](ppl-subquery-command.md)

- `source = outer | where a in [ source = inner | fields b ]`
- `source = outer | where (a) in [ source = inner | fields b ]`
@@ -353,3 +353,16 @@ source = supplier

> ppl-correlation-command is an experimental command - it may be removed in future versions

---
### Planned Commands:

#### **fillnull**

```sql
source=accounts | fillnull fields status_code=101
source=accounts | fillnull fields request_path='/not_found', timestamp='*'
source=accounts | fillnull using field1=101
source=accounts | fillnull using field1=concat(field2, field3), field4=2*pi()*field5
source=accounts | fillnull using field1=concat(field2, field3), field4=2*pi()*field5, field6 = 'N/A'
```
[See additional command details](planning/ppl-fillnull-command.md)
4 changes: 4 additions & 0 deletions docs/ppl-lang/README.md
@@ -89,6 +89,10 @@ For additional examples see the next [documentation](PPL-Example-Commands.md).
See samples of [PPL queries](PPL-Example-Commands.md)

---
### Planned PPL Commands

- [`FillNull`](planning/ppl-fillnull-command.md)

---
### PPL Project Roadmap
[PPL Github Project Roadmap](https://github.com/orgs/opensearch-project/projects/214)
50 changes: 50 additions & 0 deletions docs/ppl-lang/planning/ppl-fillnull-command.md
@@ -0,0 +1,50 @@
## fillnull syntax proposal

1. **Proposed syntax: replace `null` with the same value in one or more fields**
   - `... | fillnull with 0 in field1`
   - `... | fillnull with 'N/A' in field1, field2, field3`
   - `... | fillnull with 2*pi() + field1 in field2`
   - `... | fillnull with concat(field1, field2) in field3, field4`
   - `... | fillnull with 'N/A'`
     - incorrect syntax
   - `... | fillnull with 'N/A' in`
     - validation error related to missing columns

2. **Proposed syntax: replace `null` with different values in different fields**
   * Currently implemented; does not conform to the previous syntax proposal (it adds the `fields` keyword)
     - `... | fillnull fields status_code=101`
     - `... | fillnull fields request_path='/not_found', timestamp='*'`
   * New syntax proposal
     - `... | fillnull using field1=101`
     - `... | fillnull using field1=concat(field2, field3), field4=2*pi()*field5`
     - `... | fillnull using field1=concat(field2, field3), field4=2*pi()*field5, field6 = 'N/A'`
     - `... | fillnull using`
       - validation error related to missing columns
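
The proposal lists syntax only, so the intended row-level behaviour is left implicit. The sketch below models one plausible reading on plain Python dictionaries, under these assumptions (none of which are stated in the proposal): a field is filled only when its value is `None`, a missing key counts as null, and replacement expressions are evaluated against the original row. The function names are hypothetical, and expressions such as `concat(field2, field3)` are stood in for by Python callables.

```python
def _resolve(replacement, row):
    """A replacement is either a constant or a callable computed from the row."""
    return replacement(row) if callable(replacement) else replacement


def fillnull_with(rows, replacement, fields):
    """Model of `fillnull with <replacement> in <fields>`: one replacement for all fields."""
    if not fields:
        raise ValueError("validation error: missing columns")
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        for field in fields:
            if new_row.get(field) is None:
                new_row[field] = _resolve(replacement, row)
        result.append(new_row)
    return result


def fillnull_using(rows, assignments):
    """Model of `fillnull using f1=<r1>, f2=<r2>`: one replacement per field."""
    if not assignments:
        raise ValueError("validation error: missing columns")
    result = []
    for row in rows:
        new_row = dict(row)
        for field, replacement in assignments.items():
            if new_row.get(field) is None:
                new_row[field] = _resolve(replacement, row)
        result.append(new_row)
    return result
```

For example, `fillnull_with(rows, "N/A", ["field1", "field3"])` mirrors `fillnull with 'N/A' in field1, field3`, and `fillnull_using(rows, {"field1": lambda r: r["field2"] + r["field3"]})` mirrors `fillnull using field1=concat(field2, field3)`.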

### New syntax definition in ANTLR

```ANTLR
fillnullCommand
   : FILLNULL (fillNullWithTheSameValue
   | fillNullWithFieldVariousValues)
   ;

fillNullWithTheSameValue
   : WITH nullReplacement IN nullableField (COMMA nullableField)*
   ;

fillNullWithFieldVariousValues
   : USING nullableField EQUAL nullReplacement (COMMA nullableField EQUAL nullReplacement)*
   ;

nullableField
   : fieldExpression
   ;

nullReplacement
   : expression
   ;
```
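
To see which of the example commands the two grammar alternatives accept, the shape of the rules can be approximated without a parser generator. The sketch below is a hand-written check, not the ANTLR-generated parser: `check_fillnull` and its helpers are hypothetical, it takes only the `fillnull ...` segment of a query, and it treats `fieldExpression` and `expression` as opaque text, splitting only on commas, `=`, and the word `in` that sit outside parentheses and single quotes.

```python
import re


def mask_nested(text):
    """Replace quotes, parentheses, and everything inside them with '_' (same length)."""
    out, depth, in_quote = [], 0, False
    for ch in text:
        if in_quote:
            out.append("_")
            if ch == "'":
                in_quote = False
        elif ch == "'":
            in_quote = True
            out.append("_")
        elif ch in "()":
            depth += 1 if ch == "(" else -1
            out.append("_")
        else:
            out.append("_" if depth > 0 else ch)
    return "".join(out)


def split_commas(text):
    """Split on commas that are outside parentheses and quotes."""
    parts, start = [], 0
    for index, ch in enumerate(mask_nested(text)):
        if ch == ",":
            parts.append(text[start:index].strip())
            start = index + 1
    parts.append(text[start:].strip())
    return parts


def check_fillnull(command):
    """Return the parsed shape of a fillnull command, or raise ValueError."""
    words = command.strip().split(None, 2)
    if len(words) < 2 or words[0].lower() != "fillnull":
        raise ValueError("expected 'fillnull with ...' or 'fillnull using ...'")
    keyword = words[1].lower()
    rest = words[2] if len(words) > 2 else ""
    if keyword == "with":
        # WITH nullReplacement IN nullableField (COMMA nullableField)*
        match = re.search(r"\s+in(?=\s|$)", mask_nested(rest), flags=re.IGNORECASE)
        if match is None:
            raise ValueError("incorrect syntax: missing IN clause")
        replacement = rest[:match.start()].strip()
        fields = split_commas(rest[match.end():])
        if not replacement or not all(fields):
            raise ValueError("validation error: missing columns")
        return ("with", replacement, fields)
    if keyword == "using":
        # USING nullableField EQUAL nullReplacement (COMMA ...)*
        assignments = []
        for part in split_commas(rest):
            eq = mask_nested(part).find("=")
            field, replacement = part[:eq].strip(), part[eq + 1:].strip()
            if eq < 0 or not field or not replacement:
                raise ValueError("validation error: missing columns")
            assignments.append((field, replacement))
        return ("using", assignments)
    raise ValueError("expected WITH or USING after fillnull")
```

Under these assumptions, `fillnull with 'N/A'` is rejected for its missing `in` clause and `fillnull with 'N/A' in` for its missing columns, matching the annotations in the proposal. The currently implemented `fillnull fields ...` form is also rejected, since the proposed grammar has no `FIELDS` alternative.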