Commit
Merge branch 'main' into p1-antlr-grammar
# Conflicts: # ppl-spark-integration/src/main/antlr4/OpenSearchPPLLexer.g4 # ppl-spark-integration/src/main/antlr4/OpenSearchPPLParser.g4
Showing 45 changed files with 2,483 additions and 549 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
## PPL `flatten` command | ||
|
||
### Description | ||
Use the `flatten` command to flatten a field of one of the following types: | ||
- `struct<?,?>` | ||
- `array<struct<?,?>>` | ||
|
||
|
||
### Syntax | ||
`flatten <field>` | ||
|
||
* field: the field to be flattened. It must be of a supported type. | ||
|
||
### Test table | ||
#### Schema | ||
| col\_name | data\_type | | ||
|-----------|-------------------------------------------------| | ||
| \_time | string | | ||
| bridges | array\<struct\<length:bigint,name:string\>\> | | ||
| city | string | | ||
| coor | struct\<alt:bigint,lat:double,long:double\> | | ||
| country | string | | ||
#### Data | ||
| \_time | bridges | city | coor | country | | ||
|---------------------|----------------------------------------------|---------|------------------------|---------------| | ||
| 2024-09-13T12:00:00 | [{801, Tower Bridge}, {928, London Bridge}] | London | {35, 51.5074, -0.1278} | England | | ||
| 2024-09-13T12:00:00 | [{232, Pont Neuf}, {160, Pont Alexandre III}]| Paris | {35, 48.8566, 2.3522} | France | | ||
| 2024-09-13T12:00:00 | [{48, Rialto Bridge}, {11, Bridge of Sighs}] | Venice | {2, 45.4408, 12.3155} | Italy | | ||
| 2024-09-13T12:00:00 | [{516, Charles Bridge}, {343, Legion Bridge}]| Prague | {200, 50.0755, 14.4378}| Czech Republic| | ||
| 2024-09-13T12:00:00 | [{375, Chain Bridge}, {333, Liberty Bridge}] | Budapest| {96, 47.4979, 19.0402} | Hungary | | ||
| 1990-09-13T12:00:00 | NULL | Warsaw | NULL | Poland | | ||
|
||
|
||
|
||
### Example 1: flatten struct | ||
This example shows how to flatten a struct field. | ||
PPL query: | ||
- `source=table | flatten coor` | ||
|
||
| \_time | bridges | city | country | alt | lat | long | | ||
|---------------------|----------------------------------------------|---------|---------------|-----|--------|--------| | ||
| 2024-09-13T12:00:00 | [{801, Tower Bridge}, {928, London Bridge}] | London | England | 35 | 51.5074| -0.1278| | ||
| 2024-09-13T12:00:00 | [{232, Pont Neuf}, {160, Pont Alexandre III}]| Paris | France | 35 | 48.8566| 2.3522 | | ||
| 2024-09-13T12:00:00 | [{48, Rialto Bridge}, {11, Bridge of Sighs}] | Venice | Italy | 2 | 45.4408| 12.3155| | ||
| 2024-09-13T12:00:00 | [{516, Charles Bridge}, {343, Legion Bridge}]| Prague | Czech Republic| 200 | 50.0755| 14.4378| | ||
| 2024-09-13T12:00:00 | [{375, Chain Bridge}, {333, Liberty Bridge}] | Budapest| Hungary | 96 | 47.4979| 19.0402| | ||
| 1990-09-13T12:00:00 | NULL | Warsaw | Poland | NULL| NULL | NULL | | ||
|
||
|
||
|
||
### Example 2: flatten array | ||
|
||
This example shows how to flatten a field that holds an array of structs. | ||
|
||
PPL query: | ||
- `source=table | flatten bridges` | ||
|
||
| \_time | city | coor | country | length | name | | ||
|---------------------|---------|------------------------|---------------|--------|-------------------| | ||
| 2024-09-13T12:00:00 | London | {35, 51.5074, -0.1278} | England | 801 | Tower Bridge | | ||
| 2024-09-13T12:00:00 | London | {35, 51.5074, -0.1278} | England | 928 | London Bridge | | ||
| 2024-09-13T12:00:00 | Paris | {35, 48.8566, 2.3522} | France | 232 | Pont Neuf | | ||
| 2024-09-13T12:00:00 | Paris | {35, 48.8566, 2.3522} | France | 160 | Pont Alexandre III| | ||
| 2024-09-13T12:00:00 | Venice | {2, 45.4408, 12.3155} | Italy | 48 | Rialto Bridge | | ||
| 2024-09-13T12:00:00 | Venice | {2, 45.4408, 12.3155} | Italy | 11 | Bridge of Sighs | | ||
| 2024-09-13T12:00:00 | Prague | {200, 50.0755, 14.4378}| Czech Republic| 516 | Charles Bridge | | ||
| 2024-09-13T12:00:00 | Prague | {200, 50.0755, 14.4378}| Czech Republic| 343 | Legion Bridge | | ||
| 2024-09-13T12:00:00 | Budapest| {96, 47.4979, 19.0402} | Hungary | 375 | Chain Bridge | | ||
| 2024-09-13T12:00:00 | Budapest| {96, 47.4979, 19.0402} | Hungary | 333 | Liberty Bridge | | ||
| 1990-09-13T12:00:00 | Warsaw | NULL | Poland | NULL | NULL | | ||
|
||
|
||
### Example 3: flatten array and struct | ||
This example shows how to flatten multiple fields. | ||
PPL query: | ||
- `source=table | flatten bridges | flatten coor` | ||
|
||
| \_time | city | country | length | name | alt | lat | long | | ||
|---------------------|---------|---------------|--------|-------------------|------|--------|--------| | ||
| 2024-09-13T12:00:00 | London | England | 801 | Tower Bridge | 35 | 51.5074| -0.1278| | ||
| 2024-09-13T12:00:00 | London | England | 928 | London Bridge | 35 | 51.5074| -0.1278| | ||
| 2024-09-13T12:00:00 | Paris | France | 232 | Pont Neuf | 35 | 48.8566| 2.3522 | | ||
| 2024-09-13T12:00:00 | Paris | France | 160 | Pont Alexandre III| 35 | 48.8566| 2.3522 | | ||
| 2024-09-13T12:00:00 | Venice | Italy | 48 | Rialto Bridge | 2 | 45.4408| 12.3155| | ||
| 2024-09-13T12:00:00 | Venice | Italy | 11 | Bridge of Sighs | 2 | 45.4408| 12.3155| | ||
| 2024-09-13T12:00:00 | Prague | Czech Republic| 516 | Charles Bridge | 200 | 50.0755| 14.4378| | ||
| 2024-09-13T12:00:00 | Prague | Czech Republic| 343 | Legion Bridge | 200 | 50.0755| 14.4378| | ||
| 2024-09-13T12:00:00 | Budapest| Hungary | 375 | Chain Bridge | 96 | 47.4979| 19.0402| | ||
| 2024-09-13T12:00:00 | Budapest| Hungary | 333 | Liberty Bridge | 96 | 47.4979| 19.0402| | ||
| 1990-09-13T12:00:00 | Warsaw | Poland | NULL | NULL | NULL | NULL | NULL | |
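
The behaviour shown in the three examples above can be approximated in plain Python. This is only a sketch of the observable semantics, not the Spark implementation: the helper name `flatten`, its `subfields` parameter, and the dict/list modelling of structs and arrays are assumptions made for illustration. How an empty array is handled is not stated in this document, so the sketch simply treats it like NULL.

```python
def flatten(rows, field, subfields):
    """Flatten `field` in every row (hypothetical helper, not a PPL API).

    A struct is modelled as a dict, an array of structs as a list of dicts,
    and NULL as None. `subfields` names the struct members, which a real
    engine would take from the table schema.
    """
    out = []
    for row in rows:
        # Every column except the flattened one is carried over unchanged.
        rest = {k: v for k, v in row.items() if k != field}
        value = row.get(field)
        if isinstance(value, list):
            # Assumption: an empty array is treated like NULL.
            structs = value or [None]
        else:
            structs = [value]
        # One output row per struct; a NULL field keeps the row, with NULL members.
        for struct in structs:
            members = struct or {}
            out.append({**rest, **{name: members.get(name) for name in subfields}})
    return out


rows = [
    {
        "city": "London",
        "bridges": [
            {"length": 801, "name": "Tower Bridge"},
            {"length": 928, "name": "London Bridge"},
        ],
        "coor": {"alt": 35, "lat": 51.5074, "long": -0.1278},
    },
    {"city": "Warsaw", "bridges": None, "coor": None},
]

# Mirrors `source=table | flatten coor`.
by_coor = flatten(rows, "coor", ["alt", "lat", "long"])

# Mirrors `source=table | flatten bridges | flatten coor`.
both = flatten(flatten(rows, "bridges", ["length", "name"]), "coor", ["alt", "lat", "long"])

for row in both:
    print(row)
```

As in the result tables, the flattened members are appended after the remaining columns, an array yields one row per element, and the Warsaw row is kept with NULL members.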
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
## PPL `trendline` command | ||
|
||
### Description | ||
Use the `trendline` command to calculate moving averages of fields. | ||
|
||
|
||
### Syntax | ||
`TRENDLINE [sort <[+|-] sort-field>] SMA(number-of-datapoints, field) [AS alias] [SMA(number-of-datapoints, field) [AS alias]]...` | ||
|
||
* [+|-]: optional. A plus [+] stands for ascending order with NULL/MISSING first; a minus [-] stands for descending order with NULL/MISSING last. **Default:** ascending order with NULL/MISSING first. | ||
* sort-field: mandatory when sorting is used. The field used to sort. | ||
* number-of-datapoints: mandatory. The number of datapoints used to calculate the moving average (must be greater than zero). | ||
* field: mandatory. The name of the field for which the moving average is calculated. | ||
* alias: optional. The name of the resulting column containing the moving average. | ||
|
||
At the moment, only the Simple Moving Average (SMA) type is supported. | ||
|
||
It is calculated as follows: | ||
|
||
f[i]: The value of field 'f' in the i-th data-point | ||
n: The number of data-points in the moving window (period) | ||
t: The current time index | ||
|
||
SMA(t) = (1/n) * Σ(f[i]), where i = t-n+1 to t | ||
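
The formula can be checked with a few lines of Python. This is a minimal sketch, not the command's implementation: the function name `sma` is an assumption, and the choice to return `None` (NULL) for rows that have fewer than `n` datapoints available is inferred from the example outputs in this document.

```python
def sma(values, n):
    """Simple moving average over a window of the last n values (illustrative helper)."""
    if n <= 0:
        raise ValueError("number-of-datapoints must be greater than zero")
    result = []
    for t in range(len(values)):
        if t < n - 1:
            # Not enough datapoints yet: the examples show NULL here.
            result.append(None)
        else:
            # Sum f[i] for i = t-n+1 .. t, then divide by n.
            window = values[t - n + 1 : t + 1]
            result.append(sum(window) / n)
    return result


temperatures = [12, 12, 13, 14, 15]
print(sma(temperatures, 2))  # [None, 12.0, 12.5, 13.5, 14.5]
```

The printed values match the `temp_trend` column of Example 1; applying `sma` with `n = 3` to the reversed list gives the `temp_trend_3` column of Example 2.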
|
||
### Example 1: Calculate simple moving average for a timeseries of temperatures | ||
|
||
The example calculates the simple moving average over temperatures using two datapoints. | ||
|
||
PPL query: | ||
|
||
os> source=t | trendline sma(2, temperature) as temp_trend; | ||
fetched rows / total rows = 5/5 | ||
+-----------+---------+--------------------+----------+ | ||
|temperature|device-id| timestamp|temp_trend| | ||
+-----------+---------+--------------------+----------+ | ||
| 12| 1492|2023-04-06 17:07:...| NULL| | ||
| 12| 1492|2023-04-06 17:07:...| 12.0| | ||
| 13| 256|2023-04-06 17:07:...| 12.5| | ||
| 14| 257|2023-04-06 17:07:...| 13.5| | ||
| 15| 258|2023-04-06 17:07:...| 14.5| | ||
+-----------+---------+--------------------+----------+ | ||
|
||
### Example 2: Calculate simple moving averages for a timeseries of temperatures with sorting | ||
|
||
The example calculates two simple moving averages over temperatures, using two and three datapoints, with rows sorted in descending order by device-id. | ||
|
||
PPL query: | ||
|
||
os> source=t | trendline sort - device-id sma(2, temperature) as temp_trend_2 sma(3, temperature) as temp_trend_3; | ||
fetched rows / total rows = 5/5 | ||
+-----------+---------+--------------------+------------+------------------+ | ||
|temperature|device-id| timestamp|temp_trend_2| temp_trend_3| | ||
+-----------+---------+--------------------+------------+------------------+ | ||
| 15| 258|2023-04-06 17:07:...| NULL| NULL| | ||
| 14| 257|2023-04-06 17:07:...| 14.5| NULL| | ||
| 13| 256|2023-04-06 17:07:...| 13.5| 14.0| | ||
| 12| 1492|2023-04-06 17:07:...| 12.5| 13.0| | ||
| 12| 1492|2023-04-06 17:07:...| 12.0|12.333333333333334| | ||
+-----------+---------+--------------------+------------+------------------+ |