P1 Antlr grammar #859

Merged 10 commits on Nov 2, 2024
10 changes: 10 additions & 0 deletions ppl-spark-integration/src/main/antlr4/OpenSearchPPLLexer.g4
@@ -37,6 +37,9 @@ KMEANS: 'KMEANS';
AD: 'AD';
ML: 'ML';
FILLNULL: 'FILLNULL';
FLATTEN: 'FLATTEN';
EXPAND: 'EXPAND';
TRENDLINE: 'TRENDLINE';

//Native JOIN KEYWORDS
JOIN: 'JOIN';
@@ -89,6 +92,10 @@ FIELDSUMMARY: 'FIELDSUMMARY';
INCLUDEFIELDS: 'INCLUDEFIELDS';
NULLS: 'NULLS';

//TRENDLINE KEYWORDS
SMA: 'SMA';
WMA: 'WMA';

// ARGUMENT KEYWORDS
KEEPEMPTY: 'KEEPEMPTY';
CONSECUTIVE: 'CONSECUTIVE';
@@ -396,6 +403,9 @@ ISPRESENT: 'ISPRESENT';
BETWEEN: 'BETWEEN';
CIDRMATCH: 'CIDRMATCH';

// Geo Loction
Collaborator

typo: Loction -> Location

GEOIP: 'GEOIP';

// FLOWCONTROL FUNCTIONS
IFNULL: 'IFNULL';
NULLIF: 'NULLIF';
35 changes: 34 additions & 1 deletion ppl-spark-integration/src/main/antlr4/OpenSearchPPLParser.g4
@@ -53,6 +53,9 @@ commands
| renameCommand
| fillnullCommand
| fieldsummaryCommand
| flattenCommand
| expandCommand
| trendlineCommand
;

commandName
@@ -80,16 +83,20 @@ commandName
| PATTERNS
| LOOKUP
| RENAME
| FLATTEN
| EXPAND
| FILLNULL
| FIELDSUMMARY
| TRENDLINE
| GEOIP
Member

GEOIP is not a command name

;

searchCommand
: (SEARCH)? fromClause # searchFrom
| (SEARCH)? fromClause logicalExpression # searchFromFilter
| (SEARCH)? logicalExpression fromClause # searchFilterFrom
;

fieldsummaryCommand
: FIELDSUMMARY (fieldsummaryParameter)*
;
@@ -246,6 +253,26 @@ fillnullCommand
: expression
;

flattenCommand
: FLATTEN fieldExpression
;
Comment on lines +256 to +257
Collaborator

nit: is the indent level correct? (same for below)

Member Author

Yes, it is the same as in the rules that follow: kmeansCommand, adCommand, and so on.

Member

There is no right or wrong between 3 spaces and 4 spaces as the indent level, but all indent levels should be the same.


expandCommand
: EXPAND fieldExpression
Collaborator

Do we use expand instead of expand_field? (ref)

Member Author

yes

;

trendlineCommand
: TRENDLINE (SORT sortField)? trendlineClause (trendlineClause)*
;

trendlineClause
: trendlineType LT_PRTHS numberOfDataPoints = integerLiteral COMMA field = fieldExpression RT_PRTHS AS alias = fieldExpression
;

trendlineType
: SMA
| WMA
;

kmeansCommand
: KMEANS (kmeansParameter)*
@@ -430,6 +457,7 @@ primaryExpression
: evalFunctionCall
| fieldExpression
| literalValue
| geoipFunctionCall
;

positionFunction
@@ -516,6 +544,11 @@ dataTypeFunctionCall
: CAST LT_PRTHS expression AS convertedDataType RT_PRTHS
;

// geoip function
geoipFunctionCall
: GEOIP LT_PRTHS (datasource = functionArg COMMA)? ipAddress = functionArg (COMMA properties = stringLiteral)? RT_PRTHS
Member

Why not reuse evalFunctionCall? I don't see any special syntax defined here, and an IP function does not sound like one of the primary expressions.
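
For illustration only, a minimal sketch of what routing GEOIP through the generic function-call path might look like. The rule names `evalFunctionName` and `geoipFunctionName`, and the shape given to `evalFunctionCall`, are assumptions and are not taken from this diff; only `GEOIP`, `LT_PRTHS`, `functionArgs`, and `RT_PRTHS` appear in it.

```antlr
// Hypothetical sketch, not the merged grammar.
// Assumes evalFunctionCall has the same generic shape as booleanFunctionCall.
evalFunctionCall
    : evalFunctionName LT_PRTHS functionArgs RT_PRTHS
    ;

// Assumed rule name: the set of names accepted by the generic call path.
evalFunctionName
    : geoipFunctionName
    // | ... existing function-name alternatives would stay here
    ;

// Assumed rule name: wraps the GEOIP token added to the lexer in this PR.
geoipFunctionName
    : GEOIP
    ;
```

With this shape the parser would no longer label the datasource, ipAddress, and properties arguments separately, so any argument-count or type checks would presumably have to happen after parsing.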

;

// boolean functions
booleanFunctionCall
: conditionFunctionBase LT_PRTHS functionArgs RT_PRTHS