Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add antlr grammar for testcase file #2

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open

feat: add antlr grammar for testcase file #2

wants to merge 19 commits into from

Conversation

scgkiran
Copy link
Owner

@scgkiran scgkiran commented Sep 2, 2024

Coverage tool output

/extensions/functions_arithmetic_decimal.yaml 		add(dec, dec): 0
/extensions/functions_arithmetic_decimal.yaml 		subtract(dec, dec): 0
/extensions/functions_arithmetic_decimal.yaml 		multiply(dec, dec): 0
/extensions/functions_arithmetic_decimal.yaml 		divide(dec, dec): 0
/extensions/functions_arithmetic_decimal.yaml 		modulus(dec, dec): 0
/extensions/functions_arithmetic_decimal.yaml 		power(dec, dec): 9
. . .
. . .
/extensions/functions_aggregate_decimal_output.yaml 		approx_count_distinct(any): 0
/extensions/functions_aggregate_decimal_output.yaml 		count(any): 0
/extensions/functions_aggregate_decimal_output.yaml 		count(): 0

Total test count: 37, 12/510 function variants are covered

Copy link

github-actions bot commented Sep 2, 2024

ACTION NEEDED

Substrait follows the Conventional Commits
specification
for
release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@scgkiran scgkiran changed the title WIP: feat: add antlr grammar for testcase file feat: add antlr grammar for testcase file Sep 3, 2024
@scgkiran scgkiran marked this pull request as ready for review September 3, 2024 15:54
grammar/SubstraitType.g4 Outdated Show resolved Hide resolved
grammar/SubstraitType.g4 Outdated Show resolved Hide resolved
grammar/SubstraitType.g4 Outdated Show resolved Hide resolved
grammar/SubstraitType.g4 Outdated Show resolved Hide resolved
grammar/SubstraitType.g4 Outdated Show resolved Hide resolved
| PrecisionTimestamp isnull=QMark? Lt precision=numericParameter Gt #precisionTimestamp
| PrecisionTimestampTZ isnull=QMark? Lt precision=numericParameter Gt #precisionTimestampTZ
| Struct isnull=QMark? Lt expr (Comma expr)* Gt #struct
| NStruct isnull=QMark? Lt Identifier expr (Comma Identifier expr)* Gt #nStruct

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see Nstruct in type.proto (here). Where it is coming from ?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

| Decimal isnull=QMark? Lt precision=numericParameter Comma scale=numericParameter Gt #decimal
| PrecisionTimestamp isnull=QMark? Lt precision=numericParameter Gt #precisionTimestamp
| PrecisionTimestampTZ isnull=QMark? Lt precision=numericParameter Gt #precisionTimestampTZ
| Struct isnull=QMark? Lt expr (Comma expr)* Gt #struct

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't struct take type. In proto file it takes type (here)

Suggested change
| Struct isnull=QMark? Lt expr (Comma expr)* Gt #struct
| Struct isnull=QMark? Lt type (Comma type)* Gt #struct

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expr is type expression

| PrecisionTimestampTZ isnull=QMark? Lt precision=numericParameter Gt #precisionTimestampTZ
| Struct isnull=QMark? Lt expr (Comma expr)* Gt #struct
| NStruct isnull=QMark? Lt Identifier expr (Comma Identifier expr)* Gt #nStruct
| List isnull=QMark? Lt expr Gt #list

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as struct. Proto file mentions type (here)

Suggested change
| List isnull=QMark? Lt expr Gt #list
| List isnull=QMark? Lt type Gt #list

| Struct isnull=QMark? Lt expr (Comma expr)* Gt #struct
| NStruct isnull=QMark? Lt Identifier expr (Comma Identifier expr)* Gt #nStruct
| List isnull=QMark? Lt expr Gt #list
| Map isnull=QMark? Lt key=expr Comma value=expr Gt #map

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as struct. Proto file mentions type (here)

Suggested change
| Map isnull=QMark? Lt key=expr Comma value=expr Gt #map
| Map isnull=QMark? Lt key=type Comma value=type Gt #map

Copy link

@anshuldata anshuldata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed it fully. I am still halfway

@@ -0,0 +1,330 @@
token literal names:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how others feel but seeing file names that are Test* makes me think they are test files as opposed to files that are responsible parsing a test file format. Maybe some other naming pattern could make that clearer.

Copy link
Owner Author

@scgkiran scgkiran Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about FuncTestSpec as prefix?
This file will become FuncTestSpecLexer

"i8": "i8",
"i16": "i16",
"i32": "i32",
"i64": "i64",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way we can push these patterns into the parser for use across tools? I think we need them to go from in grammar literal to types. Let's just do that with the grammar.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a change to use rule names, and added a test to assert to ensure names do not change. Grammar is not using simple literals for type names to allow case-insensitive names.

short_type_to_type = {st: lt for lt, st in type_to_short_type.items()}


class Extension:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it isn't clear why we need this stuff. seems like the grammar could handle and move us straight to appropriate objects in many (most?) cases.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to build function registry from extensions which will be used to compute test coverage for each function variant.

tests/coverage/test_coverage.py Outdated Show resolved Hide resolved
tests/coverage/test_coverage.py Outdated Show resolved Hide resolved
@@ -0,0 +1,12 @@
# SPDX-License-Identifier: Apache-2.0
def test_read_substrait_extensions():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should add to github precommit tests.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

tests/coverage/coverage.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants