-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Text and Keyword aggregation integration tests #176
base: Integ-newDataTypeForTextAggregations
Are you sure you want to change the base?
Text and Keyword aggregation integration tests #176
Conversation
Signed-off-by: MitchellGale-BitQuill <[email protected]>
Codecov Report
@@ Coverage Diff @@
## Integ-newDataTypeForTextAggregations #176 +/- ##
==========================================================================
- Coverage 98.28% 95.84% -2.44%
- Complexity 3450 3523 +73
==========================================================================
Files 345 358 +13
Lines 8580 9442 +862
Branches 547 676 +129
==========================================================================
+ Hits 8433 9050 +617
- Misses 142 334 +192
- Partials 5 58 +53
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
} }, | ||
"typeTextFieldData" : { | ||
"type": "text", | ||
"fielddata": true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be interesting to know how it works if we have a text
field with "fielddata": true
but doesn't have any fields
defined. Does this behave just like typeTextFieldData
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in 2a8fbac. It passes.
import static org.opensearch.sql.util.MatcherUtils.schema; | ||
import static org.opensearch.sql.util.MatcherUtils.verifySchema; | ||
|
||
public class TextTypeIT extends SQLIntegTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any other tests that you tried? These only look like aggregation tests, but it would be good to know if some of the following would also work:
WHERE field LIKE "keyFD??"
WHERE wildcard("field", "keyFD??")
SELECT field LIKE "keyFD??"
and maybe a couple of string-like functions:SELECT LOCATE("FD", field)
SELECT SUBSTRING(field, 3, 2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also try
SELECT POSITION(substring IN field)
since it seems to fail for text fields (as Margarit demonstrated)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SELECT POSITION(substring IN field)
Passed:
- selectPositionKeyword
- selectPositionText
- selectPositionTextKeywordFieldNoFieldData
- selectPositionTypeTextFieldData
- selectLocateTextDataFieldNoFields
- selectPositionTextDataFieldNoFields
Failed:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check selectPositionTextDataFieldNoFields - its giving you a parser error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, this looks exclusively like an issue with aggregation on text fields.
Strangely, I thought POSITION was going to fail on text fields, since it was failing for Margarit earlier.
@@ -0,0 +1,16 @@ | |||
{"index": {}} | |||
{"typeKeyword": "key00", "typeText": "text00", "typeKeywordFieldData": "keyFD00", "typeTextFieldData": "textFD00", "int0": 0} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
none of these documents have text fields larger than 256, so we aren't testing the ignore_above sub-field. Can we do that? Can we create a field with ignore_above of around 10, and add data (with spaces) larger than 10 characters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2a8fbac Added and it passes.
Signed-off-by: MitchellGale-BitQuill <[email protected]>
{"typeKeyword": "key00", "typeText": "text00", "typeKeywordFieldNoFieldData": "keyword00","typeTextFieldData": "keyFD00", "typeKeywordFieldData": "textFD00", "textDataFieldNoFields": "textFDNF00","int0": 0} | ||
{"index": {}} | ||
{"typeKeyword": "key01", "typeText": "text01", "typeKeywordFieldNoFieldData": "keyword01", "typeTextFieldData": "keyFD01", "typeTextFieldData": "textFD01OverTen", "textDataFieldNoFields": "textFDNF01", "int0": 1} | ||
{"index": {},typeKeywordFieldData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A typo? How it works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved. Issue created opensearch-project#1110
"fields": { | ||
"keyword": { | ||
"type": "keyword", | ||
"ignore_above": 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try adding a row with more than 10 words. Interesting to test this too.
Signed-off-by: MitchellGale-BitQuill <[email protected]>
Signed-off-by: MitchellGale-BitQuill <[email protected]>
import static org.opensearch.sql.util.MatcherUtils.schema; | ||
import static org.opensearch.sql.util.MatcherUtils.verifySchema; | ||
|
||
public class TextTypeIT extends SQLIntegTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, this looks exclusively like an issue with aggregation on text fields.
Strangely, I thought POSITION was going to fail on text fields, since it was failing for Margarit earlier.
aggregateOnText: FAILURE
|
selectPositionText: Failure
|
selectPositionTextKeywordFieldNoFieldData: FAILURE
|
selectPositionTypeTextFieldData: FAILURE
|
|
||
@Test | ||
public void selectPositionKeyword() { | ||
executeJdbcRequest(String.format("select POSITION(key IN typeKeyword) from %s", TEST_INDEX_TEXTKEYWORD)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you need quotes around the text:
POSITION(\\\"key\\\" IN typeKeyword)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved in a47e3ee
schema("SUBSTRING(textDataFieldNoFields, 1, 1)", null, "keyword")); | ||
} | ||
|
||
// protected JSONObject executeQuery(String query) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove commented code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved in 1371d40.
1371d40
to
7bf3753
Compare
Signed-off-by: MitchellGale-BitQuill <[email protected]>
Signed-off-by: MitchellGale-BitQuill <[email protected]>
Signed-off-by: MitchellGale-BitQuill <[email protected]>
Signed-off-by: MitchellGale-BitQuill <[email protected]>
Signed-off-by: MitchellGale-BitQuill <[email protected]>
7bf3753
to
0222df6
Compare
* Import h3 library (#154) Made following changes to make it compatible: 1. Rename package from elasticsearch to opensearch.geospatial 2. Update License headers 3. Update build file 4. Update settings to include sub projects * Use Transport Request (#164) Remove usage of deprecated BaseNodeRequest * Update http client package to resolve build failure (#168) (#171) * Introduce H3 min resolution constant (#165) H3 version 1 has 16 resolutions, numbered 0 through 15. Introduced a constant to represent min value, similar to max value. * Add geohex aggregation (#160) This aggregation will use uber's h3 to group coordinates into H3 cell. Created new aggregation type geohex_grid. The precision will be between 0 and 15. This aggreation has default precision as 5, similar to geohash and geotile. Signed-off-by: Vijayan Balasubramanian <[email protected]> * Add integration test (#176) Included integration test for geohex_grid. Signed-off-by: Vijayan Balasubramanian <[email protected]>
Signed-off-by: MitchellGale-BitQuill [email protected]
Description
Adds integration tests for:
type keyword:
type text:
type keyword with field data false:
type keyword and field data true:
Integration test failure:
Only
aggregateOnText
failsIssues Resolved
Part of investigation for #1038
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.