Is your feature request related to a problem?
When the COUNT function is applied to a multivalued field, it will return the count of all scalar values associated with that field. Consider this table:
x | y
--+----------
1 | {1, 2, 3}
2 | {4, 5}
3 | NULL
COUNT(y) will return 5. Most SQL engines would return 2 in this case, since there are 2 rows containing a value for y.
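The difference between the two counts can be reproduced outside the plugin. The following is a minimal sketch in plain Python, not plugin code: the table above is modelled as a list of tuples, with a list standing in for a multivalued field and None for NULL. The function names are made up for illustration.

```python
# The example table from above: each row is (x, y), where y is a
# multivalued field modelled as a list, or None for NULL.
rows = [
    (1, [1, 2, 3]),
    (2, [4, 5]),
    (3, None),
]


def count_values(rows):
    """Count every scalar value stored in y (the behaviour described above)."""
    return sum(len(y) for _, y in rows if y is not None)


def count_rows_with_value(rows):
    """Count rows where y is not NULL (what most SQL engines report)."""
    return sum(1 for _, y in rows if y is not None)


print(count_values(rows))           # 5
print(count_rows_with_value(rows))  # 2
```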
What solution would you like?
The aggregate requested from the OpenSearch server could be more like:
There are a few forms of COUNT that need to be considered:
COUNT(<FIELD>) - can use an aggregate like above
COUNT(<LITERAL>) - can use an aggregate like above, but use the field _id to count all rows
COUNT(*) - can use an aggregate like above, but use the field _id to count all rows
COUNT(NULL) - will always return 0
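One possible way to express these four forms is sketched below. It assumes a `filter` aggregation wrapping an `exists` query, whose `doc_count` reports the number of documents that have a value for the field, in contrast to `value_count`, which counts every individual value. This is an assumption, not necessarily the aggregate intended in this request, and the helper name `build_count_request` and its parameters are hypothetical. Whether an `exists` query on `_id` matches every document as the list above expects has not been verified here.

```python
import json


def build_count_request(kind, field=None, name="count"):
    """Build a hypothetical search body for one form of COUNT.

    kind is one of "field", "literal", "star" or "null".
    Returns None for COUNT(NULL), which is always 0 and needs no request.
    """
    if kind == "null":
        return None
    if kind == "field":
        if field is None:
            raise ValueError("COUNT(<FIELD>) needs a field name")
        target = field
    elif kind in ("literal", "star"):
        # Per the list above: count all rows by targeting _id (unverified).
        target = "_id"
    else:
        raise ValueError(f"unknown COUNT form: {kind}")
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {name: {"filter": {"exists": {"field": target}}}},
    }


print(json.dumps(build_count_request("field", field="y"), indent=2))
```

With a body of this shape, the row count would be read from `aggregations.count.doc_count` in the response rather than from a `value` key.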
What alternatives have you considered?
The alternative is to keep the existing behaviour. This is likely to produce results that are unexpected for users, especially if they are used to other SQL engines.
Do you have any additional context?
Recently a change was made to support multivalued fields in the SQL plugin. Some of the aggregates may need to be updated to handle the new multivalued fields.
Is it documented what the precise semantics of multivalued fields are for the SQL plugin? I imagine this isn't the only case where passing many values to a function that expects one value is going to be tricky. (count is easy enough, what about max or sum?)
This is also a bug with max/min. Multi-value results are generally not handled well with aggregation because OpenSearch doesn't store these values as arrays like SQL does.