-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[followup] Refactor JSON function and add TO_JSON_STRING, ARRAY_LENGHT functions #870
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -28,6 +28,7 @@ | |
|
||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.ADD; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.ADDDATE; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.ARRAY_LENGTH; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.DATEDIFF; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.DATE_ADD; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.DATE_SUB; | ||
|
@@ -58,6 +59,7 @@ | |
import static org.opensearch.sql.expression.function.BuiltinFunctionName.SYSDATE; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.TIMESTAMPADD; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.TIMESTAMPDIFF; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.TO_JSON_STRING; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.TRIM; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.UTC_TIMESTAMP; | ||
import static org.opensearch.sql.expression.function.BuiltinFunctionName.WEEK; | ||
|
@@ -102,7 +104,9 @@ public interface BuiltinFunctionTransformer { | |
.put(COALESCE, "coalesce") | ||
.put(LENGTH, "length") | ||
.put(TRIM, "trim") | ||
.put(ARRAY_LENGTH, "array_size") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We choose |
||
// json functions | ||
.put(TO_JSON_STRING, "to_json") | ||
.put(JSON_KEYS, "json_object_keys") | ||
.put(JSON_EXTRACT, "get_json_object") | ||
.build(); | ||
|
@@ -126,26 +130,12 @@ public interface BuiltinFunctionTransformer { | |
.put( | ||
JSON_ARRAY_LENGTH, | ||
args -> { | ||
// Check if the input is an array (from json_array()) or a JSON string | ||
if (args.get(0) instanceof UnresolvedFunction) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The current logic in this But imaging following two cases: |
||
// Input is a JSON array | ||
return UnresolvedFunction$.MODULE$.apply("json_array_length", | ||
seq(UnresolvedFunction$.MODULE$.apply("to_json", seq(args), false)), false); | ||
} else { | ||
// Input is a JSON string | ||
return UnresolvedFunction$.MODULE$.apply("json_array_length", seq(args.get(0)), false); | ||
} | ||
return UnresolvedFunction$.MODULE$.apply("json_array_length", seq(args.get(0)), false); | ||
}) | ||
.put( | ||
JSON, | ||
args -> { | ||
// Check if the input is a named_struct (from json_object()) or a JSON string | ||
if (args.get(0) instanceof UnresolvedFunction) { | ||
return UnresolvedFunction$.MODULE$.apply("to_json", seq(args.get(0)), false); | ||
} else { | ||
return UnresolvedFunction$.MODULE$.apply("get_json_object", | ||
seq(args.get(0), Literal$.MODULE$.apply("$")), false); | ||
} | ||
return UnresolvedFunction$.MODULE$.apply("get_json_object", seq(args.get(0), Literal$.MODULE$.apply("$")), false); | ||
}) | ||
.put( | ||
JSON_VALID, | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @LantaoJin did we cover the next issue? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With #780 , we can access the first element in a array by
or
But it requires the input is a json string. If the input is json array (ArrayType), I think we need a new function such as
But as a workaround, we could rewrite by
So keep #675 open until we have There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Let merge this first. From the description of #675 , |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the naming:
to_json
is good name for Spark since there is no json object data type. Soto_json
in Spark is converting StructType to json String. Butto_json
might not a good name for PPL since it's easy to cause misunderstanding as converting something to json object.to_json_string
could be more straightforward.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes sounds like a better distinction ...