[FEATURE] New expand PPL Command #657
Comments
The main objective: given an ip -> productId purchase table, we would like to expand each row into one row per value of a multi-value field.
The following Spark SQL command expands the multi-valued field:
SELECT ipaddress, total_purchases, total_products, exploded_productId
FROM your_table
LATERAL VIEW explode(productId) AS exploded_productId
Notes
This functionality will enhance our ability to analyze product purchases by providing more granular data for each product associated with an IP address.
Example PPL:
source = table | expand productId as product_Id |
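To make the intended row-level behavior concrete, here is a minimal Python sketch of what the Spark SQL query above does to a purchase table. The function name `expand_rows` and the sample rows are hypothetical and only for illustration. Dropping rows whose array is missing or empty mirrors Spark's non-OUTER `LATERAL VIEW explode`; whether the PPL command should behave the same way is an assumption, not something this issue specifies.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def expand_rows(rows: List[Row], field: str, alias: str) -> List[Row]:
    """Emit one output row per value of the multi-value `field`.

    The expanded value is stored under `alias`, and the original
    multi-value column is dropped, as in the SELECT list above.
    """
    expanded: List[Row] = []
    for row in rows:
        # Missing, None, or empty arrays yield no output rows
        # (assumption: same as Spark's non-OUTER LATERAL VIEW explode).
        values = row.get(field) or []
        base = {k: v for k, v in row.items() if k != field}
        for value in values:
            expanded.append({**base, alias: value})
    return expanded


# Hypothetical sample data, not taken from the issue.
purchases = [
    {"ipaddress": "10.0.0.1", "total_purchases": 3, "productId": ["p1", "p2"]},
    {"ipaddress": "10.0.0.2", "total_purchases": 1, "productId": ["p3"]},
    {"ipaddress": "10.0.0.3", "total_purchases": 0, "productId": []},
]

result = expand_rows(purchases, "productId", "product_Id")
for row in result:
    print(row)
```

The call corresponds roughly to `expand productId as product_Id`: the first IP address produces two rows, the second produces one, and the third produces none.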
Add a new PPL expand command that brings array and nested object expansion functionality to PPL.
Is your feature request related to a problem? Please describe.
OpenSearch's Piped Processing Language (PPL) currently lacks an efficient way to expand arrays and nested objects into separate events, similar to SQL's UNNEST or JSON expansion functions. This limitation hinders the analysis of complex data structures, particularly when working with JSON logs or documents containing arrays or nested objects.
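As an illustration of the kind of result being asked for, the following Python sketch expands a JSON log event whose field holds an array of nested objects into one event per element, keeping the parent fields on each event. The function name `expand_event`, the sample log line, and the choice to pass non-array fields through unchanged are all assumptions made for this example; the issue does not define those details.

```python
import json
from typing import Any, Dict, List

Event = Dict[str, Any]


def expand_event(event_json: str, field: str) -> List[Event]:
    """Split one JSON event into one event per element of `field`."""
    event: Event = json.loads(event_json)
    items = event.get(field)
    if not isinstance(items, list):
        # Assumption: events without an array in `field` pass through as-is.
        return [event]
    # Each output event keeps the parent fields and carries a single
    # element (here, a nested object) in place of the whole array.
    return [{**event, field: item} for item in items]


# Hypothetical log line, not taken from the issue.
line = (
    '{"host": "web-1", "errors": '
    '[{"code": 500, "msg": "boom"}, {"code": 404, "msg": "missing"}]}'
)

events = expand_event(line, "errors")
for e in events:
    print(json.dumps(e))
```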
Describe the solution you'd like
We propose adding a new command to PPL that would allow users to expand arrays and nested objects into separate events, similar to SQL's UNNEST function, but with additional flexibility.
The functionality should:
SQL-like example and comparison:
Consider this SQL-like syntax:
The proposed OpenSearch PPL equivalent might look like:
Key differences and advantages:
Describe alternatives you've considered
Current alternatives include:
Additional context
This feature would bridge the gap between SQL's structured data handling and the need for flexible, real-time analysis of semi-structured log data. It combines the power of SQL's UNNEST with the flexibility required for log and event processing.
Potential Impact
Proposed Implementation
The new command (e.g., expand) could be implemented as a new command in the PPL engine, combining the concepts of SQL's UNNEST with the flexibility needed for unstructured log data.