[FEATURE]Add fillnull command to PPL #670

Closed
YANG-DB opened this issue Sep 16, 2024 · 2 comments
Labels
0.6 · enhancement (New feature or request) · Lang:PPL (Pipe Processing Language support)

Comments

@YANG-DB
Member

YANG-DB commented Sep 16, 2024

Description:
We propose adding a fillnull command to OpenSearch's Piped Processing Language (PPL) to provide a convenient way to handle null or missing values in query results. This feature would be similar to the fillnull command in Splunk's SPL, enhancing PPL's data cleaning and preparation capabilities.

Proposed Functionality:

  1. The 'fillnull' command should allow users to replace null values with a specified value.
  2. It should support filling nulls for specific fields or all fields.
  3. The command should allow different fill values for different fields.
  4. It should support conditional filling based on other field values or expressions.
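
The first three points above can be sketched in Python as a model of the intended semantics. This is only an illustration over in-memory rows (dictionaries), not the PPL implementation; the function names `fillnull_value` and `fillnull_per_field` are hypothetical, and treating a missing key the same as a null is an assumption taken from the "null or missing values" wording in the description.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def fillnull_value(rows: List[Row], value: Any,
                   fields: Optional[List[str]] = None) -> List[Row]:
    """Model of `fillnull value=<value> [field, ...]` (hypothetical helper).

    With no field list, every field present in a row is considered.
    """
    out = []
    for row in rows:
        targets = fields if fields is not None else list(row.keys())
        new_row = dict(row)  # copy so the input rows are left untouched
        for name in targets:
            # Assumption: a missing field is treated like a null field.
            if new_row.get(name) is None:
                new_row[name] = value
        out.append(new_row)
    return out


def fillnull_per_field(rows: List[Row], replacements: Dict[str, Any]) -> List[Row]:
    """Model of `fillnull field1=<v1> field2=<v2> ...` (hypothetical helper)."""
    out = []
    for row in rows:
        new_row = dict(row)
        for name, value in replacements.items():
            if new_row.get(name) is None:
                new_row[name] = value
        out.append(new_row)
    return out


rows = [{"a": None, "b": 1}, {"a": 2, "b": None}]
print(fillnull_value(rows, 0))
print(fillnull_value(rows, "N/A", ["a"]))
print(fillnull_per_field(rows, {"a": 0, "b": "Unknown"}))
```

Non-null values are never overwritten, and fields outside the given list keep their nulls.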

Example Usage:

... | fillnull value=0

This would replace all null values in all fields with 0.

... | fillnull value="N/A" field1, field2

This would replace null values in field1 and field2 with "N/A".

... | fillnull field1=0 field2="Unknown" field3=false

This would fill null values in different fields with different values.

... | eval new_field = if(field1 == "category1", field2, null) | fillnull value=0 new_field

This example uses eval to create a new field (or overwrite an existing one) based on a condition, and then uses fillnull to replace the resulting null values.

...
| eval field1 = if(field1 == "category1", field1, null), field2 = if(field2 == "category2", field2, null)
| fillnull field1=0 field2="Unknown"

This example uses multiple eval expressions to apply different conditions to multiple fields, followed by fillnull with a separate fill value for each field.
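
The eval-then-fillnull pattern can be modelled in Python to show what happens to each row. This is a rough sketch under assumptions, not how PPL executes the pipeline: `eval_if` and `fillnull` are hypothetical helpers, and rows are plain dictionaries.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def eval_if(rows: List[Row], target: str,
            condition: Callable[[Row], bool], then_field: str) -> List[Row]:
    """Model of `eval target = if(<condition>, then_field, null)`."""
    out = []
    for row in rows:
        new_row = dict(row)
        # The target is null whenever the condition does not hold.
        new_row[target] = row.get(then_field) if condition(row) else None
        out.append(new_row)
    return out


def fillnull(rows: List[Row], replacements: Dict[str, Any]) -> List[Row]:
    """Model of `fillnull field=<value> ...` with per-field fill values."""
    out = []
    for row in rows:
        new_row = dict(row)
        for name, value in replacements.items():
            if new_row.get(name) is None:
                new_row[name] = value
        out.append(new_row)
    return out


rows = [
    {"field1": "category1", "field2": 10},
    {"field1": "other", "field2": 20},
    {"field1": "category1", "field2": None},
]

step1 = eval_if(rows, "new_field",
                lambda r: r.get("field1") == "category1", "field2")
result = fillnull(step1, {"new_field": 0})
print([r["new_field"] for r in result])
```

In this model the second row is filled because the condition failed, while the third is filled because field2 was already null; after fillnull the two cases are indistinguishable in new_field.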


Implementation Considerations:

  1. Ensure compatibility with existing PPL commands and syntax
  2. Optimize performance for large datasets with many null values
  3. Provide clear documentation and examples for users
  4. Consider type-checking or type-conversion for filled values
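
One possible shape for the type-checking consideration, sketched in Python. The strict policy shown here (reject a fill value whose type differs from the field's declared type) is an assumption for illustration only; the issue does not say whether mismatches should be rejected or converted. The `check_fill_types` name and the schema representation are hypothetical.

```python
from typing import Any, Dict, List


def check_fill_types(schema: Dict[str, type],
                     replacements: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means every fill value fits.

    `schema` maps field names to their declared Python types (an assumed
    stand-in for the real field types known to the query engine).
    """
    problems = []
    for name, value in replacements.items():
        expected = schema.get(name)
        if expected is None:
            problems.append(f"{name}: unknown field")
        elif type(value) is not expected:
            # Exact type comparison, so that a bool is not accepted as an int.
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


schema = {"field1": int, "field2": str, "field3": bool}
print(check_fill_types(schema, {"field1": 0, "field2": "Unknown", "field3": False}))
print(check_fill_types(schema, {"field1": "N/A"}))
```

A conversion-based policy would instead try to cast the fill value to the field type and report only the values that cannot be cast.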
@YANG-DB YANG-DB added enhancement New feature or request untriaged Lang:PPL Pipe Processing Language support labels Sep 16, 2024
@YANG-DB YANG-DB moved this to Todo in PPL Commands Sep 16, 2024
@salyh salyh moved this from Todo to In Progress in PPL Commands Sep 24, 2024
@dblock dblock removed the untriaged label Oct 7, 2024
@dblock
Member

dblock commented Oct 7, 2024

[Catch All Triage - 1, 2, 3, 4]

@salyh salyh moved this from In Progress to Done in PPL Commands Oct 8, 2024
@YANG-DB YANG-DB added the 0.6 label Oct 9, 2024
@anasalkouz
Member

@salyh @YANG-DB can we close the issue?

@YANG-DB YANG-DB closed this as completed Oct 11, 2024