[BugFix] Fix ambiguous exception when type mismatch in fillnull command #960

Merged
merged 1 commit into from
Dec 3, 2024

qianheng-aws
Contributor

Description

Fix ambiguous exception when type mismatch in fillnull command

Related Issues

Resolves #959

Check List

  • Updated documentation (docs/ppl-lang/README.md)
  • Implemented unit tests
  • Implemented tests for combination with other commands
  • Newly added source code should include a copyright header
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -452,10 +453,30 @@ public LogicalPlan visitFillNull(FillNull fillNull, CatalystPlanContext context)
         Seq<NamedExpression> projectExpressions = context.retainAllNamedParseExpressions(p -> (NamedExpression) p);
         // build the plan with the projection step
         context.apply(p -> new org.apache.spark.sql.catalyst.plans.logical.Project(projectExpressions, p));
-        LogicalPlan resultWithoutDuplicatedColumns = context.apply(logicalPlan -> DataFrameDropColumns$.MODULE$.apply(seq(toDrop), logicalPlan));
+        LogicalPlan resultWithoutDuplicatedColumns = context.apply(dropOriginalColumns(p -> p.children().head(), toDrop));
Member

Could you test this patch locally against the latest Spark version to confirm that it holds up as a long-term fix?

Contributor Author

Verified with Spark branch-3.5, which includes this fix: apache/spark#48240.

As expected, that branch throws the ambiguous-reference exception in all cases, and this PR addresses it.
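
To illustrate the kind of ambiguity discussed here, the following is a toy model in plain Java, not Spark or opensearch-spark code. It assumes that fillnull projects a replacement column under the same name as the original, so that dropping the original by name matches two columns, while resolving the name against the child plan's output first (roughly what `p -> p.children().head()` in the diff suggests) matches only one. The `Column` class and both helper methods are hypothetical names introduced for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class FillNullDropSketch {
    /** Toy column: a unique id plus a name that may be duplicated. */
    static final class Column {
        final int id;
        final String name;
        Column(int id, String name) { this.id = id; this.name = name; }
    }

    /** Drops by name from the projected output; fails if the name matches more than one column. */
    static List<Column> dropByName(List<Column> output, String name) {
        long matches = output.stream().filter(c -> c.name.equals(name)).count();
        if (matches > 1) {
            throw new IllegalStateException("AMBIGUOUS_REFERENCE: " + name);
        }
        List<Column> kept = new ArrayList<>();
        for (Column c : output) {
            if (!c.name.equals(name)) kept.add(c);
        }
        return kept;
    }

    /** Resolves the name against the child's output only, then drops the matching ids. */
    static List<Column> dropByChildReference(List<Column> childOutput, List<Column> output, String name) {
        List<Integer> idsToDrop = new ArrayList<>();
        for (Column c : childOutput) {
            if (c.name.equals(name)) idsToDrop.add(c.id);
        }
        List<Column> kept = new ArrayList<>();
        for (Column c : output) {
            if (!idsToDrop.contains(c.id)) kept.add(c);
        }
        return kept;
    }

    public static void main(String[] args) {
        Column original = new Column(1, "age");
        Column filled = new Column(2, "age"); // assumed stand-in for the null-filled replacement column
        List<Column> child = List.of(original);
        List<Column> projected = List.of(original, filled);

        try {
            dropByName(projected, "age");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        List<Column> result = dropByChildReference(child, projected, "age");
        System.out.println("kept column id: " + result.get(0).id);
    }
}
```

In this model the name-based drop fails because both columns are called `age`, whereas the child-resolved drop removes only the original column and keeps the filled one. Whether the real fix works exactly this way is an assumption; the diff above is the authoritative change.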

Member

@YANG-DB YANG-DB left a comment

@qianheng-aws thanks !

@YANG-DB YANG-DB added Lang:PPL Pipe Processing Language support 0.7 labels Dec 2, 2024
@YANG-DB
Member

YANG-DB commented Dec 2, 2024

@qianheng-aws @LantaoJin Which GA Spark version can we expect this to be released with (~3.5?)?
Should we add a notice about this to the 0.7 release?

@LantaoJin
Member

@qianheng-aws @LantaoJin Which GA Spark version can we expect this to be released with (~3.5?)? Should we add a notice about this to the 0.7 release?

This patch should be released in Spark v3.5.4 and v4.0.0. I don't know the release dates. I think this PR resolves the ambiguity issue, so we no longer need a notice in the release.

@LantaoJin LantaoJin merged commit 63222a7 into opensearch-project:main Dec 3, 2024
6 checks passed
kenrickyap pushed a commit to Bit-Quill/opensearch-spark that referenced this pull request Dec 11, 2024
Labels
0.7 Lang:PPL Pipe Processing Language support
Development

Successfully merging this pull request may close these issues.

[BUG] Fillnull command throw AMBIGUOUS_REFERENCE exception
3 participants