[HUDI-6089] Handle default insert behaviour to ingest duplicates #10728

wombatu-kun
Contributor

Change Logs

Changed the default value of "hoodie.merge.allow.duplicate.on.inserts" to true and fixed the affected tests.

Impact

low

Risk level (write none, low, medium or high below)

low

Documentation Update

A documentation update is needed because the default value of the HoodieWriteConfig property "hoodie.merge.allow.duplicate.on.inserts" changed to true.

  • The config description must be updated if new configs are added or the default values of configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here, and follow the instructions to make changes to the website.
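
For readers who rely on the previous behaviour, the effect of the new default can be sketched with plain `java.util.Properties`. This is only an illustration, not Hudi's own config-resolution code: the class `DuplicateOnInsertConfig` and its helpers `allowDuplicates` and `pinPreviousBehaviour` are hypothetical names, and only the key and the two default values come from this PR.

```java
import java.util.Properties;

public class DuplicateOnInsertConfig {
    // Key and default values as they appear in this PR's diff.
    static final String KEY = "hoodie.merge.allow.duplicate.on.inserts";
    static final String OLD_DEFAULT = "false";
    static final String NEW_DEFAULT = "true";

    // Hypothetical helper: resolve the effective flag, falling back to the
    // given default when the user has not set the key explicitly.
    static boolean allowDuplicates(Properties userProps, String defaultValue) {
        return Boolean.parseBoolean(userProps.getProperty(KEY, defaultValue));
    }

    // Hypothetical helper: return a copy of the user's properties with the
    // flag pinned to the pre-change value, so the new default is not applied.
    static Properties pinPreviousBehaviour(Properties userProps) {
        Properties pinned = new Properties();
        pinned.putAll(userProps);
        pinned.setProperty(KEY, OLD_DEFAULT);
        return pinned;
    }

    public static void main(String[] args) {
        Properties unset = new Properties();
        // Key not set: the result follows whichever default is in effect.
        System.out.println(allowDuplicates(unset, OLD_DEFAULT)); // false
        System.out.println(allowDuplicates(unset, NEW_DEFAULT)); // true
        // Key set explicitly: the new default no longer matters.
        System.out.println(allowDuplicates(pinPreviousBehaviour(unset), NEW_DEFAULT)); // false
    }
}
```

Under these assumptions, only jobs that never set the key see a change; jobs that set it explicitly keep the value they chose.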

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@wombatu-kun wombatu-kun marked this pull request as draft February 22, 2024 05:02
@wombatu-kun wombatu-kun force-pushed the HUDI-6089_duplicate_on_insert_default_true branch from 6348547 to 22f8752 Compare February 23, 2024 12:02
@wombatu-kun wombatu-kun marked this pull request as ready for review February 23, 2024 14:00
@@ -562,7 +562,7 @@ public class HoodieWriteConfig extends HoodieConfig {

   public static final ConfigProperty<String> MERGE_ALLOW_DUPLICATE_ON_INSERTS_ENABLE = ConfigProperty
       .key("hoodie.merge.allow.duplicate.on.inserts")
-      .defaultValue("false")
+      .defaultValue("true")
       .markAdvanced()
Contributor

Is there any real use case to illustrate this switch?

Contributor Author

Contributor

Okay, let's rebase with the latest master and make all the tests pass.

Contributor Author

Done, the PR is ready to merge.

@apache apache deleted a comment from hudi-bot Feb 25, 2024
@wombatu-kun wombatu-kun force-pushed the HUDI-6089_duplicate_on_insert_default_true branch from 22f8752 to 11a6e3d Compare February 26, 2024 02:15
@wombatu-kun wombatu-kun marked this pull request as draft February 26, 2024 06:46
@wombatu-kun wombatu-kun force-pushed the HUDI-6089_duplicate_on_insert_default_true branch from 11a6e3d to 6ea9dfe Compare February 26, 2024 07:53
@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Feb 26, 2024
@wombatu-kun wombatu-kun force-pushed the HUDI-6089_duplicate_on_insert_default_true branch from 6ea9dfe to 337a99c Compare February 27, 2024 02:20
@wombatu-kun wombatu-kun marked this pull request as ready for review February 27, 2024 05:23
@hudi-bot
CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

Contributor

@bvaradar bvaradar left a comment

LGTM. This is needed to make the insert behavior consistent.

@bvaradar
Contributor

bvaradar commented Mar 2, 2024

cc @nsivabalan: this needs a documentation update.

@danny0405 danny0405 merged commit 3a864ec into apache:master Mar 3, 2024
31 checks passed
@wombatu-kun
Contributor Author

wombatu-kun commented Mar 3, 2024

@bvaradar: the documentation update has already been made and merged in #10739

yihua pushed a commit that referenced this pull request May 14, 2024
yihua pushed a commit that referenced this pull request May 14, 2024
Labels
release-0.15.0 size:S PR with lines of changes in (10, 100]
5 participants