Proposal: parquet 53.0.0 feature branch #6050
Comments
If we can't merge breaking changes to main I agree this is the path forward. It's very generous of you to take up that burden. For what it's worth I don't think #6000 should include breaking changes; if there are any I think they're accidental. I'll double check. I do agree it might cause merge conflicts.
FWIW I just tried a merge of #5486 and #6000. The only significant conflict was missing (new) arguments for the thrift The bloom filter changes seem to be orthogonal to the other two, so I don't anticipate any issues there.
Thanks @adriangb -- I agree the first PR won't have API conflicts. However, I expect some iteration on the APIs, so once we have merged / released a new API (Stats builder), any changes we make to it will require "breaking changes", and we would have to remember which APIs we have released / haven't released. Maybe we could put it behind a feature flag or something that made it clear the API would change 🤔
I guess I was imagining the parquet metadata encoder would also potentially have the ability to write bloom filters and thus may be affected by #6000
🤔 maybe I can simply make a
The more I think about this, the more sense I think it makes to have a 53 feature branch for the next few weeks so we don't build up a massive set of conflicts. I plan to create one tomorrow unless there are objections and start merging stuff there to clear the review queue
Sounds good to me. In addition to #6045, I have 3 stacked PRs ready to go, and I also have a plan for integrating with the changes from #6000.
I have created https://github.com/apache/arrow-rs/tree/53.0.0-dev as the branch and will now begin retargeting and merging PRs. This will likely require me to do some release note finagling but we'll handle that when we get there |
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We are now being careful about breaking changes (see https://github.com/apache/arrow-rs/blob/master/CONTRIBUTING.md#breaking-changes)
This means we can't merge PRs with breaking API changes to main until early August.
However, there are now three potentially large parquet changes that could conflict with each other and have API changes:
ParquetMetaData introduced in PARQUET-2261 #5486 from @etseidl
ParquetMetadataWriter allow ad-hoc encoding of ParquetMetadata #6000 from @adriangb
Describe the solution you'd like
Some way to avoid a massive set of merge conflicts when we start merging changes to master for parquet 53.
I would also love to be able to review and merge smaller PRs rather than keep several large ones outstanding.
Describe alternatives you've considered
I would like to propose we create a feature branch (e.g. parquet-53.0.0) in the arrow-rs repo that we can merge parquet API changes to and develop new features.
Once main opens for 53 (in early August) we can merge the branch to main.
This approach does require maintenance of the parquet 53 branch and runs the risk of accumulating merge conflicts as it diverges from master. I am willing to help do the process.
Additional context