[Campaign] Ensure 1.x and 2.x branches (main should be 3.0) #142

Closed
27 tasks done
Tracked by #2271 ...
dblock opened this issue Apr 25, 2022 · 58 comments
Assignees
Labels
campaign Parent issues of OpenSearch release campaigns.

Comments

@dblock
Member

dblock commented Apr 25, 2022

What kind of business use case are you trying to solve? What are your requirements?

Currently plugins follow a branching strategy where they work on main for the next development iteration, effectively working on 2 versions at the same time. For example, at the time of writing plugins are working on 2.0 (main) and 1.3.2 (1.3). In contrast, OpenSearch works on 3 releases at the same time: currently 3.0 (main), 2.0 (2.x), and 1.3.2 (1.3).

What is the problem? What is preventing you from meeting the requirements?

  • There's no OpenSearch 3.0 build with all plugins continuously integrating, which causes late integration and bugs that are discovered during integration instead of continuously.
  • Features and fixes (e.g. CVEs) from main branches may need to be selectively backported into multiple 1.x and 2.x releases. Without an intermediate 1.x or 2.x branch there are as many merge conflicts as branches.
  • Without a 2.x branch, developers don't have an easy way to express "backport to any next 2.x release" and must be aware of the state of the next 2.x (e.g. does it accept new features?).
  • Workflows are different between OpenSearch core and plugins, which is confusing.
  • Authors of breaking changes in OpenSearch for the next major release don't have a place to fix downstream plugins, even if they wanted to.
  • https://github.com/opensearch-project/.github/blob/main/RELEASING.md#plugin-branching is incorrect.

What are you proposing? What do you suggest we do to solve the problem or improve the existing situation?

Follow OpenSearch core branching, described in #35. Create 1.x and 2.x branches. Do not create 2.0 as a branch of main; instead, branch main -> 2.x -> 2.0. Maintain working CI for 3 releases at any given time.
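
A minimal sketch of the `main -> 2.x -> 2.0` flow, mapping a target version to the branch a change would land on. The function name `branch_for_version` and its parameters are hypothetical and not part of any OpenSearch tooling; the sketch assumes `main` carries the next major, each older major has a rolling `N.x` branch, and `N.M` branches exist only once they have been cut.

```python
def branch_for_version(version, main_major, cut_minors):
    """Return the branch a change targeting `version` would land on.

    Assumptions (illustrative only): `main` carries `main_major`,
    each older major has an `N.x` branch, and `cut_minors` lists the
    "N.M" release branches that already exist.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    if major == main_major:
        return "main"
    release_branch = f"{major}.{minor}"
    if release_branch in cut_minors:
        return release_branch
    # Not cut yet: the change rides on the rolling N.x branch.
    return f"{major}.x"
```

With `main_major=3` and `cut_minors={"2.0", "1.3"}`, a 2.1.0 change would go to `2.x`, while a 2.0.1 fix would go to `2.0`.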

Exit Criteria

@dblock dblock added the campaign Parent issues of OpenSearch release campaigns. label Apr 25, 2022
@peternied
Member

@dblock One day ago security chose to delete its 2.x branches because there is no active 3.x feature development, and to create a 2.x branch when the need arises.

The aim of deleting this branch was to:

  • Control when 3.x breaking changes are scheduled to be worked on
  • Reduce the amount of backporting management and related considerations, such as questioning whether a change went to the correct branches
  • Reduce the number of pull request branch targets to check in a pull request to one location

While 3.0 will be released in the future and it's important, these 'papercuts' impact a smaller project. If we had better mechanisms, or potentially a better starting posture, we would adopt the 2.x branch more quickly.

@dblock
Member Author

dblock commented Apr 25, 2022

@peternied Do you not feel that having a working 3.0 build of the distribution is a valuable goal?

@peternied
Member

@dblock The maintainers of security had a discussion yesterday where this topic was raised. While there are pain points as mentioned, ultimately we are in agreement we should move in this direction. By running through the process and connecting data we can generate more topics for the retro(s) if we have recommendations or the process isn't tenable.

@cliu123
Member

cliu123 commented Jun 28, 2022

I like the idea of building 3 different versions in parallel so that OpenSearch breaking changes fail plugins early.
The security plugin tried building against OpenSearch 3.0 on its main branch. But yesterday some breaking changes in OpenSearch core 3.0 broke BWC, so the security plugin's main branch CI failed and blocked the 2.1.0 release. When building against OpenSearch 3.0 on the plugin main branch, all PRs in plugin repos need to be merged to main first, then to the 2.x branch, then to the 2.1 branch. Breaking changes in OpenSearch 3.0 make the plugin main branch fragile and can potentially block all PRs until the failures are resolved. So the security plugin had to downgrade from 3.0 to 2.1 to unblock the main branch CI and the 2.1.0 release. We need a better mechanism to prevent this moving forward.

@navneet1v

@dblock One day ago security chose to delete its 2.x branches because there is no active 3.x feature development, and to create a 2.x branch when the need arises.

The aim of deleting this branch was to:

  • Control when 3.x breaking changes are scheduled to be worked on
  • Reduce the amount of backporting management and related considerations, such as questioning whether a change went to the correct branches
  • Reduce the number of pull request branch targets to check in a pull request to one location

While 3.0 will be released in the future and it's important, these 'papercuts' impact a smaller project. If we had better mechanisms, or potentially a better starting posture, we would adopt the 2.x branch more quickly.

For K-NN and geo-spatial we also moved to the approach where we will cut 3.0 branches only when we know that there are breaking changes being introduced. This keeps the back-porting to a minimum and requires less maintenance.

@dblock the idea suggested in this proposal aligns with the OpenSearch Core but will that be overkill for plugins that don't have regular changes for different versions?

Also, how confident are we that this is the right strategy going forward?

@dblock
Member Author

dblock commented Jun 29, 2022

@dblock the idea suggested in this proposal aligns with the OpenSearch Core but will that be overkill for plugins that don't have regular changes for different versions?

Let's say there are no changes in the plugin ever at all. How do you propose we build a 3.0, 2.2 and 2.1 distribution that includes the plugin today?

@dblock
Member Author

dblock commented Jun 29, 2022

For K-NN and geo-spatial we also moved to the approach where we will cut 3.0 branches only when we know that there are breaking changes being introduced. This keeps the back-porting to a minimum and requires less maintenance.

Core introduces breaking changes into main (now 3.0) all the time. What branch of k-nn should it include in the 3.0 daily build?

@navneet1v

@dblock the idea suggested in this proposal aligns with the OpenSearch Core but will that be overkill for plugins that don't have regular changes for different versions?

Let's say there are no changes in the plugin ever at all. How do you propose we build a 3.0, 2.2 and 2.1 distribution that includes the plugin today?

The idea was simple: cut the branch shortly before the release. As no changes were required, cutting a branch would be simple.

The only problem our team saw with multiple branches was back-porting. Second, it brings plugins and core 1:1 from a branching standpoint, which has its own pros and cons. Apart from that, I am very much aligned with the proposal being provided.

@navneet1v

navneet1v commented Jun 29, 2022

For K-NN and geo-spatial we also moved to the approach where we will cut 3.0 branches only when we know that there are breaking changes being introduced. This keeps the back-porting to a minimum and requires less maintenance.

Core introduces breaking changes into main (now 3.0) all the time. What branch of k-nn should it include in the 3.0 daily build?

The main branch of K-NN, and if that breaks somehow, then we reach out to the maintainers to fix it; the fix could be cutting a specific branch. The idea is just to delay when we need to cut a branch, for plugins without a lot of changes.

Even with branches in K-NN, we still need to go ahead and fix the breaking changes. It's mainly about back-porting a change from main to the 2.x branches with every PR.

@downsrob

The main pain point which I have experienced with breaking changes is that a breaking change will be merged into 3.0 on core, and then the plugins will run CI later on during some PR and experience the breaking change. It can be a time-consuming process to track down the breaking change in core versus other dependencies, and it takes even longer to find out how to fix it in the plugins. Additionally, we have seen breaking changes go in, plugins make changes, then core modifies the breaking change, and then plugins need to revert. On backporting, I am not hugely concerned, though it is more of a burden. For example, with the non-inclusive language breaking changes, when I raise a PR for 3.0 that uses anything related to master, I need to modify my PR for backports. That is not a big deal for the non-inclusive language changes, but there are breaking changes which could be more burdensome. If I am trying to get a fix into a plugin repo on code freeze date, I don't want a small change to take a day of backporting, or to have CI on the main branch compromised by some tangential breaking change which blocks my PR from getting merged and backported.

Something that would help with the slow feedback loop, so that plugins quickly know what kind of changes to make, is if core had another GitHub action which builds all of the plugins. Then we can open a GitHub issue for failing plugins explaining the breaking change and the required changes. Running all tests is flaky and slow so we can skip that for now, but the build alone can catch many issues.
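
A rough sketch of the reporting half of that idea, assuming some build-only job has already produced a pass/fail result per plugin. `draft_issues` is a hypothetical helper; it only formats issue drafts and does not call GitHub or any OpenSearch build tooling.

```python
def draft_issues(core_commit, build_results):
    """Turn per-plugin build outcomes into issue drafts.

    `build_results` maps a plugin name to (succeeded, log_excerpt).
    Hypothetical helper: it only formats text, nothing is filed.
    """
    issues = []
    for plugin, (succeeded, log_excerpt) in sorted(build_results.items()):
        if succeeded:
            continue
        issues.append({
            "repo": plugin,
            "title": f"Build fails against core commit {core_commit}",
            "body": (
                "Build-only check failed (tests were skipped).\n\n"
                f"{log_excerpt}"
            ),
        })
    return issues
```

Keeping the check to compilation only matches the point above: it trades coverage for a fast and less flaky signal.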

@dblock
Member Author

dblock commented Jun 29, 2022

Let's say there are no changes in the plugin ever at all. How do you propose we build a 3.0, 2.2 and 2.1 distribution that includes the plugin today?

The idea was simple: cut the branch shortly before the release. As no changes were required, cutting a branch would be simple.

That means having to catch up with a lot of changes in core "in the end" or "last minute". You're only looking at it from the perspective of a plugin (e.g. index management). Right now there are tons of changes on core 3.0 and there's no index-management 3.0 AFAIK. You should have incremented main of index-management to 3.0 a long, long time ago in sync with core 3.0. So, which branch of index-management should be consumed right now by the 3.0 distribution?

@dblock
Member Author

dblock commented Jun 29, 2022

The main branch of K-NN, and if that breaks somehow, then we reach out to the maintainers to fix it; the fix could be cutting a specific branch. The idea is just to delay when we need to cut a branch, for plugins without a lot of changes.

This is backwards. Core cannot "communicate" downstream about how it's going to possibly break its dependencies all the time, it can only try via campaigns for planned changes. If you adopted the same branching as core and did 3.0 development on main rn, you would be building against 3.0 core all the time, constantly fixing any breakage caused by core breaking changes, and backporting to 2.x ensuring changes work on the next 2.x.

@navneet1v

navneet1v commented Jun 29, 2022

Let's say there are no changes in the plugin ever at all. How do you propose we build a 3.0, 2.2 and 2.1 distribution that includes the plugin today?

The idea was simple: cut the branch shortly before the release. As no changes were required, cutting a branch would be simple.

That means having to catch up with a lot of changes in core "in the end" or "last minute". You're only looking at it from the perspective of a plugin (e.g. index management). Right now there are tons of changes on core 3.0 and there's no index-management 3.0 AFAIK. You should have incremented main of index-management to 3.0 a long, long time ago in sync with core 3.0. So, which branch of index-management should be consumed right now by the 3.0 distribution?

Yes, I looked at it from the perspective of a plugin owner who wants to maintain the plugin with minimal effort, or at least move towards doing minimal upgrades. Expecting plugins to keep fixing things whenever core introduces a breaking change is too much from the plugins' standpoint. I would really like to see how this churn can be reduced.

Nevertheless, I am not against a common branching strategy, but I feel there are many sharp edges to this too, and we should try to find those first and see how we can reduce them.

Suggestion:
At the time when core creates a new 3.0 branch, why can't we create a version file, as we have for Lucene, which specifies which branch of each plugin to use, and let plugin owners provide the exact branch name there? This will let them decide how they want to develop.
Pros:

  1. This will solve the problem of keeping branch names consistent and allow plugin owners to keep whatever names they want.
  2. When the build breaks due to breaking changes, automated issues can be cut on the repos.

@dblock
Member Author

dblock commented Jun 29, 2022

The main pain point which I have experienced with breaking changes is that a breaking change will be merged into 3.0 on core, and then the plugins will run CI later on during some PR and experience the breaking change.

I think we may be able to do something about it, please add your comments to opensearch-project/OpenSearch#3740. That said, breaking changes are breaking changes, and they have a "cost" and will require "work" either way.

@dblock
Member Author

dblock commented Jun 29, 2022

@navneet1v Yes! That's what we want. And this is exactly what's asked in this proposal.

First note that plugins must have versions that match core. This is a limitation of the product. Plugin 2.1 cannot start on core 3.0. So at a minimum plugins need to provide a branch where the version is 3.0. The distribution builds everything and runs integration, BWC and performance tests, so the software needs to work.

We do what you suggest with manifests today (read this). When the first 2.0 release starts development, core does it on a 2.x branch, and increments main to 3.0. A manifest is created for 3.0 that now includes core 3.0 from main. Plugins are then asked to do the same and add themselves to the manifest pointing to a branch that has 3.0. When the build breaks, plugin tickets are cut.
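
A minimal sketch of that manifest check, assuming a simplified manifest shape (a build version plus a list of components, each with a `name` and a `ref`) and a hypothetical `branch_versions` lookup giving the version each branch currently declares. Real manifests live in opensearch-build and carry more fields; none of the names below are its actual API.

```python
def components_needing_tickets(manifest, branch_versions):
    """Return names of components whose referenced branch does not
    declare the manifest's build version (illustrative only)."""
    wanted = manifest["build"]["version"]
    failing = []
    for component in manifest["components"]:
        key = (component["name"], component["ref"])
        # A missing entry means the branch (or its version) is unknown.
        if branch_versions.get(key) != wanted:
            failing.append(component["name"])
    return failing


manifest = {
    "build": {"name": "OpenSearch", "version": "3.0.0"},
    "components": [
        {"name": "OpenSearch", "ref": "main"},
        {"name": "k-NN", "ref": "main"},
    ],
}
# Hypothetical state: k-NN main has not been incremented to 3.0.0 yet.
branch_versions = {
    ("OpenSearch", "main"): "3.0.0",
    ("k-NN", "main"): "2.1.0",
}
print(components_needing_tickets(manifest, branch_versions))
```

The exact-match comparison mirrors the constraint that a plugin's version must match core; a plugin pointing at a branch that still declares 2.1.0 is flagged for a ticket.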

Instead, today, plugins are saying "we want to reduce backports, so we plan to increment to 3.0 when the last 2.x version, e.g. 2.5 ships, i.e. in 6 months to a year". We're not going to work on 3.0 at all, and not increment main to 3.0 at the same time as core. And the branching strategy proposed doesn't need to match because we don't work on 3.0 anyway, we just need main = 2.0 right now, and tomorrow main = 2.1 and we cut 2.0.

@prudhvigodithi
Member

Suggestion: At the time when core creates a new 3.0 branch, why can't we create a version file, as we have for Lucene, which specifies which branch of each plugin to use, and let plugin owners provide the exact branch name there? This will let them decide how they want to develop. Pros:

  1. This will solve the problem of keeping branch names consistent and allow plugin owners to keep whatever names they want.
  2. When the build breaks due to breaking changes, automated issues can be cut on the repos.

Hey @navneet1v two problems I see with this:
One, the distribution builds cannot run automatically until the plugin owners feed in the details of the branch for a specific release, and for automation, the branch pattern might not be the same for the next release.
Two, the version increment is still manual here, which right now takes a lot of human effort to raise version increment PRs across multiple plugin repos (related GH issue: opensearch-project/opensearch-build#1375). Since the plugin owner provides the exact branch name to use for each release, it's hard to automate the workflows.

@peternied
Member

since the plugin owner provides the exact branch name to use for each release, it's hard to automate the workflows

We have branching guidance published. Can we automate workflows and raise issues with repositories that cannot be supported?

@tianleh
Member

tianleh commented Jun 30, 2022

@prudhvigodithi
Member

Shall https://github.com/opensearch-project/.github/blob/main/RELEASING.md#plugin-branching be deprecated since it is for 2 releases? @prudhvigodithi

Hey @tianleh the idea is to align with core and have 3 branches at any given time, main -> 2.x -> 2.0, and maintain working CI for 3 releases at any given time. @dblock can you add your thoughts on the document please?

@prudhvigodithi
Member

We have branching guidance published. Can we automate workflows and raise issues with repositories that cannot be supported?

Hey @peternied can you add some more details please?

@peternied
Member

It was mentioned that automation is difficult because the branch naming patterns are not consistent. Rather than letting some out-of-standard branch names block automation efforts, make aligning to the standard patterns valuable for processes like automated version incrementing. Then you can file bugs on the plugins/repositories that need to update their naming to get the value from the automated version incrementing work.
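
A small sketch of how such a naming check might look, assuming the standard is `main`, rolling `N.x` branches, and `N.M` release branches as discussed in this thread. The pattern and the `nonstandard_branches` helper are illustrative assumptions, intended for release-bearing branches only, not for feature branches.

```python
import re

# Assumed standard: `main`, rolling `N.x` branches, and `N.M` release branches.
STANDARD_BRANCH = re.compile(r"main|\d+\.x|\d+\.\d+")


def nonstandard_branches(branches):
    """Return the branch names that do not match the assumed pattern."""
    return [b for b in branches if not STANDARD_BRANCH.fullmatch(b)]
```

The names returned for a repository would be the candidates for a "please align your branch naming" bug.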

@dblock
Member Author

dblock commented Jul 6, 2022

Shall https://github.com/opensearch-project/.github/blob/main/RELEASING.md#plugin-branching be deprecated since it is for 2 releases? @prudhvigodithi

Hey @tianleh the idea is to align with core and have 3 branches at any given time, main -> 2.x -> 2.0, and maintain working CI for 3 releases at any given time. @dblock can you add your thoughts on the document please?

That doc needs to be updated to say that plugins align on core.

@amitgalitz
Member

Moving the main branch on plugins to point to 3.0.0-SNAPSHOT makes sense to me. However, I had a few questions on the standard for development with this setup:

If main is pointing to 3.0 and we are developing features for 2.2, for example, those features might use methods that have been renamed in 3.0 or dependencies that have changed in 3.0.

Do we develop this feature from the 2.x branch and merge into 2.x only? What is the standard then for bringing that feature to main: a separate PR that addresses the breaking changes plus the new feature, or do we forward port and then address potential breaking changes there?

Or should the standard be to develop on main, and for every feature that we develop for a 2.x release we should already be mindful that it should pass the 3.0 CI, which sometimes isn't possible for things like core name changes in 3.x vs 2.x?

@gaiksaya gaiksaya removed the v2.2.0 label Sep 9, 2022
@dblock dblock changed the title [Campaign] Ensure 1.x and 2.x branches [Campaign] Ensure 1.x and 2.x branches (main should be 3.0) Sep 29, 2022
@dblock
Member Author

dblock commented Sep 29, 2022

Let's close this when main in all components is building 3.0, and 2.x is building next 2.x. I copy-pasted this into all related tickets:

@lezzago
Member

lezzago commented Oct 24, 2022

The Notification plugin is facing constant blockers to bump the main branch to 3.0 due to its dependencies. The Notification plugin has a hard dependency on the OpenSearch core and Common-utils packages. If they do not build successfully, Notification fails. Additionally, we have security tests we need to pass that are executed by running a docker image with the security plugin. To bring up the docker image, we then need the OpenSearch core, Common-utils, Job Scheduler, Security, and Performance-Analyzer packages to build successfully in 3.0.

This means that for Notification to have a PR to bump the version to 3.0 in main, we need the OpenSearch core, Common-utils, Job Scheduler, Security, and Performance-Analyzer packages to build successfully.

However, it is hard to get all these packages to build successfully in 3.0, as there are constant breaking changes coming from OpenSearch core, and the other packages don't have enough time to fix the issues and reach a successful build across all of those packages before another breaking change occurs.

I suggest that the build manifest for 3.0 points to a specific commit in the OpenSearch core package and gets updated to the latest commit every 2 weeks. By doing this, it will give the owners of the other packages enough time to fix the issues in their packages without another issue popping up later again.
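
As a sketch of the cadence part of this suggestion, the check below decides when a pinned core commit is old enough to be refreshed. The two-week interval comes from the suggestion above; the function name and the idea of recording a pin date are assumptions for illustration, not an existing manifest feature.

```python
from datetime import date, timedelta

REFRESH_INTERVAL = timedelta(weeks=2)  # cadence suggested above


def pin_is_due_for_refresh(pinned_on, today):
    """True once the pinned core commit is at least two weeks old."""
    return today - pinned_on >= REFRESH_INTERVAL
```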

@dblock @bbarani @CEHENKLE @peterzhuamazon
Please let me know your thoughts on the above.

@cliu123
Member

cliu123 commented Oct 24, 2022

The build on 3.0 fails pretty frequently, which blocks development on the main branch. It is failing right now. That blocks all development when it happens, as all development is done on main and backported to other branches. This slows down everyone.

I like @lezzago's idea to point 3.0 to a stable commit in the OpenSearch core package to stabilize the main branch build.

@seraphjiang
Member

+1 to @lezzago's stable commit

There should be a golden build from the nightly builds that we identify as the stable build.

@bbarani
Member

bbarani commented Oct 25, 2022

@lezzago I understand your concern, but running your CI against a stale core commit defeats the purpose of continuous integration. The whole idea of aligning with the core branching strategy is to fail fast, but I do see it impacting developer velocity when using a new major branch (3.0 in this case) for development. We are already discussing possible options to reduce the blast radius on this issue and I would recommend that you add your comments there as well.

@ohltyler
Member

Perhaps having a separate stable build link that isn't updated as often would ease the plugin developer pain.

@qreshi

qreshi commented Oct 25, 2022

The stable build does sound like a good way to have alternatives for the developer to be unblocked.

Another perspective to consider is that if the plugins are able to bump immediately after the version bump from core, and all be part of the new tracking distribution, then ideally subsequent breaking changes wouldn't make it past the distribution workflow and be published to Maven if they break the build. This won't prevent the work or potential blockers of upgrading the very first time. However, the distribution is more accurate the more components are added to the manifest immediately after a new version starts being tracked.

That isn't mutually exclusive with the suggestions for a stable build but I think it would help.

@dblock
Member Author

dblock commented Oct 25, 2022

@lezzago @qreshi @ohltyler @seraphjiang You are all highlighting the same problem: several transitive dependencies, everything broken, causing pain all the time on main. I think that does prevent your contributors from making changes on main regularly, so a stable(r) build makes sense.

Is this something you can enable in your component CI? Instead of using "latest", hardcode build numbers of your dependencies and try that out? I think this would achieve both quick discovery of something broken in integration that builds with latest everything, and leave your main stable.

If infra/opensearch-build dictates what "stable" means, you'll have the same problem I think. I am thinking an Engineer on each team could take on "increment to the latest 3.0 on main" as a routine weekly task.
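
A minimal sketch of that split, assuming each dependency is identified by a build number: component CI resolves pinned builds, while an integration run resolves the latest ones. `resolve_builds`, the dependency names, and the build numbers are all hypothetical.

```python
def resolve_builds(dependencies, pins, latest, use_pins=True):
    """Pick a build number per dependency.

    With `use_pins` the component CI stays on known-good builds; without
    it, the same call models an integration run against latest.
    All names and build numbers are hypothetical.
    """
    resolved = {}
    for name in dependencies:
        if use_pins and name in pins:
            resolved[name] = pins[name]
        else:
            resolved[name] = latest[name]
    return resolved
```

The routine weekly task would then amount to updating `pins` to the newest builds that pass, which keeps the definition of "stable" in each team's hands.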

@navneet1v

Created a github issue to track the progress of this campaign for a repo : Neural Search.

Issue: opensearch-project/neural-search#74

@dblock
Member Author

dblock commented Jan 26, 2023

@peterzhuamazon Do you know of any repos that don't have this? Otherwise let's close!

@prudhvigodithi
Member

prudhvigodithi commented Feb 2, 2023

From the above list I see opensearch-project/asynchronous-search#159, opensearch-project/alerting#493 and opensearch-project/alerting-dashboards-plugin#290 pending. Also, the above list is old and more new repos have been added recently, so we should cross-check before closing this issue. Regarding this, how do we ensure that newly created repos follow this branching in future? The document is already updated; is this good enough?
@dblock @bbarani

@bbarani
Member

bbarani commented Apr 6, 2023

Alerting dashboards is still pending. PR is open - opensearch-project/alerting-dashboards-plugin#471 CC: @lezzago

@bbarani
Member

bbarani commented May 2, 2023

Closing this issue as all the teams have completed the change and all are currently following the branching strategy listed on the exit criteria of this issue.
