[BUG] Snapshot forward compatibility for patch version updates #13676

Open
tan3-netapp opened this issue May 15, 2024 · 5 comments
Labels
bug (Something isn't working), feedback needed (Issue or PR needs feedback), Storage:Snapshots

Comments


tan3-netapp commented May 15, 2024

Describe the bug

I have an OpenSearch cluster running version 1.3.11, preparing to upgrade to version 1.3.15. The below steps are what I did:

  1. Snapshot of all indices in the cluster with OpenSearch 1.3.11;
  2. Upgraded the whole cluster from OpenSearch 1.3.11 to 1.3.15;
  3. Snapshot with OpenSearch 1.3.15 to the same snapshot repository as step 1;
  4. Restored snapshot taken in step 3 to an OpenSearch 1.3.11 cluster – Failed;
  5. Restored snapshot taken in step 1 to an OpenSearch 1.3.11 cluster – Succeeded;
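The steps above can be sketched as the REST requests they would roughly correspond to. This is a minimal sketch, not the exact commands used in the report: the repository name `backup_repo` and the snapshot names are hypothetical, and the code only builds the requests rather than sending them to a cluster.

```python
# Sketch of the snapshot/restore requests behind the reproduction steps.
# Repository and snapshot names are hypothetical placeholders.

def create_snapshot_request(repo, snapshot):
    """Build (method, path, body) for snapshotting all indices."""
    # With no body, the create-snapshot API snapshots every index.
    path = f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return ("PUT", path, None)

def restore_snapshot_request(repo, snapshot):
    """Build (method, path, body) for restoring a snapshot."""
    path = f"/_snapshot/{repo}/{snapshot}/_restore"
    # Restore index data only; leave the cluster-wide state untouched.
    return ("POST", path, {"include_global_state": False})

def reproduction_plan(repo="backup_repo"):
    """Pair each request with the cluster version it is sent to."""
    return [
        ("1.3.11", create_snapshot_request(repo, "snap-1-3-11")),   # step 1
        ("1.3.15", create_snapshot_request(repo, "snap-1-3-15")),   # step 3
        ("1.3.11", restore_snapshot_request(repo, "snap-1-3-15")),  # step 4: reported to fail
        ("1.3.11", restore_snapshot_request(repo, "snap-1-3-11")),  # step 5: reported to succeed
    ]

if __name__ == "__main__":
    for version, (method, path, body) in reproduction_plan():
        print(version, method, path, body)
```

Step 2 (the upgrade itself) has no request here because it is a node-level operation rather than an API call.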

Although semantic versioning should ensure some sort of compatibility, I cannot restore to a new cluster in step 4.
[Question 1] Is this behavior expected? That OpenSearch does not guarantee snapshot backward compatibility between minor versions?

I cannot directly restore from newer version to older version like in step 4 but I can still manage to restore the last old-version snapshot like in step 5 in case of failed upgrade. However, I have a concern:
[Question 2] Are older versions of OpenSearch guaranteed to be able to access a repository that has been modified by a newer version? Do I have to keep testing this behavior in future releases, and do we need another backup plan if it does not work?
I suppose, on a cluster with a single backup repository, Step 5 is the only way we can roll back to an older version in the event of failed upgrade (including unexpected breaking change). If what I’m concerned about in Question 2 is not guaranteed, I plan to use a separate snapshot repository for each minor version, which would be a lot to manage.
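The rollback strategy from step 5 amounts to picking the newest snapshot in the shared repository that was written by a version no newer than the target cluster. A minimal sketch of that selection follows. It assumes snapshot metadata is available as dictionaries with `snapshot`, `version`, and `start_time_in_millis` keys (modeled on what the get-snapshot API is expected to return); the sample values are made up.

```python
# Sketch: choose the rollback snapshot for an older-version cluster.
# The metadata layout and sample values are assumptions for illustration.

def parse_version(text):
    """Turn '1.3.11' into (1, 3, 11) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def latest_restorable(snapshots, cluster_version):
    """Return the newest snapshot written by a version <= cluster_version."""
    target = parse_version(cluster_version)
    candidates = [s for s in snapshots if parse_version(s["version"]) <= target]
    # None means no snapshot in the repository can be used for rollback.
    return max(candidates, key=lambda s: s["start_time_in_millis"], default=None)

if __name__ == "__main__":
    repo_contents = [
        {"snapshot": "snap-1-3-11", "version": "1.3.11", "start_time_in_millis": 1000},
        {"snapshot": "snap-1-3-15", "version": "1.3.15", "start_time_in_millis": 2000},
    ]
    chosen = latest_restorable(repo_contents, "1.3.11")
    print(chosen["snapshot"] if chosen else "no usable snapshot")
```

Any writes made after the chosen snapshot was taken are not part of the rollback.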

Although the strategy like in step 5 works, all changes after the upgrade (at step 2) such as creating, updating and deleting indices will surely be lost.
[Question 3] Is there any way to roll back to the older version that includes writes performed in the new version? Does not have to be an in-place rollback, restoring to a new cluster is fine as well.

If the behavior in step 5 always works,
[Question 4] is it worth documenting in the OpenSearch official documentation?

Related component

Storage:Snapshots

To Reproduce

  1. Snapshot of all indices in the cluster with OpenSearch 1.3.11;
  2. Upgraded the whole cluster from OpenSearch 1.3.11 to 1.3.15;
  3. Snapshot with OpenSearch 1.3.15 to the same snapshot repository as step 1;
  4. Restored snapshot taken in step 3 to an OpenSearch 1.3.11 cluster – Failed;
  5. Restored snapshot taken in step 1 to an OpenSearch 1.3.11 cluster – Succeeded;

Expected behavior

As described in the "Describe the bug" section, I am not sure whether my concerns are bugs, but if they are, these are my expectations:

  1. In Question 1, according to the semantic versioning definition, I expect that we can directly restore the new version snapshot to an older-version cluster.
  2. In Question 2, it's not a bug now, but I expect a guarantee that older versions of OpenSearch can always access a snapshot repository modified by a newer version. This would spare me from devising a new backup plan, which would take a lot of effort to manage.
  3. In Question 3, I expect to have a way to restore the changes made after the upgrade to an older-version cluster. If not, I need to have downtime to avoid any write operations during the upgrade.
  4. In Question 4, although I'm not sure what the answers to the above questions are, I expect them to be covered in the official documentation.

Additional Details

Host/Environment (please complete the following information):

  • OS: Debian
  • Version 1.3.15, 2.12.0

Additional context
Add any other context about the problem here.

@peternied
Member

[Triage - attendees 1 2 3 4 5 6 7 8]
@tan3-netapp Thanks for creating this issue, this looks like an important and complex issue.

peternied changed the title from "[BUG] Snapshot backward compatibility for minor version upgrades" to "[BUG] Snapshot forward compatibility for patch version updates" on May 15, 2024
@Bukhtawar
Collaborator

That OpenSearch does not guarantee snapshot backward compatibility between minor versions?

Lucene doesn't support reading segments written by a higher version from a lower version. The reverse does hold, however: higher versions support reading older segments within the minor version.

[Question 2] Are older versions of OpenSearch guaranteed to be able to access a repository that has been modified by a newer version? Do I have to keep testing this behavior in future releases, and do we need another backup plan if it does not work?

Yes, this is guaranteed to work. We can double-check with an integration test that verifies that behaviour.

[Question 3] Is there any way to roll back to the older version that includes writes performed in the new version? Does not have to be an in-place rollback, restoring to a new cluster is fine as well.

No, this is not supported. Please refer to the first answer.

[Question 4] is it worth documenting in the OpenSearch official documentation?

Snapshot compatibility is well documented

Bukhtawar added the "feedback needed" (Issue or PR needs feedback) label on May 15, 2024
@tan3-netapp
Author

tan3-netapp commented May 16, 2024

Thank you so much for your quick and detailed reply, @Bukhtawar. I still have some minor follow-up questions:

Lucene doesn't support segments written in higher version to be read by lower version, the reverse is however true i.e. higher versions supports reads of older segments in the minor version.

Could you please give me a reference or a doc from Lucene confirming this fact?

We can doubly confirm on an integ test that verifies that behaviour

Could you please show me this integration test? I'm curious to know how it tests this behavior.

No not supported [Question 3]

By confirming this, I think I need to have downtime to avoid any write operations during the upgrade.

@tan3-netapp
Author

tan3-netapp commented May 28, 2024

Hi @peternied and @Bukhtawar, do we have any other updates on this issue?
According to the document @Bukhtawar provided, in the conflicts and compatibility section, it reads

Snapshots are only forward-compatible by one major version. If you have an old snapshot, you can sometimes restore it into an intermediate cluster, reindex all indexes, take a new snapshot, and repeat until you arrive at your desired version, but you might find it easier to just manually index your data in the new cluster.

This does not clearly address the minor version upgrade case I described. I plan to open a documentation PR to make that compatibility clearer, since the conflicts and compatibility section is not explicit about whether older snapshots, and the repositories that contain them, remain compatible with older versions of OpenSearch.
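To make the rule under discussion concrete, here is a minimal sketch of a compatibility check that combines the two statements in this thread: a cluster cannot read a snapshot written by a newer version, and snapshots are forward-compatible by at most one major version. This is a simplified model for illustration, not the check OpenSearch itself performs, and the function names are hypothetical.

```python
# Simplified model of snapshot restore compatibility, based only on the
# statements quoted in this thread. Not OpenSearch's actual implementation.

def parse_version(text):
    """Turn '1.3.15' into (1, 3, 15) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def can_restore(snapshot_version, cluster_version):
    """Return True if the model says the cluster can restore the snapshot."""
    snap = parse_version(snapshot_version)
    cluster = parse_version(cluster_version)
    if cluster < snap:
        # Older software cannot read segments written by newer software,
        # even when only the patch version differs.
        return False
    # Forward compatibility is limited to one major version.
    return cluster[0] - snap[0] <= 1

if __name__ == "__main__":
    print(can_restore("1.3.15", "1.3.11"))  # the failing case from step 4
    print(can_restore("1.3.11", "1.3.15"))  # an older snapshot on a newer cluster
```

Under this model, step 4 fails by design, while step 5 (same version on both sides) is allowed.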

@tan3-netapp
Author

I just proposed a change to the documentation to clarify snapshot compatibility. @peternied @Bukhtawar, do you have any further comments on this? And if you have time, please take a look at my follow-up questions.
