Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database versioning and migration #3255

Closed
wants to merge 34 commits into from
Closed

Conversation

sudo-shashank
Copy link
Contributor

@sudo-shashank sudo-shashank commented Jul 21, 2023

Summary of changes

Changes introduced in this pull request:

  • Implemented database versioning and migration
  • Add how to write new db migration doc
  • Added CI to test migration by syncing from an older version of forest to latest version of forest
  • Use dev database if it exist

Reference issue to close (if applicable)

Closes #3203

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

@sudo-shashank sudo-shashank marked this pull request as ready for review July 26, 2023 12:09
@sudo-shashank sudo-shashank requested a review from a team as a code owner July 26, 2023 12:09
@sudo-shashank sudo-shashank requested review from lemmih and LesnyRumcajs and removed request for a team July 26, 2023 12:09
Copy link
Contributor

@lemmih lemmih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You struggle with setting up a framework for database migrations because we have yet to have any database migrations, imo.

Try setting up a database migration from version Unknown to version 0.12.0-rc (unreleased version, you can pretend it exists) to version dev. Migrating between three different versions should give you a lot of experience.

PathBuf::from(&config.client.data_dir).join(config.chain.network.to_string())
let chain_path = PathBuf::from(&config.client.data_dir).join(config.chain.network.to_string());
// Use the dev database if it exists, else use versioned database
let dev_path = chain_path.join("dev");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is the database ever put in the dev folder?

/// Database version for each forest version which supports db migration
#[derive(Debug, Eq, PartialEq)]
pub enum DBVersion {
V0, // Default DBVersion for any unknown db
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the version is unknown, let's call it Unknown rather than V0.

use std::sync::Arc;
use tracing::info;

pub const LATEST_DB_VERSION: DBVersion = DBVersion::V11;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why manually define this? It could be inferred from the version number of Forest.

#[derive(Debug, Eq, PartialEq)]
pub enum DBVersion {
V0, // Default DBVersion for any unknown db
V11,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this version 11? Forest is at version 0.11.1. How about you just use https://docs.rs/semver/latest/semver/struct.Version.html to exactly capture the Forest version?

db_root(existing_chain_data_root),
config.db_config().clone(),
)?);
let genesis = read_genesis_header(None, config.chain.genesis_bytes(), &db).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want ts.genesis().


let ts = state_manager.chain_store().heaviest_tipset();
let height = ts.epoch();
assert!(height.is_positive());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the purpose of this assertion. More documentation is needed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not required, removed

Comment on lines 76 to 79
let next_version = match current_version {
DBVersion::V0 => DBVersion::V11,
_ => break,
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need help understanding the details of this code. The list of Forest versions is static, and we shouldn't have to specify the next_version manually. For example, the last three Forest versions are: ["0.10.0", "0.11.0", "0.11.1"]. Keeping the versions in a vector tells you about the known versions and their ordering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am maintaining the list of versions now


// TODO: Add Steps required for new migration
/// Migrate to an intermediate db version
fn migrate(_existing_db_path: &Path, next_version: &DBVersion) -> anyhow::Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you want the migrations to happen inside the database folder. That's not safe. If something goes wrong, we want to revert back to the last known good state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am currently using temp_dir for intermediate db migrations, if something goes wrong we can revert back to last good state easily

Copy link
Member

@LesnyRumcajs LesnyRumcajs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Database versioning and migration are crucial to having Forest working and resolving issues on running services. This is a feature that needs fancy diagrams and team wide-understanding.

How about you make a proposal (not in the form of code) and present it to the team after daily?

@LesnyRumcajs LesnyRumcajs marked this pull request as draft August 8, 2023 07:09
@lemmih lemmih closed this Aug 31, 2023
@sudo-shashank sudo-shashank deleted the shashank/db-migration branch March 18, 2024 07:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

version the database and automatically migrate data
3 participants