Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support v2 volume live migration #3323

Merged
merged 1 commit into from
Dec 14, 2024
Merged

Conversation

PhanLe1010
Copy link
Contributor

Copy link

coderabbitai bot commented Dec 7, 2024

Walkthrough

The pull request introduces significant modifications to the VolumeController and related components in the Longhorn project, focusing on the management of volume replicas and engines during migration and upgrades. Key updates include enhanced handling of different data engine types (V1 and V2), improved error handling, and the introduction of a new field in the ReplicaSpec for tracking the migrating engine. Additionally, the Custom Resource Definitions (CRDs) have been updated for clarity and consistency, including versioning changes from v1beta1 to v1beta2.

Changes

File Change Summary
controller/volume_controller.go - Updated deleteInvalidMigrationReplicas to accept a volume parameter.
- Modified createAndStartMatchingReplicas for V2 engine handling.
- Enhanced error handling during migration.
k8s/crds.yaml - Updated descriptions for multiple resource specifications.
- Added migrationEngineName to Replica specification.
- Changed versioning from v1beta1 to v1beta2.
k8s/pkg/apis/longhorn/v1beta2/replica.go - Added MigrationEngineName field to ReplicaSpec struct for indicating the migrating engine.

Possibly related PRs

Suggested reviewers

  • shuo-wu
  • c3y1huang

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad49553 and b86cc10.

📒 Files selected for processing (3)
  • controller/volume_controller.go (7 hunks)
  • k8s/crds.yaml (70 hunks)
  • k8s/pkg/apis/longhorn/v1beta2/replica.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • k8s/pkg/apis/longhorn/v1beta2/replica.go
🔇 Additional comments (4)
k8s/crds.yaml (1)

2992-2996: LGTM! Well documented new field for v2 data engine live migration

The new migrationEngineName field in ReplicaSpec is properly defined with a clear description explaining its purpose for v2 data engine live migration.

controller/volume_controller.go (3)

4025-4035: LGTM! Well structured v2 data engine support in createAndStartMatchingReplicas

The function now properly handles both v1 and v2 data engines. For v2, it reuses existing replicas instead of creating new clones, which is the expected behavior for v2 engine live migration.


4065-4070: LGTM! Proper handling of invalid migration replicas for v2 data engine

For v2 data engine, the function clears the migrationEngineName instead of deleting the replica, which is the correct behavior for v2 engine live migration.


4154-4163: LGTM! Correct active replica handling for v2 data engine

The function properly updates the migrationEngineName and engineName for v2 data engine replicas, while maintaining the existing active state switching behavior for v1 engine.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Experiment)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@PhanLe1010 PhanLe1010 requested a review from a team December 9, 2024 18:14
@PhanLe1010 PhanLe1010 marked this pull request as ready for review December 11, 2024 00:22
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
k8s/pkg/apis/longhorn/v1beta2/replica.go (1)

28-31: Fix grammatical errors in the comment for MigratingEngineName

The comment should be corrected for clarity and grammar.

Apply this diff to fix the grammatical errors:

 // +optional
-// MigratingEngineName is indicating the migrating engine which current connected to this replica. This is only
-// used for live migration of v2 data engine
+// MigratingEngineName indicates the migrating engine which is currently connected to this replica. This is only
+// used for live migration of v2 data engines.
 MigratingEngineName string `json:"migratingEngineName"`
k8s/crds.yaml (1)

2993-2996: Fix grammatical errors in the description of migratingEngineName

The description should be corrected for clarity and grammar.

Apply this diff to fix the grammatical errors:

 description: |-
-  MigratingEngineName is indicating the migrating engine which current connected to this replica. This is only
-  used for live migration of v2 data engine
+  MigratingEngineName indicates the migrating engine which is currently connected to this replica. This is only
+  used for live migration of v2 data engines.
 type: string
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2846190 and c5f0710.

📒 Files selected for processing (3)
  • controller/volume_controller.go (7 hunks)
  • k8s/crds.yaml (70 hunks)
  • k8s/pkg/apis/longhorn/v1beta2/replica.go (1 hunks)
🔇 Additional comments (6)
controller/volume_controller.go (6)

4025-4035: LGTM!

The code correctly handles Data Engine V2 by reusing existing replicas without cloning. This ensures efficiency by avoiding unnecessary duplication when matching replicas during migration.


4065-4070: LGTM!

Properly differentiates between Data Engine V1 and V2 during migration replica deletion. For V1, it deletes the invalid migration replicas, and for V2, it resets the MigratingEngineName to disconnect the replica from the migrating engine.


4154-4163: LGTM!

Correctly handles the switch of active replicas for Data Engine V1 and V2 during migration completion. It switches active replicas based on the data engine type, ensuring that V1 replicas are appropriately managed and V2 replicas have their MigratingEngineName reset.


4200-4212: LGTM!

Properly reverts migration for Data Engine V1 and V2. In the case of V1, it deletes migration replicas not associated with the current engine. For V2, it clears the MigratingEngineName to clean up migration state from the replicas.


4339-4339: LGTM!

The call to deleteInvalidMigrationReplicas with the volume parameter ensures that invalid replicas are correctly handled based on the data engine type during migration preparation.


4344-4349: LGTM!

Appropriately sets EngineName or MigratingEngineName based on DataEngineType during replica creation. This ensures that replicas are correctly associated with the appropriate engine during migration for both V1 and V2 data engines.

james-munson
james-munson previously approved these changes Dec 12, 2024
Copy link
Contributor

@james-munson james-munson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One question.

k8s/crds.yaml Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
controller/volume_controller.go (1)

4025-4035: Data Engine V2: Replica handling optimization during migration

The implementation correctly handles replica management differently for data engine v2 by reusing existing replicas instead of cloning them. This is more efficient for v2 volumes.

Consider adding a comment explaining why v2 data engine can reuse replicas directly without cloning, to help future maintainers understand the architectural difference.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5f0710 and 9e3236f.

📒 Files selected for processing (3)
  • controller/volume_controller.go (7 hunks)
  • k8s/crds.yaml (70 hunks)
  • k8s/pkg/apis/longhorn/v1beta2/replica.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • k8s/pkg/apis/longhorn/v1beta2/replica.go
🔇 Additional comments (4)
k8s/crds.yaml (2)

2992-2996: LGTM: Well-documented migration tracking field

The addition of migratingEngineName field with clear documentation is a good implementation for tracking engine connections during v2 data engine live migration.


2993-2995: LGTM: Consistent description formatting across CRDs

The description formatting is consistent across all CRDs, following the same pattern for:

  • Field descriptions
  • Transition messages
  • Error conditions
    This consistency improves maintainability and readability of the CRDs.

Also applies to: 3028-3031, 3644-3647, 3764-3767, 3891-3894

controller/volume_controller.go (2)

4065-4070: Data Engine V2: Safe cleanup of migration replicas

The code appropriately handles v2 engine cleanup by clearing migratingEngineName instead of deleting replicas. This is safer than the v1 approach.

✅ Verification successful

Let me try a different search pattern to verify the handling of v2 engine replicas during migration.


Data Engine V2: Replica handling during migration is correctly implemented

The code search confirms that the implementation correctly handles replicas differently based on the data engine type:

  • For V1: Replicas are deleted using deleteReplica
  • For V2: Only the MigratingEngineName field is cleared, preserving the replicas

The implementation aligns with the expected behavior mentioned in the review comment and follows a safer approach for V2 engines.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that we don't accidentally delete v2 replicas during migration
ast-grep --pattern 'if v.Spec.DataEngine == longhorn.DataEngineTypeV2 {
  $$$
  r.Spec.MigratingEngineName = ""
  $$$
}'

Length of output: 122


Script:

#!/bin/bash
# Search for data engine type checks and replica handling patterns
rg -U "DataEngine.*Type" -A 5 controller/volume_controller.go

# Also check for any replica deletion patterns
rg -U "deleteReplica.*DataEngine" -A 5 controller/volume_controller.go

Length of output: 1252


4200-4213: Data Engine V2: Migration revert handling

The implementation correctly handles migration revert for v2 by clearing migratingEngineName on replicas instead of deleting them.

✅ Verification successful

Let me verify if there are any other places in the codebase where migration revert is handled differently for v2 engines.


Data Engine V2: Migration revert handling is consistent

The implementation correctly handles migration revert for both v1 and v2 data engines. For v1, it deletes replicas, while for v2, it clears the migratingEngineName field. This behavior is consistent across the codebase, with similar patterns found in other migration-related operations where v2 replicas are handled by managing the migratingEngineName field instead of deletion.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify consistent handling of v2 replicas during migration revert
ast-grep --pattern 'if v.Spec.DataEngine == longhorn.DataEngineTypeV1 {
  $$$
} else {
  for _, r := range rs {
    r.Spec.MigratingEngineName = ""
  }
}'

Length of output: 1138


Script:

#!/bin/bash
# Search for other instances of migration revert handling
rg -A 5 -B 5 "DataEngine.*DataEngineTypeV1" --type go

# Also check for any other places where MigratingEngineName is modified
rg "MigratingEngineName" --type go

Length of output: 10929

@derekbit
Copy link
Member

Hello @PhanLe1010
Conventional Commits / commit-lint (pull_request_target) check failed.

derekbit
derekbit previously approved these changes Dec 13, 2024
Copy link
Member

@derekbit derekbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The implementation is quite straightforward. Thank you.

controller/volume_controller.go Outdated Show resolved Hide resolved
controller/volume_controller.go Outdated Show resolved Hide resolved
controller/volume_controller.go Outdated Show resolved Hide resolved
controller/volume_controller.go Outdated Show resolved Hide resolved
controller/volume_controller.go Outdated Show resolved Hide resolved
@derekbit
Copy link
Member

Hello @PhanLe1010
We can merge the PR after addressing the comments. Thank you.

@derekbit derekbit self-requested a review December 13, 2024 23:58
@PhanLe1010
Copy link
Contributor Author

Hello @PhanLe1010 We can merge the PR after addressing the comments. Thank you.

Thank you. I am fixing it

@PhanLe1010 PhanLe1010 dismissed stale reviews from derekbit and james-munson via 5afd4b6 December 14, 2024 00:06
@PhanLe1010 PhanLe1010 force-pushed the 6361 branch 3 times, most recently from 57de909 to 6f2569c Compare December 14, 2024 00:17
@PhanLe1010
Copy link
Contributor Author

All fixed. Thanks @derekbit

derekbit
derekbit previously approved these changes Dec 14, 2024
Copy link
Member

@derekbit derekbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@PhanLe1010 PhanLe1010 changed the title Support v2 volume live migration feat: support v2 volume live migration Dec 14, 2024
longhorn-6361

Signed-off-by: Phan Le <[email protected]>
@derekbit derekbit merged commit 84c5a79 into longhorn:master Dec 14, 2024
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants