-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(engine): suspend and flush IO before snapshot #306
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe pull request modifies the snapshot operation logic in the SPDK engine, specifically focusing on how filesystem synchronization is handled before creating a snapshot. Instead of using a system-wide sync command, the new implementation pauses I/O operations and flushes outstanding I/O by suspending the NVMe initiator. This approach aims to provide a more targeted method of ensuring data integrity during snapshot creation, potentially reducing overhead on other volumes. Changes
Assessment against linked issues
Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/spdk/engine.go
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Build AMD64 binaries
- GitHub Check: Build ARM64 binaries
- GitHub Check: Summary
Longhorn 9023 Signed-off-by: Damiano Cipriani <[email protected]>
4b60c8c
to
12172e1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/spdk/engine.go (1)
1648-1662
: Document the initiator suspension flagsThe purpose and implications of the
false, false
flags passed toinitiator.Suspend()
should be documented. Consider using named constants to improve code readability.Add comments and constants:
+ // suspendFlags represents the flags for initiator suspension + const ( + // syncFileSystem indicates whether to sync the filesystem + syncFileSystem = false + // flushCache indicates whether to flush the cache + flushCache = false + ) if snapshotOp == SnapshotOperationCreate { + // Pause IO operations and attempt to synchronize filesystem + // syncFileSystem=false: Skip filesystem sync as it's handled by the snapshot operation + // flushCache=false: Skip cache flush as it's handled by the snapshot operation if e.Frontend == types.FrontendSPDKTCPBlockdev && e.Endpoint != "" { e.log.Infof("Requesting initiator suspend before to create snapshot %s", snapshotName) - if err = e.initiator.Suspend(false, false); err != nil { + if err = e.initiator.Suspend(syncFileSystem, flushCache); err != nil {
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/spdk/engine.go
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Build ARM64 binaries
- GitHub Check: Build AMD64 binaries
- GitHub Check: Summary
🔇 Additional comments (2)
pkg/spdk/engine.go (2)
1648-1662
: Implementation aligns with PR objectivesThe code successfully implements the requirement to suspend and flush IO operations before taking a snapshot. The implementation:
- Correctly suspends IO operations using the NVMe initiator
- Ensures proper cleanup by resuming the initiator in a deferred function
- Provides appropriate error handling and logging
1655-1661
:⚠️ Potential issueFix error handling in deferred function
The deferred function modifies the named return variable
err
which can lead to unintended behavior. The error fromResume()
should be captured in a separate variable to avoid overwriting any previous error.Apply this diff to fix the error handling:
defer func() { - if err = e.initiator.Resume(); err != nil { + if resumeErr := e.initiator.Resume(); resumeErr != nil { e.log.Errorf("Error resuming initiator after the creation of snapshot %s: %v", snapshotName, resumeErr) return } }()Likely invalid or redundant comment.
The CI fails because it isn't able to start SPDK target:
This happen for all the the tests: I have run the same code on a Equinix machine and all is ok, what can be the problem here? |
CI env issue. Checking how to fix the hugepage issue |
Which issue(s) this PR fixes:
Issue longhorn/longhorn#9023
What this PR does / why we need it:
Special notes for your reviewer:
Additional documentation or context