-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: mainly refactor engine codes #260
Conversation
WalkthroughThe pull request introduces modifications primarily to the Changes
Assessment against linked issues
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
pkg/spdk/engine.go (1)
2170-2216
: Simplify error handling logic inSwitchOverTarget
The error handling in the deferred function within
SwitchOverTarget
is complex due to nestedif
statements.To improve readability and maintainability, consider refactoring the nested
if
statements. Here's a suggested change:defer func() { if err != nil { e.log.WithError(err).Warnf("Failed to switch over engine to target address %s", newTargetAddress) disconnected, errCheck := e.isTargetDisconnected() if errCheck != nil { e.log.WithError(errCheck).Warnf("Failed to check if target %s is disconnected", newTargetAddress) return } if disconnected && currentTargetAddress != "" { if errConnect := e.connectTarget(currentTargetAddress); errConnect != nil { e.log.WithError(errConnect).Warnf("Failed to connect target back to %s", currentTargetAddress) return } e.log.Infof("Connected target back to %s", currentTargetAddress) if errReload := e.reloadDevice(); errReload != nil { e.log.WithError(errReload).Warnf("Failed to reload device mapper") } else { e.log.Infof("Reloaded device mapper for connecting target back to %s", currentTargetAddress) } } } else { e.ErrorMsg = "" e.log.Infof("Switched over target to %s", newTargetAddress) } e.UpdateCh <- nil }()
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
⛔ Files ignored due to path filters (9)
go.mod
is excluded by!go.mod
go.sum
is excluded by!**/*.sum
,!go.sum
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/initiator.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/nvme.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/nvmecli.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/spdk/client/advanced.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/types/types.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/util/device.go
is excluded by!vendor/**
vendor/modules.txt
is excluded by!vendor/**
📒 Files selected for processing (5)
pkg/client/client.go
(1 hunks)pkg/spdk/engine.go
(29 hunks)pkg/spdk/replica.go
(4 hunks)pkg/spdk/server.go
(1 hunks)pkg/spdk_test.go
(6 hunks)
✅ Files skipped from review due to trivial changes (1)
- pkg/spdk/replica.go
🔇 Additional comments (4)
pkg/spdk/engine.go (1)
486-491
: Verify port allocation when both initiator and target are created
When both initiatorCreationRequired
and targetCreationRequired
are true, the same port
is assigned to both e.Port
and e.TargetPort
. This could lead to conflicts if the initiator and target cannot share the same port.
Please ensure that using the same port for both the initiator and target is acceptable in your network configuration. If not, consider allocating separate ports for each to prevent potential issues.
pkg/client/client.go (1)
477-477
: Simplification of EngineCreate
method signature
The removal of the upgradeRequired
parameter from the EngineCreate
method reduces complexity and clarifies the method's purpose.
pkg/spdk/server.go (1)
881-881
: Consistency in EngineCreate
method update
The modifications to the EngineCreate
method in the Server
align with the updated method signature in the client. This ensures consistency across the codebase.
pkg/spdk_test.go (1)
271-271
: Update test cases to match new EngineCreate
signature
The test cases have been correctly updated to reflect the removal of the upgradeRequired
parameter in the EngineCreate
method. The tests remain valid and continue to verify the intended functionality.
Also applies to: 354-354, 530-530, 1228-1228, 1243-1243, 1354-1354
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #260 +/- ##
========================================
- Coverage 8.38% 8.31% -0.07%
========================================
Files 21 21
Lines 7159 7239 +80
========================================
+ Hits 600 602 +2
- Misses 6478 6557 +79
+ Partials 81 80 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
This pull request is now in conflict. Could you fix it @derekbit? 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
pkg/spdk/engine.go (2)
121-147
: Consider improving error message clarityThe error message in the
checkInitiatorAndTargetCreationRequirements
method could be more descriptive.- err = fmt.Errorf("invalid initiator and target addresses for engine %s creation with initiator address %v and target address %v", e.Name, initiatorIP, targetIP) + err = fmt.Errorf("pod IP %v does not match either initiator address %v or target address %v for engine %s creation", podIP, initiatorIP, targetIP, e.Name)
2170-2209
: Improve error recovery in SwitchOverTargetThe error recovery logic in
SwitchOverTarget
could be more robust by adding retries for target reconnection.Consider adding a retry mechanism for target reconnection:
if currentTargetAddress != "" { + maxRetries := 3 + for i := 0; i < maxRetries; i++ { if errConnect := e.connectTarget(currentTargetAddress); errConnect != nil { e.log.WithError(errConnect).Warnf("Failed to connect target back to %s (attempt %d/%d)", currentTargetAddress, i+1, maxRetries) + if i < maxRetries-1 { + time.Sleep(time.Second * time.Duration(i+1)) + continue + } } else { e.log.Infof("Connected target back to %s", currentTargetAddress) if errReload := e.reloadDevice(); errReload != nil { e.log.WithError(errReload).Warnf("Failed to reload device mapper") } else { e.log.Infof("Reloaded device mapper for connecting target back to %s", currentTargetAddress) } + break } + } }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
⛔ Files ignored due to path filters (9)
go.mod
is excluded by!go.mod
go.sum
is excluded by!**/*.sum
,!go.sum
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/initiator.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/nvme.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/nvmecli.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/spdk/client/advanced.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/types/types.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/util/device.go
is excluded by!vendor/**
vendor/modules.txt
is excluded by!vendor/**
📒 Files selected for processing (5)
pkg/client/client.go
(1 hunks)pkg/spdk/engine.go
(29 hunks)pkg/spdk/replica.go
(4 hunks)pkg/spdk/server.go
(1 hunks)pkg/spdk_test.go
(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/spdk/server.go
- pkg/spdk/replica.go
🔇 Additional comments (8)
pkg/client/client.go (1)
477-477
: LGTM: Simplified EngineCreate
method signature
The removal of the upgradeRequired
parameter streamlines the method signature while maintaining core functionality. This change aligns with the refactoring objectives.
pkg/spdk_test.go (4)
271-271
: LGTM: Updated test call site for EngineCreate
The test has been correctly updated to match the new method signature.
354-354
: LGTM: Updated test call site for EngineCreate
in re-attach scenario
The test has been correctly updated to match the new method signature in the volume re-attach scenario.
1228-1228
: LGTM: Updated test call sites for EngineCreate
in snapshot revert
Both calls to EngineCreate
in the snapshot revert scenario have been correctly updated to match the new method signature.
Also applies to: 1243-1243
1354-1354
: LGTM: Updated test call site for EngineCreate
in target-only scenario
The test has been correctly updated to match the new method signature in the target-only test scenario.
pkg/spdk/engine.go (3)
117-119
: LGTM: Added clear helper method for engine state check
The isNewEngine
method provides a clear way to determine if an engine instance is new based on IP addresses.
2439-2450
: LGTM: Clean port release implementation
The releaseTargetPort
method provides a clean way to release target ports and update state.
596-610
:
Potential double port release issue
The releasePorts
function could attempt to release the same port twice if e.Port
equals e.TargetPort
.
Apply this fix:
func (e *Engine) releasePorts(superiorPortAllocator *commonbitmap.Bitmap) (err error) {
- ports := map[int32]struct{}{
- e.Port: {},
- e.TargetPort: {},
- }
+ ports := make(map[int32]struct{})
+ if e.Port != 0 {
+ ports[e.Port] = struct{}{}
+ }
+ if e.TargetPort != 0 && e.TargetPort != e.Port {
+ ports[e.TargetPort] = struct{}{}
+ }
if errRelease := releasePortIfExists(superiorPortAllocator, ports, &e.Port); errRelease != nil {
err = multierr.Append(err, errRelease)
}
if errRelease := releasePortIfExists(superiorPortAllocator, ports, &e.TargetPort); errRelease != nil {
err = multierr.Append(err, errRelease)
}
return err
}
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
pkg/spdk/engine.go (2)
143-143
: Improve error message clarityThe error message could be more descriptive by including the pod IP for easier debugging.
- err = fmt.Errorf("invalid initiator and target addresses for engine %s creation with initiator address %v and target address %v", e.Name, initiatorIP, targetIP) + err = fmt.Errorf("invalid initiator and target addresses for engine %s creation: pod IP %v does not match initiator address %v or target address %v", e.Name, podIP, initiatorIP, targetIP)
232-235
: Improve error handling in target creation pathThe error handling here could be more specific. When
BdevRaidGet
fails with an error other than "no such device", we should log the specific error before returning._, err := spdkClient.BdevRaidGet(e.Name, 0) if err != nil && !jsonrpc.IsJSONRPCRespErrorNoSuchDevice(err) { + e.log.WithError(err).Errorf("Failed to get raid bdev %v during engine creation", e.Name) return nil, errors.Wrapf(err, "failed to get raid bdev %v during engine creation", e.Name) }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (5)
pkg/client/client.go
(1 hunks)pkg/spdk/engine.go
(29 hunks)pkg/spdk/replica.go
(4 hunks)pkg/spdk/server.go
(1 hunks)pkg/spdk_test.go
(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/client/client.go
- pkg/spdk/replica.go
- pkg/spdk/server.go
- pkg/spdk_test.go
🔇 Additional comments (2)
pkg/spdk/engine.go (2)
597-600
: Prevent double release of ports
As noted in a previous review, there's a risk of attempting to release the same port twice if e.Port
and e.TargetPort
are equal.
2404-2435
: 🛠️ Refactor suggestion
Improve error handling in DeleteTarget
Consider collecting all errors during cleanup instead of returning on the first error. This ensures we attempt to clean up all resources even if some operations fail.
func (e *Engine) DeleteTarget(spdkClient *spdkclient.Client, superiorPortAllocator *commonbitmap.Bitmap) (err error) {
e.log.Infof("Deleting target with target port %d", e.TargetPort)
+ var errs []error
if err := spdkClient.StopExposeBdev(e.Nqn); err != nil {
- return errors.Wrapf(err, "failed to stop expose bdev while deleting target instance for engine %s", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to stop expose bdev"))
}
if err := e.releaseTargetPort(superiorPortAllocator); err != nil {
- return errors.Wrapf(err, "failed to release target port while deleting target instance for engine %s", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to release target port"))
}
e.log.Infof("Deleting raid bdev %s while deleting target instance", e.Name)
if _, err := spdkClient.BdevRaidDelete(e.Name); err != nil && !jsonrpc.IsJSONRPCRespErrorNoSuchDevice(err) {
- return errors.Wrapf(err, "failed to delete raid bdev after engine %s while deleting target instance", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to delete raid bdev"))
}
for replicaName, replicaStatus := range e.ReplicaStatusMap {
e.log.Infof("Disconnecting replica %s while deleting target instance", replicaName)
if err := disconnectNVMfBdev(spdkClient, replicaStatus.BdevName); err != nil {
e.log.WithError(err).Warnf("Engine failed to disconnect replica %s while deleting target instance", replicaName)
replicaStatus.Mode = types.ModeERR
+ errs = append(errs, errors.Wrapf(err, "failed to disconnect replica %s", replicaName))
}
replicaStatus.BdevName = ""
}
+ if len(errs) > 0 {
+ return fmt.Errorf("failed to delete target instance for engine %s: %v", e.Name, errs)
+ }
return nil
}
Likely invalid or redundant comment.
Longhorn 9919 Signed-off-by: Derek Su <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
pkg/spdk/engine.go (1)
2441-2452
: Good separation of target port release logicThe
releaseTargetPort
method cleanly encapsulates port release logic, but consider adding a check to prevent double release of ports.func (e *Engine) releaseTargetPort(superiorPortAllocator *commonbitmap.Bitmap) error { - releaseTargetPortRequired := e.TargetPort != 0 + if e.TargetPort == 0 { + return nil + } - // Release the target port - if releaseTargetPortRequired { - if err := superiorPortAllocator.ReleaseRange(e.TargetPort, e.TargetPort); err != nil { - return errors.Wrapf(err, "failed to release target port %d", e.TargetPort) - } + if err := superiorPortAllocator.ReleaseRange(e.TargetPort, e.TargetPort); err != nil { + return errors.Wrapf(err, "failed to release target port %d", e.TargetPort) } + e.TargetPort = 0 return nil
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
⛔ Files ignored due to path filters (9)
go.mod
is excluded by!go.mod
go.sum
is excluded by!**/*.sum
,!go.sum
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/initiator.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/nvme.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/nvme/nvmecli.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/spdk/client/advanced.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/types/types.go
is excluded by!vendor/**
vendor/github.com/longhorn/go-spdk-helper/pkg/util/device.go
is excluded by!vendor/**
vendor/modules.txt
is excluded by!vendor/**
📒 Files selected for processing (5)
pkg/client/client.go
(1 hunks)pkg/spdk/engine.go
(29 hunks)pkg/spdk/replica.go
(4 hunks)pkg/spdk/server.go
(1 hunks)pkg/spdk_test.go
(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- pkg/client/client.go
- pkg/spdk/server.go
- pkg/spdk/replica.go
🔇 Additional comments (7)
pkg/spdk_test.go (1)
271-271
: LGTM: Consistent removal of upgradeRequired parameter
The changes consistently remove the upgradeRequired
parameter from all EngineCreate
calls across test functions, aligning with the refactoring in the engine implementation.
Also applies to: 354-354, 530-530, 1228-1228, 1243-1243
pkg/spdk/engine.go (6)
58-59
: Improved field naming clarity
The rename from dmDeviceBusy
to dmDeviceIsBusy
makes the boolean field's purpose more explicit.
117-119
: LGTM: Clear and concise helper method
The isNewEngine()
method provides a clear way to check if an engine is newly created by examining both IP fields.
121-147
: Well-structured requirements check method
The checkInitiatorAndTargetCreationRequirements
method effectively:
- Encapsulates complex logic for determining creation requirements
- Provides clear logging for each scenario
- Returns explicit boolean flags for required actions
Line range hint 2229-2238
: Improved error handling for target switchover
Good addition of suspension check before target switchover to prevent potential issues. This helps ensure the engine is in a safe state before proceeding.
2404-2435
: 🛠️ Refactor suggestion
Consider improving error handling in DeleteTarget
The error handling could be more granular to ensure proper cleanup even when some operations fail.
func (e *Engine) DeleteTarget(spdkClient *spdkclient.Client, superiorPortAllocator *commonbitmap.Bitmap) (err error) {
e.log.Infof("Deleting target with target port %d", e.TargetPort)
+ var errs []error
+
err = spdkClient.StopExposeBdev(e.Nqn)
if err != nil {
- return errors.Wrapf(err, "failed to stop expose bdev while deleting target instance for engine %s", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to stop expose bdev"))
}
err = e.releaseTargetPort(superiorPortAllocator)
if err != nil {
- return errors.Wrapf(err, "failed to release target port while deleting target instance for engine %s", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to release target port"))
}
e.log.Infof("Deleting raid bdev %s while deleting target instance", e.Name)
_, err = spdkClient.BdevRaidDelete(e.Name)
if err != nil && !jsonrpc.IsJSONRPCRespErrorNoSuchDevice(err) {
- return errors.Wrapf(err, "failed to delete raid bdev after engine %s while deleting target instance", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to delete raid bdev"))
}
for replicaName, replicaStatus := range e.ReplicaStatusMap {
e.log.Infof("Disconnecting replica %s while deleting target instance", replicaName)
err = disconnectNVMfBdev(spdkClient, replicaStatus.BdevName)
if err != nil {
e.log.WithError(err).Warnf("Engine failed to disconnect replica %s while deleting target instance", replicaName)
replicaStatus.Mode = types.ModeERR
+ errs = append(errs, errors.Wrapf(err, "failed to disconnect replica %s", replicaName))
}
replicaStatus.BdevName = ""
}
+
+ if len(errs) > 0 {
+ return fmt.Errorf("failed to delete target instance for engine %s: %v", e.Name, errs)
+ }
return nil
}
Likely invalid or redundant comment.
2238-2244
: Enhanced NVMe device info loading
The code now properly handles the case where NVMe device is not found, distinguishing between expected and unexpected errors.
✅ Verification successful
Let me analyze the error handling pattern further by checking the implementation of LoadNVMeDeviceInfo
and the error constant.
Let me verify the consistency of error handling by checking the implementation in the vendor package.
Based on the verification results, I can now provide the final response:
Error handling for NVMe device not found is consistently implemented
The error handling pattern is correctly and consistently implemented across the codebase:
- The error constant
ErrorMessageCannotFindValidNvmeDevice
is properly defined in the helper types package - The error check using
ErrorIsValidNvmeDeviceNotFound
is used consistently in multiple locations inengine.go
- The pattern distinguishes between expected device-not-found scenarios and other unexpected errors
- Similar error handling is also present in the vendor package's initiator implementation
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for similar error handling patterns in other files
rg "ErrorIsValidNvmeDeviceNotFound" -A 2
Length of output: 1571
Script:
#!/bin/bash
# Check LoadNVMeDeviceInfo implementation and error constant
ast-grep --pattern 'func LoadNVMeDeviceInfo($$$) $$ {
$$$
}'
# Check for the error constant definition
rg "ErrorMessageCannotFindValidNvmeDevice.*="
Length of output: 248
Script:
#!/bin/bash
# Check the implementation in vendor package
rg -A 10 "func LoadNVMeDeviceInfo" vendor/
# Check for any other usage of this error message
rg -B 2 "cannot find a valid NVMe device"
Length of output: 355
Longhorn 9919 Signed-off-by: Derek Su <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (5)
pkg/spdk/engine.go (5)
117-152
: Add godoc comments for the new methodsWhile the methods are well-structured and have good internal logging, they would benefit from godoc comments explaining their purpose, parameters, and return values.
Add documentation like this:
+// isNewEngine returns true if both IP and TargetIP are empty, indicating a new engine instance func (e *Engine) isNewEngine() bool { return e.IP == "" && e.TargetIP == "" } +// checkInitiatorAndTargetCreationRequirements determines whether initiator and target instances +// need to be created based on the pod IP and provided addresses. +// Returns (initiatorRequired, targetRequired, error) func (e *Engine) checkInitiatorAndTargetCreationRequirements(podIP, initiatorIP, targetIP string) (bool, bool, error) {
448-477
: Consider making retry parameters configurableThe retry logic for loading NVMe device info is a good addition, but the magic numbers for retries and intervals should be configurable parameters.
Consider making these parameters configurable:
+const ( + // DefaultMaxNumRetries is the default number of retries for loading NVMe device info + DefaultMaxNumRetries = 3 + // DefaultRetryInterval is the default interval between retries + DefaultRetryInterval = time.Second * 2 +) func (e *Engine) handleFrontend(...) { - for r := 0; r < maxNumRetries; r++ { + retries := DefaultMaxNumRetries + interval := DefaultRetryInterval + if e.config != nil { + retries = e.config.MaxNumRetries + interval = e.config.RetryInterval + } + for r := 0; r < retries; r++ { ... - time.Sleep(retryInterval) + time.Sleep(interval) } }
Line range hint
2180-2275
: Consider breaking down complex target switchover logicThe SwitchOverTarget method handles multiple responsibilities. Consider extracting some logic into helper methods for better maintainability.
Suggested refactoring:
+func (e *Engine) validateSwitchOverPrerequisites(newTargetAddress string) (string, error) { + if newTargetAddress == "" { + return "", fmt.Errorf("invalid empty target address for engine %s target switchover", e.Name) + } + + podIP, err := commonnet.GetIPForPod() + if err != nil { + return "", errors.Wrapf(err, "failed to get IP for pod") + } + + return podIP, nil +} +func (e *Engine) updateTargetAddresses(newTargetIP string, newTargetPort int32, podIP string) { + e.IP = newTargetIP + e.Port = newTargetPort + + if newTargetIP == podIP { + e.TargetPort = newTargetPort + } else { + e.TargetPort = 0 + } +} func (e *Engine) SwitchOverTarget(spdkClient *spdkclient.Client, newTargetAddress string) (err error) { - if newTargetAddress == "" { - return fmt.Errorf("invalid empty target address for engine %s target switchover", e.Name) - } - - podIP, err := commonnet.GetIPForPod() + podIP, err := e.validateSwitchOverPrerequisites(newTargetAddress) if err != nil { - return errors.Wrapf(err, "failed to get IP for pod") + return err } ... - e.IP = newTargetIP - e.Port = newTargetPort - if newTargetIP == podIP { - e.TargetPort = newTargetPort - } else { - e.TargetPort = 0 - } + e.updateTargetAddresses(newTargetIP, newTargetPort, podIP) ... }
2117-2119
: Enhance error handling documentationThe comments explain why the engine is not marked as error, but could be more detailed about the retry mechanism and potential failure scenarios.
Add more detailed comments:
-// Engine is still alive and running and is not really in error state. -// longhorn-manager will retry to suspend the engine. +// Engine is still operational despite the suspend/resume failure. +// This is a transient error that can be retried by longhorn-manager. +// Common causes: +// - Temporary network issues +// - Resource contention +// - Race conditions with other operations +// The engine will remain in its current state and longhorn-manager +// will automatically retry the operation with exponential backoff.Also applies to: 2150-2152
2449-2460
: Add validation in releaseTargetPort methodThe method could benefit from additional validation to ensure the port is within valid range before attempting release.
Suggested improvement:
func (e *Engine) releaseTargetPort(superiorPortAllocator *commonbitmap.Bitmap) error { + if superiorPortAllocator == nil { + return fmt.Errorf("invalid nil port allocator") + } + releaseTargetPortRequired := e.TargetPort != 0 if releaseTargetPortRequired { + // Validate port is within acceptable range (1-65535) + if e.TargetPort < 1 || e.TargetPort > 65535 { + return fmt.Errorf("invalid target port %d: must be between 1 and 65535", e.TargetPort) + } + if err := superiorPortAllocator.ReleaseRange(e.TargetPort, e.TargetPort); err != nil { return errors.Wrapf(err, "failed to release target port %d", e.TargetPort) } } e.TargetPort = 0 return nil }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (5)
pkg/client/client.go
(1 hunks)pkg/spdk/engine.go
(29 hunks)pkg/spdk/replica.go
(4 hunks)pkg/spdk/server.go
(1 hunks)pkg/spdk_test.go
(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- pkg/client/client.go
- pkg/spdk/server.go
- pkg/spdk/replica.go
🔇 Additional comments (9)
pkg/spdk_test.go (6)
271-271
: LGTM: Parameter removal is consistent with the API changes
The removal of the upgradeRequired
parameter from EngineCreate
aligns with the refactoring mentioned in the AI summary.
354-354
: LGTM: Parameter removal is consistent with the API changes
The removal of the upgradeRequired
parameter from EngineCreate
maintains consistency with the previous change.
530-530
: LGTM: Parameter removal is consistent with the API changes
The removal of the upgradeRequired
parameter from EngineCreate
in the snapshot operations test maintains consistency.
1228-1228
: LGTM: Parameter removal is consistent in helper function
The removal of the upgradeRequired
parameter from both EngineCreate
calls in the revertSnapshot
helper function maintains consistency with the API changes.
Also applies to: 1243-1243
Line range hint 1-1354
: Consider adding test cases for upgrade scenarios
While the removal of the upgradeRequired
parameter simplifies the API, we should ensure that upgrade scenarios are still properly tested through other means.
Let's check if there are any existing upgrade-related tests:
Consider adding dedicated test cases for version upgrade scenarios if they don't exist elsewhere in the codebase.
1354-1354
: LGTM: Parameter removal is consistent in target-only test
The removal of the upgradeRequired
parameter from EngineCreate
in the target-only test case maintains consistency with the API changes.
Let's verify that we haven't missed any instances of EngineCreate
calls:
✅ Verification successful
Let me search specifically for EngineCreate
calls in the codebase to verify the parameter changes.
Let me verify the EngineCreateRequest
structure to confirm the parameter changes.
Let me try searching for the EngineCreateRequest structure using ripgrep instead.
Based on the search results, I can now generate the final response since I have found all the necessary information. The EngineCreateRequest
structure shows that upgradeRequired
is still a valid field in the protobuf definition, but it's being removed from the test calls. Let me verify all test calls to ensure consistency.
LGTM: All test cases consistently use the updated EngineCreate signature
All test cases in pkg/spdk_test.go
have been updated to use the new signature of EngineCreate
with false
for the salvageRequested
parameter (previously upgradeRequired
). The changes are consistent across all test cases, maintaining the API compatibility.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any remaining instances of EngineCreate with the old signature
rg "EngineCreate.*,.*,.*,.*,.*,.*,.*,.*,.*true|false\)"
Length of output: 20253
Script:
#!/bin/bash
# Search for EngineCreate calls with surrounding context
rg "EngineCreate\(" -A 2 -B 2
Length of output: 7257
Script:
#!/bin/bash
# Search for EngineCreateRequest structure definition
ast-grep --pattern 'message EngineCreateRequest {
$$$
}'
Length of output: 60
Script:
#!/bin/bash
# Search for EngineCreateRequest structure definition
rg "EngineCreateRequest" -A 15 -B 2
Length of output: 41171
Script:
#!/bin/bash
# Search for all EngineCreate test calls
rg "EngineCreate\(" pkg/spdk_test.go -A 2 -B 2
Length of output: 1746
pkg/spdk/engine.go (3)
13-13
: LGTM: Good addition of multierr package
The addition of go.uber.org/multierr
is appropriate for combining multiple errors in the error handling code, particularly in the new port release logic.
45-47
: LGTM: Improved field documentation and naming
Good improvements:
- Added clear documentation for Port and TargetPort fields
- Renamed dmDeviceBusy to dmDeviceIsBusy for better clarity
- Added initiator field for better encapsulation of NVMe initiator functionality
Also applies to: 58-59
2412-2443
: 🛠️ Refactor suggestion
Improve error handling to collect all errors
The current implementation returns on the first error. Consider collecting all errors to provide a complete picture of what failed during target deletion.
Suggested improvement:
func (e *Engine) DeleteTarget(spdkClient *spdkclient.Client, superiorPortAllocator *commonbitmap.Bitmap) (err error) {
e.log.Infof("Deleting target with target port %d", e.TargetPort)
+ var errs []error
if err := spdkClient.StopExposeBdev(e.Nqn); err != nil {
- return errors.Wrapf(err, "failed to stop expose bdev while deleting target instance for engine %s", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to stop expose bdev"))
}
if err := e.releaseTargetPort(superiorPortAllocator); err != nil {
- return errors.Wrapf(err, "failed to release target port while deleting target instance for engine %s", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to release target port"))
}
if _, err := spdkClient.BdevRaidDelete(e.Name); err != nil && !jsonrpc.IsJSONRPCRespErrorNoSuchDevice(err) {
- return errors.Wrapf(err, "failed to delete raid bdev after engine %s while deleting target instance", e.Name)
+ errs = append(errs, errors.Wrapf(err, "failed to delete raid bdev"))
}
for replicaName, replicaStatus := range e.ReplicaStatusMap {
if err := disconnectNVMfBdev(spdkClient, replicaStatus.BdevName); err != nil {
replicaStatus.Mode = types.ModeERR
+ errs = append(errs, errors.Wrapf(err, "failed to disconnect replica %s", replicaName))
}
replicaStatus.BdevName = ""
}
+ if len(errs) > 0 {
+ return fmt.Errorf("failed to delete target instance for engine %s: %v", e.Name, errs)
+ }
return nil
}
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest running regression and negative test cases to ensure this PR doesn't introduce regressions for v1.7.x and v1.8.0 features.
For refactoring, do not suggest backporting this to 1.7.x but changing to master/1.8 is good enough. |
We won't backport it to v1.7.x. Currently, there are no regression test cases, but I've tested the attachment, detachment and salvage. |
Got it. We should still run regression and negative tests to ensure this change doesn't impact the other features in v1.8.0. |
@c3y1huang |
Which issue(s) this PR fixes:
Issue longhorn/longhorn#9919
Signed-off-by: Derek Su [email protected]
What this PR does / why we need it:
Special notes for your reviewer:
Additional documentation or context