test(robot): volume should not reattach after node eviction #2164

c3y1huang · 2024-11-14T09:05:57Z

Which issue(s) this PR fixes:

Issue longhorn/longhorn#9781, longhorn/longhorn#9811

What this PR does / why we need it:

Adding a test case to reproduce longhorn/longhorn#9781.

Special notes for your reviewer:

None

Additional documentation or context

None

Summary by CodeRabbit

Release Notes

New Features
- Introduced new keywords for managing nodes and volumes, including Set node ${node_id} with and Assert volume ${volume_id} remains detached for at least ${period} seconds.
- Added methods for scaling stateful sets and managing storage classes.
- New test cases added to enhance volume management scenarios, particularly related to node eviction and ensuring volumes remain detached.
Bug Fixes
- Improved error handling and control flow in volume operations and test cases.
Documentation
- Updated test cases to ensure proper state transitions and validations for volume operations.

coderabbitai · 2024-11-14T09:06:05Z

Walkthrough

The pull request introduces enhancements to the e2e testing framework, adding new keywords for managing nodes and volumes, improving logging functionality, and expanding test cases. Key additions include a method for setting node configurations, a verification step for detached volumes, and new classes for managing storage classes and replicas. The changes streamline existing functionality, particularly in volume management, while maintaining the integrity of existing methods and structures.

Changes

File	Change Summary
e2e/keywords/node.resource	Added keyword `Set node ${node_id} with` to configure nodes.
e2e/keywords/volume.resource	Introduced keyword `Assert volume ${volume_id} remains detached for at least ${period} seconds`.
e2e/libs/keywords/backing_image_keywords.py	Added method `wait_all_backing_image_managers_running`.
e2e/libs/keywords/backup_keywords.py	Added logging to `check_restored_volume_checksum`.
e2e/libs/keywords/backupstore_keywords.py	Adjusted import statements; no functional changes.
e2e/libs/keywords/common_keywords.py	Added methods `init_k8s_api_client` and `generate_name_with_suffix`.
e2e/libs/keywords/engine_image_keywords.py	Removed unused import for `logging`.
e2e/libs/keywords/engine_keywords.py	Removed unused import for `logging`.
e2e/libs/keywords/host_keywords.py	Minor formatting changes; no functional changes.
e2e/libs/keywords/k8s_keywords.py	Reorganized import statements; no functional changes.
e2e/libs/keywords/longhorn_deploy_keywords.py	Minor formatting changes; no functional changes.
e2e/libs/keywords/network_keywords.py	Added method `wait_for_block_network_pod_completed`.
e2e/libs/keywords/node_keywords.py	Added method `set_node` for node configuration.
e2e/libs/keywords/persistentvolume_keywords.py	Added logging to `delete_persistentvolume`.
e2e/libs/keywords/replica_keywords.py	Introduced new class `replica_keywords` with method `validate_replica_setting`.
e2e/libs/keywords/setting_keywords.py	Introduced new class `setting_keywords` with methods for managing settings.
e2e/libs/keywords/sharemanager_keywords.py	Minor formatting changes; no functional changes.
e2e/libs/keywords/statefulset_keywords.py	Added method `scale_statefulset_up` and modified `scale_statefulset_down`.
e2e/libs/keywords/storageclass_keywords.py	Introduced new class `storageclass_keywords` with methods for managing storage classes.
e2e/libs/keywords/workload_keywords.py	Removed redundant import for `Node`; no functional changes.
e2e/libs/node/node.py	Added method `set_node` for updating node properties and enhanced error handling in `set_node_scheduling`.
e2e/libs/volume/crd.py	Modified `detach` method to simplify control flow and improve error handling.
e2e/tests/negative/network_disconnect.robot	Added step to wait for volume 0 to be detached in a test case.
e2e/tests/negative/stress_cpu.robot	Added step to wait for volume 0 to be detached in a test case.
e2e/tests/negative/stress_filesystem.robot	Added step to wait for volume 0 to be detached in a test case.
e2e/tests/negative/stress_memory.robot	Added step to wait for volume 0 to be detached in a test case.
e2e/tests/regression/test_backing_image.robot	Added step to wait for volume 0 to be detached in a test case.
e2e/tests/regression/test_backup.robot	Added step to wait for volume 3 to be detached in a test case.
e2e/tests/regression/test_basic.robot	Added steps to wait for volume detachment in multiple test cases.
e2e/tests/regression/test_settings.robot	Added step to wait for volume 1 to be detached in a test case.
e2e/tests/regression/test_v2.robot	Added step to wait for volume 0 to be detached in a test case.
e2e/tests/regression/test_volume.robot	Added new test case for detached volume management and included new resources.

Assessment against linked issues

Objective	Addressed	Explanation
Volume should return to detached state during node eviction (9781)	✅

Possibly related PRs

test(robot): v2 volume should block trim when volume is degraded #2114: The main PR introduces a new keyword related to node configuration, while this PR adds a keyword for trimming workloads, indicating a focus on volume management and node operations, which may overlap in functionality.
test(robot): add test case Test Longhorn components recovery #2143: This PR adds test cases related to the recovery of Longhorn components, which may involve node management and configuration, similar to the changes in the main PR.
fix(node): cleanup node exec in test case teardown #2157: The addition of a cleanup method for node execution in this PR relates to the management of nodes, which is a core aspect of the main PR's focus on setting node configurations.
test(robot): migrate test_soft_anti_affinity_scheduling #2158: The migration of a test case related to soft anti-affinity scheduling involves node management, which aligns with the main PR's focus on node configuration.
test(robot): migrate test_replica_rebuild_per_volume_limit #2159: This PR focuses on managing volume replicas, which is closely related to the main PR's changes regarding node configurations and their impact on volume management.

Suggested reviewers

chriscchien
yangchiu

Poem

🐰 In the fields where volumes play,
A node's fate is set today.
With tests that wait and check with care,
Detached they stay, free from despair.
So let us hop and cheer with glee,
For every change brings harmony! 🌼

Warning

Rate limit exceeded

@c3y1huang has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 10 minutes and 10 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between a132a4e and a6b63ea.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a132a4e and a6b63ea.

📒 Files selected for processing (5)

e2e/keywords/node.resource (1 hunks)
e2e/keywords/volume.resource (1 hunks)
e2e/libs/keywords/node_keywords.py (2 hunks)
e2e/libs/node/node.py (1 hunks)
e2e/tests/regression/test_volume.robot (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

e2e/keywords/node.resource
e2e/keywords/volume.resource
e2e/libs/keywords/node_keywords.py
e2e/libs/node/node.py
e2e/tests/regression/test_volume.robot

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

c3y1huang · 2024-11-14T09:06:21Z

@coderabbitai review

coderabbitai · 2024-11-14T09:06:28Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (25)

e2e/libs/keywords/replica_keywords.py (3)
Line range hint 3-7: Add class-level documentation

The class name follows Robot Framework's keyword naming convention correctly. However, adding a docstring would improve maintainability by documenting the class's purpose and usage in the E2E testing framework.
 class replica_keywords:
+    """Keywords for managing and validating replica settings in E2E tests.
+
+    This class provides Robot Framework keywords for interacting with Longhorn
+    volume replicas, particularly focusing on settings validation.
+    """
Line range hint 8-9: Enhance method documentation and type safety

The method would benefit from type hints and docstring to improve maintainability and IDE support.
-    def validate_replica_setting(self, volume_name, setting_name, value):
-        return self.replica.validate_replica_setting(volume_name, setting_name, value)
+    def validate_replica_setting(self, volume_name: str, setting_name: str, value: str) -> bool:
+        """Validate a specific setting for a volume's replica.
+
+        Args:
+            volume_name: Name of the volume to validate
+            setting_name: Name of the replica setting to check
+            value: Expected value of the setting
+
+        Returns:
+            bool: True if the setting matches the expected value
+        """
+        return self.replica.validate_replica_setting(volume_name, setting_name, value)
Error handling improvements needed in replica validation

The verification reveals minimal error handling in the replica validation chain:

The CRD implementation only uses a basic assertion for value comparison

No exception handling for missing settings, invalid values, or API failures

The Base class defines abstract methods but no error handling patterns

Key locations needing error handling:

e2e/libs/replica/crd.py: Add try-except blocks around replica retrieval and validation

e2e/libs/keywords/replica_keywords.py: Wrap replica validation calls in error handling

e2e/libs/replica/base.py: Define common error handling patterns in the base class

🔗 Analysis chain

Line range hint 1-9: Verify error handling in the Replica class

The code delegates validation to the Replica class without any error handling. We should verify that the underlying Replica class properly handles error cases.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Check error handling in the Replica class implementation

# Look for error handling in the Replica class
rg -A 5 "class Replica" 
rg -A 10 "validate_replica_setting" 

# Look for potential error cases and exceptions
rg -l "raise" $(fd -t f "replica.py")
Length of output: 1493

Script:
#!/bin/bash
# Let's check for error handling patterns in the CRD implementation and related files

# Check for assertions and error handling in CRD implementation
rg -A 5 "assert|raise|try|except" e2e/libs/replica/crd.py

# Look for the Base class implementation since Replica inherits from it
rg -A 5 "class Base" 

# Check for any error handling in the volume-related operations
rg -A 5 "def get" e2e/libs/replica/crd.py
Length of output: 3999
e2e/libs/keywords/persistentvolume_keywords.py (1)
Line range hint 11-13: Consider enhancing error handling and documentation

While the logging addition is good for observability, consider these improvements:

Add a docstring describing the method's purpose and parameters

Add error handling for the delete operation

Log the operation result

Here's a suggested implementation:
     def delete_persistentvolume(self, name):
+        """Delete a persistent volume.
+        
+        Args:
+            name: Name of the persistent volume to delete
+        
+        Returns:
+            bool: True if successful, False otherwise
+        """
         logging(f'Deleting persistentvolume {name}')
-        return self.pv.delete(name)
+        try:
+            result = self.pv.delete(name)
+            logging(f'Deleted persistentvolume {name} successfully')
+            return result
+        except Exception as e:
+            logging(f'Failed to delete persistentvolume {name}: {str(e)}')
+            raise
e2e/libs/keywords/setting_keywords.py (1)
1-1: Add class documentation and type hints

The class would benefit from:

A docstring explaining its purpose and usage in Robot Framework tests

Type hints for better IDE support and code maintainability
from setting import Setting
+from typing import Optional

class setting_keywords:
+    """
+    Robot Framework keyword implementations for managing Longhorn settings.
+    
+    This class provides keywords for updating and managing Longhorn settings
+    during e2e tests, including backup store configuration and settings reset.
+    """
Also applies to: 3-4
e2e/libs/keywords/storageclass_keywords.py (5)
4-6: Add class-level documentation.

Consider adding a docstring to document the purpose and usage of this class, especially since it's part of an E2E testing framework.
 class storageclass_keywords:
+    """Keywords for managing storage classes in E2E tests.
+
+    This class provides Robot Framework keywords for creating and cleaning up
+    storage classes during E2E testing, particularly for volume management scenarios.
+    """
Line range hint 7-8: Add method documentation and type hints.

Consider adding a docstring and type hints to improve code maintainability.
     def __init__(self):
+        """Initialize storageclass_keywords with a new StorageClass instance."""
         self.storageclass = StorageClass()
Line range hint 10-12: Enhance method documentation and parameter validation.

The method would benefit from:

A comprehensive docstring describing parameters and their valid values

Type hints for better code maintainability

Input validation for critical parameters

Use of actual boolean values instead of strings
-    def create_storageclass(self, name, numberOfReplicas="3", migratable="false", dataLocality="disabled", fromBackup="", nfsOptions="", dataEngine="v1"):
+    def create_storageclass(
+        self,
+        name: str,
+        numberOfReplicas: str = "3",
+        migratable: bool = False,
+        dataLocality: str = "disabled",
+        fromBackup: str = "",
+        nfsOptions: str = "",
+        dataEngine: str = "v1"
+    ) -> None:
+        """Create a storage class with specified configuration.
+
+        Args:
+            name: Name of the storage class
+            numberOfReplicas: Number of replicas (default: "3")
+            migratable: Whether volumes can be migrated (default: False)
+            dataLocality: Data locality setting (default: "disabled")
+            fromBackup: Backup source if restoring (default: "")
+            nfsOptions: NFS-specific options (default: "")
+            dataEngine: Data engine version (default: "v1")
+
+        Raises:
+            ValueError: If name is empty or numberOfReplicas is invalid
+        """
+        if not name:
+            raise ValueError("Storage class name cannot be empty")
+        if not numberOfReplicas.isdigit() or int(numberOfReplicas) < 1:
+            raise ValueError("Number of replicas must be a positive integer")
+
         logging(f'Creating storageclass with {locals()}')
-        self.storageclass.create(name, numberOfReplicas, migratable, dataLocality, fromBackup, nfsOptions, dataEngine)
+        self.storageclass.create(
+            name,
+            numberOfReplicas,
+            str(migratable).lower(),  # Convert boolean to "true"/"false"
+            dataLocality,
+            fromBackup,
+            nfsOptions,
+            dataEngine
+        )
Line range hint 14-15: Add error handling and documentation.

Consider adding error handling and documentation to make the cleanup process more robust.
-    def cleanup_storageclasses(self):
-        self.storageclass.cleanup()
+    def cleanup_storageclasses(self) -> None:
+        """Clean up all storage classes created during testing.
+
+        This method ensures proper cleanup of storage class resources
+        after test completion.
+        """
+        try:
+            self.storageclass.cleanup()
+        except Exception as e:
+            logging(f"Error during storage class cleanup: {e}")
+            raise
Line range hint 4-15: Consider adding verification methods for storage class states.

Given that this PR focuses on testing volume detachment scenarios, consider adding methods to verify storage class states during node eviction tests. This could include methods to:

Verify storage class configuration

Check volume binding mode

Validate storage class deletion state

This would enhance the testing capabilities for the volume detachment scenarios described in longhorn/longhorn#9781.
e2e/libs/keywords/common_keywords.py (2)
Line range hint 12-17: Add docstrings to new methods

While the implementation is correct, adding docstrings would improve maintainability and help other developers understand the purpose and usage of these methods.

Consider adding docstrings like this:
 def init_k8s_api_client(self):
+    """Initialize the Kubernetes API client for test automation.
+    This method should be called before performing any Kubernetes API operations.
+    """
     init_k8s_api_client()

 def generate_name_with_suffix(self, kind, suffix):
+    """Generate a name with a suffix for test resources.
+    
+    Args:
+        kind (str): The kind of resource (e.g., 'node', 'volume')
+        suffix (str): The suffix to append to the generated name
+        
+    Returns:
+        str: The generated name with the suffix
+    """
     return generate_name_with_suffix(kind, suffix)
Line range hint 1-24: Well-integrated changes supporting test automation needs

The new methods integrate well with the existing node management functionality and support the PR's objective of testing volume behavior during node eviction. The class maintains good cohesion and single responsibility.

Consider creating a separate configuration class if more Kubernetes client setup methods are needed in the future, to maintain the single responsibility principle of this keywords class.
e2e/keywords/node.resource (1)
14-17: Add documentation for the new keyword

The implementation follows the established pattern of other node-related keywords, but lacks documentation. Please add documentation to specify:

Purpose of the keyword

Expected format of the config dictionary

Example usage

Apply this diff:
 Set node ${node_id} with
+    [Documentation]    Updates node configuration with the provided parameters
+    ...    
+    ...    Arguments:
+    ...    - config: Dictionary containing node configuration parameters (e.g., allowScheduling, evictionRequested)
+    ...    
+    ...    Example:
+    ...    | Set node 1 with | evictionRequested=true | allowScheduling=false |
     [Arguments]    &{config}
     ${node_name} =    get_node_by_index    ${node_id}
     set_node   ${node_name}    &{config}
e2e/tests/regression/test_backing_image.robot (1)
32-34: Consider adding backing image state verification after detachment.

While the test properly waits for volume detachment, it might be valuable to verify the backing image state before proceeding with deletion. This could help catch any potential state inconsistencies.

Consider adding this step after volume detachment:
    And Detach volume 0
    And Wait for volume 0 detached
+   And Verify all disk file status of backing image bi are ready
    And Delete volume 0
e2e/libs/keywords/node_keywords.py (1)
37-40: LGTM! Consider adding docstring documentation.

The implementation is clean and aligns well with the PR's objective of testing volume behavior during node eviction. The method provides the necessary controls for node scheduling and eviction states.

Consider adding a docstring to document the parameters and their effects:
 def set_node(self, node_name, allowScheduling=True, evictionRequested=False):
+    """Configure node scheduling and eviction states.
+    
+    Args:
+        node_name: Name of the node to configure
+        allowScheduling: Whether to allow scheduling on this node
+        evictionRequested: Whether to request node eviction
+    """
     logging(f"Setting node {node_name}; scheduling={allowScheduling}; evictionRequested={evictionRequested}")
     self.node.set_node(node_name, allowScheduling, evictionRequested)
e2e/libs/keywords/network_keywords.py (2)
Line range hint 49-50: Enhance the new method with proper error handling and logging

The new method should include:

Logging for better debugging

Error handling

Timeout parameter

Input validation

Documentation

Consider this implementation:
-    def wait_for_block_network_pod_completed(self, pod_name, status, namespace='default'):
-        wait_for_pod_status(pod_name, status, namespace)
+    def wait_for_block_network_pod_completed(self, pod_name, status, namespace='default', timeout=300):
+        """Wait for a network block pod to reach the specified status.
+
+        Args:
+            pod_name: Name of the pod to wait for
+            status: Expected status of the pod
+            namespace: Kubernetes namespace (default: 'default')
+            timeout: Maximum time to wait in seconds (default: 300)
+
+        Raises:
+            ValueError: If pod_name or status is empty
+            TimeoutError: If pod doesn't reach the desired status within timeout
+        """
+        if not pod_name or not status:
+            raise ValueError("Pod name and status must not be empty")
+
+        logging(f"Waiting for pod {pod_name} to reach status {status} in namespace {namespace}")
+        try:
+            wait_for_pod_status(pod_name, status, namespace, timeout)
+            logging(f"Pod {pod_name} reached status {status}")
+        except Exception as e:
+            logging(f"Failed waiting for pod {pod_name}: {str(e)}")
+            raise
47-47: Remove unnecessary blank line

The blank line between methods is not needed as there's already proper spacing.
-
e2e/tests/regression/test_volume.robot (1)
54-62: Consider enhancing test robustness with additional verifications

While the current implementation covers the basic scenario, consider adding these verifications for better test coverage:

Verify node eviction status after setting evictionRequested=true

Check for the presence of stopped replicas on the evicted node

Add a verification step to ensure the volume's attachment mode is actually "detached" in the backend

Example additions:
    And Wait for node 1 evicted
    And Verify no active replicas on node 1
    And Volume 0 should have attachment mode detached
    And Assert volume 0 remains detached for at least 60 seconds
e2e/tests/negative/stress_cpu.robot (1)

44-44: LGTM! Important reliability improvement.

The addition of Wait for volume 0 detached is a crucial improvement that ensures proper state transitions between volume detachment and attachment operations. This explicit wait helps prevent race conditions and makes the test more reliable, especially important when testing under CPU stress conditions.

This change follows good testing practices by:

Ensuring deterministic state transitions

Preventing potential false positives from premature attachment attempts

Making the test more resilient to timing variations under CPU stress

e2e/tests/negative/stress_filesystem.robot (1)

45-45: Excellent addition of wait step to ensure proper volume state transition!

This change is crucial as it ensures the volume is fully detached before attempting to reattach it. Without this wait step, the test could potentially mask issues (like the one in longhorn#9781) where volumes get stuck in an incorrect state. This addition follows testing best practices by explicitly verifying state transitions and preventing race conditions.

This pattern of explicit state verification should be consistently applied across all volume detachment operations in the test suite to ensure reliable test results.

e2e/tests/regression/test_v2.robot (1)

35-35: Consider adding a dedicated test case for volume detachment during node eviction.

While this change improves the basic volume test, it doesn't fully cover the scenario described in longhorn#9781 where a detached volume remains stuck in the attached state during node eviction. Consider adding a new test case that:

Sets up a 3-node cluster with a volume having 3 replicas

Attaches and then detaches the volume

Evicts a node containing one of the replicas

Verifies the volume properly transitions to detached state

Would you like me to help draft the new test case that covers this scenario?

e2e/tests/regression/test_backup.robot (1)

105-105: LGTM! Essential wait step for reliable volume detachment testing.

This addition improves test reliability by ensuring volume 3 is fully detached before proceeding with PV/PVC creation. The change aligns well with the PR's objective of testing volume detachment behavior, particularly in scenarios where volumes might get stuck in an attached state.

This pattern of explicitly waiting for state transitions should be consistently applied across all volume detachment operations in the test suite to prevent race conditions and flaky tests.
e2e/tests/negative/network_disconnect.robot (2)
167-168: Remove duplicate wait condition.

These two consecutive lines are performing the same wait operation:
-    And Wait for volume 0 detached
-    Then Wait for volume 0 detached
+    Then Wait for volume 0 detached
Line range hint 136-168: Test case doesn't fully cover the reported bug scenario.

The test case "Node Disconnect With Statefulset" doesn't fully address the bug reported in longhorn/longhorn#9781 where a detached volume remains stuck in attached state during node eviction. Consider adding a new test case or extending this one to:

Create a three-node cluster setup

Create a volume with three replicas

Attach and then detach the volume

Evict a node that has a stopped replica

Verify that the volume properly transitions to detached state

This will help ensure the specific bug scenario is properly tested and prevent regression.

Would you like me to help create a new test case that specifically addresses this scenario?
e2e/keywords/volume.resource (1)
122-128: LGTM with suggestions for improvement

The implementation correctly verifies that a volume remains detached for the specified duration. However, consider these enhancements for better robustness and observability:
 Assert volume ${volume_id} remains detached for at least ${period} seconds
     ${volume_name} =    generate_name_with_suffix    volume    ${volume_id}
+    Log    Monitoring volume ${volume_name} detached state for ${period} seconds
     FOR    ${i}    IN RANGE    ${period}
         wait_for_volume_detached    ${volume_name}
+        Log    Volume ${volume_name} remained detached for ${i+1} seconds
         Sleep    1
     END
+    Log    Successfully verified volume ${volume_name} remained detached for ${period} seconds
The suggested improvements:

Add logging statements to track the progress and final state

This will help in debugging test failures by providing clear timeline of events

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a360760 and de9835c.

📒 Files selected for processing (32)

e2e/keywords/node.resource (1 hunks)
e2e/keywords/volume.resource (1 hunks)
e2e/libs/keywords/backing_image_keywords.py (1 hunks)
e2e/libs/keywords/backup_keywords.py (1 hunks)
e2e/libs/keywords/backupstore_keywords.py (1 hunks)
e2e/libs/keywords/common_keywords.py (1 hunks)
e2e/libs/keywords/engine_image_keywords.py (1 hunks)
e2e/libs/keywords/engine_keywords.py (1 hunks)
e2e/libs/keywords/host_keywords.py (1 hunks)
e2e/libs/keywords/k8s_keywords.py (2 hunks)
e2e/libs/keywords/longhorn_deploy_keywords.py (1 hunks)
e2e/libs/keywords/network_keywords.py (1 hunks)
e2e/libs/keywords/node_keywords.py (2 hunks)
e2e/libs/keywords/persistentvolume_keywords.py (1 hunks)
e2e/libs/keywords/replica_keywords.py (1 hunks)
e2e/libs/keywords/setting_keywords.py (1 hunks)
e2e/libs/keywords/sharemanager_keywords.py (1 hunks)
e2e/libs/keywords/statefulset_keywords.py (0 hunks)
e2e/libs/keywords/storageclass_keywords.py (1 hunks)
e2e/libs/keywords/workload_keywords.py (1 hunks)
e2e/libs/node/node.py (1 hunks)
e2e/libs/volume/crd.py (0 hunks)
e2e/tests/negative/network_disconnect.robot (1 hunks)
e2e/tests/negative/stress_cpu.robot (1 hunks)
e2e/tests/negative/stress_filesystem.robot (1 hunks)
e2e/tests/negative/stress_memory.robot (1 hunks)
e2e/tests/regression/test_backing_image.robot (1 hunks)
e2e/tests/regression/test_backup.robot (1 hunks)
e2e/tests/regression/test_basic.robot (2 hunks)
e2e/tests/regression/test_settings.robot (1 hunks)
e2e/tests/regression/test_v2.robot (1 hunks)
e2e/tests/regression/test_volume.robot (2 hunks)

💤 Files with no reviewable changes (2)

e2e/libs/keywords/statefulset_keywords.py
e2e/libs/volume/crd.py

✅ Files skipped from review due to trivial changes (7)

e2e/libs/keywords/backupstore_keywords.py
e2e/libs/keywords/engine_image_keywords.py
e2e/libs/keywords/engine_keywords.py
e2e/libs/keywords/host_keywords.py
e2e/libs/keywords/k8s_keywords.py
e2e/libs/keywords/longhorn_deploy_keywords.py
e2e/libs/keywords/sharemanager_keywords.py

🧰 Additional context used

🪛 Ruff

e2e/libs/keywords/workload_keywords.py

5-5: Redefinition of unused Node from line 4

(F811)

🔇 Additional comments (24)

e2e/libs/keywords/persistentvolume_keywords.py (2)

1-4: LGTM: Clean import organization

The imports are well-organized and the logging utility is appropriately imported from the common utility module.

Line range hint 1-13: Verify integration with node eviction test scenario

The PV deletion functionality appears to be a supporting operation for the node eviction test case. Please ensure that this implementation adequately supports the test scenario where volumes remain stuck in attached state during node eviction.

✅ Verification successful

Let me gather more information about the test implementation to verify the integration.

Let me check one more aspect to understand how the PV deletion is used in the actual test cleanup.

PV deletion functionality correctly supports node eviction test scenarios

The implementation aligns well with the node eviction test requirements:

The PV deletion operation is used as part of the test infrastructure through csi_pv fixture
The test scenarios validate volume behavior during node eviction, including:
- Basic node eviction with replica rescheduling
- Eviction with no schedulable nodes
- Eviction with soft anti-affinity
- Multiple volume eviction cases

The logging addition in the PV deletion operation will help track volume cleanup during these test scenarios.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for related test files that use this keyword
rg -l "delete_persistentvolume" --type py

# Look for the actual test implementation
rg "test.*node.*eviction" --type py

Length of output: 862

Script:

#!/bin/bash
# Check the test implementation details
rg "test_node_eviction" -A 10 manager/integration/tests/test_node.py

# Check how PV deletion is used in the test
rg "delete_persistentvolume" -A 5 -B 5 e2e/libs/keywords/persistentvolumeclaim_keywords.py

Length of output: 3298

Script:

#!/bin/bash
# Check test cleanup and PV usage
rg "def cleanup|def setup" -A 5 manager/integration/tests/test_node.py

# Check PV creation in the test
rg "csi_pv" -A 5 -B 5 manager/integration/tests/test_node.py

Length of output: 4648

e2e/libs/keywords/setting_keywords.py (1)

Line range hint 6-17: Add method documentation and error handling

The methods would benefit from:

Type hints and docstrings to document their purpose and parameters
Error handling for potential exceptions from the Setting instance

     def __init__(self):
+        """Initialize the settings keyword implementation."""
         self.setting = Setting()

-    def update_setting(self, key, value, retry=True):
-        self.setting.update_setting(key, value, retry)
+    def update_setting(self, key: str, value: str, retry: bool = True) -> None:
+        """
+        Update a Longhorn setting with the specified key and value.
+
+        Args:
+            key: The setting key to update
+            value: The new value to set
+            retry: Whether to retry the update on failure
+        """
+        try:
+            self.setting.update_setting(key, value, retry)
+        except Exception as e:
+            raise RuntimeError(f"Failed to update setting {key}: {str(e)}")

Let's verify how these settings are used in the volume reattachment test:

e2e/libs/keywords/common_keywords.py (1)

3-5: LGTM! Well-organized imports

The new imports are properly organized with clear separation, following Python conventions.

e2e/keywords/node.resource (2)

14-17: Implementation pattern looks good

The implementation:

Follows the established pattern of other node-related keywords
Reuses existing functions appropriately
Maintains consistency with the codebase

14-17: Consider adding parameter validation

The keyword accepts a config dictionary without validation. Consider adding validation for:

Required config parameters
Parameter types
Valid value ranges

Let's check what parameters are supported by looking at the node configuration code:

e2e/tests/regression/test_backing_image.robot (1)

33-33: LGTM! Important addition for test reliability.

The added wait step ensures proper volume detachment verification before proceeding with deletion, which is crucial for test reliability and aligns with the PR's focus on volume detachment behavior.

Let's verify this is a consistent pattern across other test files:

✅ Verification successful

Consistent pattern confirmed: Wait steps are properly used before volume operations

The codebase analysis shows a clear and consistent pattern where:

Wait steps are used before volume deletion operations (Wait for volume X detached)
Wait steps are used after replica deletion for rebuilding (Wait until volume X replica rebuilding)

The added wait step in test_backing_image.robot follows this established pattern, as seen in multiple test files:

test_v2.robot: "Wait for volume 0 detached" before deletion
test_basic.robot: "Wait for volume 0 detached" before deletion
Similar patterns in test_backup_listing.robot

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for consistent usage of "Wait for volume" steps before deletion
# Expected: Find similar patterns in other test files to confirm this is a standard practice

# Search for volume deletion patterns in robot files
rg -A 2 "Delete volume" --type robot

# Search for wait-before-delete patterns
rg -B 2 "Delete volume" --type robot

Length of output: 13365

e2e/libs/keywords/backing_image_keywords.py (2)

26-26: LGTM!

The added empty line improves code readability by separating method groups.

28-29: LGTM! Method follows established patterns.

The new method wait_all_backing_image_managers_running follows the class's consistent delegation pattern and will be useful for ensuring backing image stability during volume operations, which is particularly relevant for the volume detachment testing scenario being implemented.

Let's verify the implementation of the delegated method:

✅ Verification successful

Implementation verified and follows expected patterns

The implementation in both classes is correct and follows good practices:

The keyword class properly delegates to the CRD class
The CRD implementation correctly:
- Lists all backing image managers
- Checks their states in a loop with retry logic
- Verifies each manager is in "running" state
- Uses appropriate logging

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of wait_all_backing_image_managers_running in the BackingImage class

# Search for the method implementation in the backing_image.py file
fd --type f "backing_image.py" --exec rg -A 5 "wait_all_backing_image_managers_running"

Length of output: 460

Script:

#!/bin/bash
# Let's verify the implementation in the CRD class
fd --type f "crd.py" --exec rg -A 10 "wait_all_backing_image_managers_running"

Length of output: 721

e2e/libs/keywords/node_keywords.py (1)

37-40: Verify Node class implementation of set_node method.

Let's verify that the Node class properly implements the set_node method with eviction support.

✅ Verification successful

Node class implementation properly handles eviction support

The Node class in e2e/libs/node/node.py correctly implements the set_node method with proper handling of both allowScheduling and evictionRequested parameters:

Updates node settings via Longhorn client with both parameters
Verifies the changes through assertions
Maintains parameter consistency with the calling code

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify Node class implementation of set_node method
# Expected: Find the implementation showing proper handling of evictionRequested parameter

# Search for Node class implementation
ast-grep --pattern 'class Node {
  $$$
  set_node($$$) {
    $$$
  }
  $$$
}'

# Backup: Search using ripgrep if ast-grep doesn't find it
rg -A 10 'def set_node.*node_name.*allowScheduling.*evictionRequested'

Length of output: 1825

e2e/libs/keywords/network_keywords.py (1)

Line range hint 49-50: Verify integration with test framework

Since this method is part of a test case for volume detachment during node eviction (longhorn/longhorn#9781), let's verify its integration with the test framework.

✅ Verification successful

Integration with test framework is properly implemented

The verification shows that:

wait_for_pod_status is a well-established utility function from workload.pod module, used across multiple core components
network_keywords.py is correctly integrated as a Robot Framework keyword library, providing network-related test functionality
The method wait_for_block_network_pod_completed is properly implemented as a wrapper around wait_for_pod_status, maintaining consistent behavior with other network operations in the test framework
The implementation follows the same pattern as other network-related operations in the class, such as drop_pod_egress_traffic and disconnect_node_network

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of wait_for_block_network_pod_completed in test files
# Expected: The method should be used in test cases related to volume detachment during node eviction

# Search for test files using this keyword
echo "Test files using wait_for_block_network_pod_completed:"
rg -l "wait_for_block_network_pod_completed" --type robot

# Search for related test scenarios
echo -e "\nRelated test scenarios:"
rg "Test Volume.*[Dd]etach.*[Ee]vict" --type robot

Length of output: 292

Script:

#!/bin/bash
# Let's verify the integration from multiple angles

# 1. Check if wait_for_pod_status is defined and its usage
echo "=== Checking wait_for_pod_status definition and usage ==="
rg "wait_for_pod_status" -A 3

# 2. Check for network-related test files and keywords
echo -e "\n=== Checking network-related test files ==="
fd "network.*\.py|network.*\.robot" 

# 3. Look for test cases involving pod status checks
echo -e "\n=== Checking test cases with pod status verification ==="
rg "Test.*Pod.*Status|Test.*Node.*Eviction" --type robot

# 4. Check the entire network_keywords.py file for context
echo -e "\n=== Content of network_keywords.py ==="
fd "network_keywords.py" --exec cat {}

Length of output: 5668

e2e/tests/regression/test_volume.robot (2)

12-13: LGTM: Resource imports are appropriate and well-organized

The addition of node.resource and volume.resource is necessary for the new test case and follows the existing import pattern.

48-53: LGTM: Test case properly implements reproduction steps for issue #9781

The test case is well-structured with appropriate tags and documentation reference.

e2e/libs/keywords/backup_keywords.py (2)

2-2: LGTM! Clean import addition

The import statement for logging functionality is properly placed and grouped with other utility imports.

Line range hint 43-44: LGTM! Good logging enhancement

The added logging statement improves test observability by clearly indicating which volume and backup are being verified. This will be helpful for debugging test failures, especially in the context of volume attachment issues being tested in this PR.

e2e/tests/negative/stress_memory.robot (1)

46-46: LGTM! Good addition of explicit wait state.

The addition of Wait for volume 0 detached step improves test reliability by ensuring proper state transition before proceeding with the attachment. This change:

Prevents potential race conditions
Makes the test more deterministic
Aligns with similar changes in other test files
Follows testing best practices by explicitly waiting for state changes

e2e/tests/regression/test_v2.robot (1)

35-35: LGTM! The added wait step improves test reliability.

Adding an explicit wait for volume detachment before deletion is a good practice as it ensures the volume has completely transitioned to the detached state.

e2e/tests/regression/test_settings.robot (1)

67-67: LGTM! The added verification step improves test reliability.

The addition of Wait for volume 1 detached is a good practice as it ensures the volume detachment operation completes before proceeding with subsequent steps, preventing potential race conditions.

Let's verify this pattern is consistently applied across other test files:

✅ Verification successful

The added verification step follows the established pattern across the test suite

The search results confirm that the Wait for volume X detached verification step is consistently used after volume detachment operations throughout the test suite, appearing in multiple test files including:

Regression tests (test_volume.robot, test_v2.robot, test_basic.robot)
Negative tests (stress_memory.robot, node_drain.robot, network_disconnect.robot)
Other related tests (test_backup.robot, test_backing_image.robot)

This verifies that the added line follows the established best practice for ensuring reliable volume detachment verification across the test suite.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if similar volume detachment verification steps are present in other test files
# Expected: Find similar patterns in other test files as mentioned in the AI summary

rg -A 2 "Detach volume" "e2e/tests/**/*.robot" | rg -B 2 "Wait for volume.*detached"

Length of output: 147

Script:

#!/bin/bash
# Let's try a different approach to find the pattern
# First, let's locate all robot test files
fd ".robot$" e2e/tests/

# Then search for both patterns separately to understand the context better
echo "=== Searching for 'Detach volume' patterns ==="
rg "Detach volume" e2e/tests/

echo -e "\n=== Searching for 'Wait for volume.*detached' patterns ==="
rg "Wait for volume.*detached" e2e/tests/

Length of output: 3818

e2e/tests/regression/test_basic.robot (4)

56-56: LGTM! Good addition for test reliability.

Adding a wait condition after volume detachment prevents potential race conditions and ensures the volume is fully detached before proceeding with deletion.

84-84: LGTM! Essential wait condition before maintenance mode.

Adding this wait condition is crucial before attaching the volume in maintenance mode, as it ensures the previous detachment operation is complete.

90-90: LGTM! Consistent wait condition after detachment.

This wait condition maintains consistency with other detachment operations and ensures clean state transitions in the snapshot reversion workflow.

Line range hint 1-1: Verify the implementation of the node eviction test case.

While the added wait conditions improve test reliability, I notice that the test case mentioned in the PR objectives (testing volume reattachment after node eviction) is not implemented in this file. This test case is crucial for reproducing the issue described in longhorn/longhorn#9781.

Let's check if the test case exists in other test files:

e2e/libs/node/node.py (1)

132-143: Verify the method's usage in test scenarios.

Since this method is crucial for testing volume detachment during node eviction (per PR objectives), we should verify its integration with test cases.

✅ Verification successful

Let me search for actual test cases using this functionality to verify its integration.

Let me check the actual test files that were found to see their eviction-related test cases.

Based on the comprehensive test coverage found, I can now provide the final response:

The set_node method is well integrated with test scenarios

The method is thoroughly tested across multiple test cases that verify node eviction functionality:

test_node_eviction: Tests basic node eviction scenario with replica migration
test_node_eviction_no_schedulable_node: Verifies behavior when no schedulable nodes are available
test_node_eviction_soft_anti_affinity: Tests eviction with anti-affinity settings
test_node_eviction_multiple_volume: Validates eviction behavior with multiple volumes
test_drain_with_block_for_eviction_success: Tests node draining with eviction policy

The implementation is correctly used to control node scheduling and eviction states, with proper assertions verifying the expected behavior.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find test files using this method and verify its usage context

# Search for test files using this method
echo "Test files using set_node:"
rg -l "set_node.*allowScheduling.*evictionRequested" --type py

# Show the usage context
echo -e "\nUsage context:"
rg -A 5 -B 5 "set_node.*allowScheduling.*evictionRequested" --type py

Length of output: 2803

Script:

#!/bin/bash
# Search for test files that use node eviction functionality
echo "Test files with eviction tests:"
rg -l "test.*eviction" --type py

# Show the test context
echo -e "\nTest context:"
rg -B 10 -A 10 "test.*eviction.*def" --type py

# Search for Robot Framework test cases using the set_node keyword
echo -e "\nRobot Framework test cases:"
rg -l "Set Node.*scheduling.*eviction" --type robot

Length of output: 411

Script:

#!/bin/bash
# Show the eviction-related test cases from the identified files
echo "Eviction test cases in test_node.py:"
rg -B 5 -A 15 "eviction" manager/integration/tests/test_node.py

echo -e "\nEviction test cases in test_backing_image.py:"
rg -B 5 -A 15 "eviction" manager/integration/tests/test_backing_image.py

# Also check for any e2e test files that might use this functionality
echo -e "\nE2E tests using node eviction:"
fd -e robot . e2e/suites/ -x grep -l "eviction" {} \;

Length of output: 30157

e2e/keywords/volume.resource (1)

122-128: Verification: Keyword aligns with PR objectives

This keyword effectively supports testing the issue described in longhorn/longhorn#9781 by providing a reliable way to verify that a volume remains detached after node eviction. The implementation allows for sufficient observation time to detect any unexpected reattachment.

To verify the keyword's usage in test cases:

✅ Verification successful

Keyword implementation and usage verified as correct

The keyword is correctly implemented and properly used in the test case to verify that a volume remains detached after node eviction, which directly aligns with issue #9781. The test case demonstrates the exact scenario where it:

Detaches a volume
Triggers node eviction
Verifies the volume stays detached for 60 seconds

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find test cases using this keyword to ensure proper integration
rg -l "Assert volume .+ remains detached for at least" --type robot

Length of output: 108

Script:

#!/bin/bash
# Check the usage context in the test file
rg -B 5 -A 5 "Assert volume .+ remains detached for at least" e2e/tests/regression/test_volume.robot

Length of output: 305

e2e/libs/node/node.py

coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)

e2e/tests/regression/test_volume.robot (1)

54-62: Consider enhancing the test robustness

While the test implementation is correct, consider adding these verifications for better robustness:

Verify the volume has all replicas healthy before detachment

Verify the node status after setting eviction
 Given Create volume 0 with    dataEngine=${DATA_ENGINE}
 And Attach volume 0
 And Wait for volume 0 healthy
+And Wait for volume 0 replica count 3
+And Wait for volume 0 all replicas healthy

 When Detach volume 0
 And Set node 1 with    allowScheduling=false    evictionRequested=true
+And Wait for node 1 eviction requested
 
 Then Wait for volume 0 detached
 And Assert volume 0 remains detached for at least 60 seconds

e2e/libs/node/node.py (1)

132-153: Consider enhancing error handling.

The current error handling could be improved to provide more context about failures and handle specific exceptions.

Consider applying this enhancement:

     def set_node(self, node_name: str, allowScheduling: bool, evictionRequested: bool) -> object:
         for _ in range(self.retry_count):
             try:
                 node = get_longhorn_client().by_id_node(node_name)
                 get_longhorn_client().update(node,
                     allowScheduling=allowScheduling,
                     evictionRequested=evictionRequested)
 
                 node = get_longhorn_client().by_id_node(node_name)
                 assert node.allowScheduling == allowScheduling
                 assert node.evictionRequested == evictionRequested
                 return node
-            except Exception as e:
-                logging(f"Updating node {node_name} error: {e}")
+            except AssertionError as e:
+                logging(f"Node {node_name} property verification failed: {e}")
+            except Exception as e:
+                logging(f"Failed to update node {node_name}: {e}")
+                if "not found" in str(e):
+                    raise  # Don't retry if node doesn't exist
             time.sleep(self.retry_interval)
         raise AssertionError(f"Updating node {node_name} failed")

🧰 Tools

🪛 Ruff

153-153: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between de9835c and a436709.

📒 Files selected for processing (5)

e2e/keywords/node.resource (1 hunks)
e2e/keywords/volume.resource (1 hunks)
e2e/libs/keywords/node_keywords.py (2 hunks)
e2e/libs/node/node.py (1 hunks)
e2e/tests/regression/test_volume.robot (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

e2e/keywords/node.resource
e2e/keywords/volume.resource
e2e/libs/keywords/node_keywords.py

🧰 Additional context used

🪛 Ruff

e2e/libs/node/node.py

153-153: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

🔇 Additional comments (3)

e2e/tests/regression/test_volume.robot (2)

12-13: LGTM: Resource additions are well-organized

The new resources for node and volume management are appropriately placed and follow the existing resource organization pattern.

48-53: LGTM: Well-structured test case documentation

The test case is properly documented with clear tags and references the correct issue. The documentation follows best practices.

e2e/libs/node/node.py (1)

132-153: Verify the impact on test cases.

Since this is a new method in the E2E testing framework, we should verify its usage in test cases.

✅ Verification successful

Let me gather more context about the test files and their relationships.

Let me search for Robot Framework test files using a different pattern since they typically use space-separated keywords rather than dot notation.

Let me check the content of the node.resource file and the test files that import it to understand how the node keywords are being used.

Based on the gathered information, I can now provide the final response:

The new set_node method is properly integrated and used in the test framework.

The method is part of a well-structured test framework where:

It's implemented in e2e/libs/node/node.py as the core functionality
It's wrapped in e2e/libs/keywords/node_keywords.py for Robot Framework integration
It's used through the node.resource import in multiple test files:
- e2e/tests/negative/node_drain.robot
- e2e/tests/regression/test_v2.robot
- e2e/tests/regression/test_volume.robot
- e2e/tests/regression/test_basic.robot

The test files, particularly node_drain.robot, contain comprehensive test cases for node operations including draining, scheduling, and eviction scenarios, which would exercise this new method.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find test cases using the new set_node method
# Expected: Find Robot Framework test files that use this keyword

# Search for direct usage of set_node in Robot Framework test files
echo "Searching for direct usage in test files..."
rg -t robot "Set Node.*With" .

# Search for Python test files that might use this method
echo "Searching for usage in Python test files..."
rg "set_node\(" -t py .

Length of output: 720

Script:

#!/bin/bash
# Check the content of node_keywords.py to understand the usage
echo "Content of node_keywords.py:"
cat e2e/libs/keywords/node_keywords.py

# Check for Robot Framework test files that might use node keywords
echo -e "\nSearching for Robot test files using node keywords:"
rg -t robot "Node\." .

Length of output: 2138

Script:

#!/bin/bash
# Search for Robot Framework test files using node keywords without dot notation
echo "Searching for Robot test files using node keywords:"
rg -t robot "Set Node|Disable Node|Enable Node|Reset Node" .

# Also check for any Robot Framework test files that might import node keywords
echo -e "\nSearching for Robot test files importing node keywords:"
rg -t robot "Resource.*node" .

Length of output: 721

Script:

#!/bin/bash
# Check the node.resource file
echo "Content of node.resource:"
cat e2e/tests/keywords/node.resource

# Check test files that use node keywords
echo -e "\nContent of node_drain.robot:"
cat e2e/tests/negative/node_drain.robot

echo -e "\nContent of test_basic.robot:"
cat e2e/tests/regression/test_basic.robot

Length of output: 19338

🧰 Tools

🪛 Ruff

153-153: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

e2e/libs/node/node.py

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f9a5a3f and 3d0a732.

📒 Files selected for processing (32)

e2e/keywords/node.resource (1 hunks)
e2e/keywords/volume.resource (1 hunks)
e2e/libs/keywords/backing_image_keywords.py (1 hunks)
e2e/libs/keywords/backup_keywords.py (1 hunks)
e2e/libs/keywords/backupstore_keywords.py (1 hunks)
e2e/libs/keywords/common_keywords.py (1 hunks)
e2e/libs/keywords/engine_image_keywords.py (1 hunks)
e2e/libs/keywords/engine_keywords.py (1 hunks)
e2e/libs/keywords/host_keywords.py (1 hunks)
e2e/libs/keywords/k8s_keywords.py (2 hunks)
e2e/libs/keywords/longhorn_deploy_keywords.py (1 hunks)
e2e/libs/keywords/network_keywords.py (1 hunks)
e2e/libs/keywords/node_keywords.py (2 hunks)
e2e/libs/keywords/persistentvolume_keywords.py (1 hunks)
e2e/libs/keywords/replica_keywords.py (1 hunks)
e2e/libs/keywords/setting_keywords.py (1 hunks)
e2e/libs/keywords/sharemanager_keywords.py (1 hunks)
e2e/libs/keywords/statefulset_keywords.py (0 hunks)
e2e/libs/keywords/storageclass_keywords.py (1 hunks)
e2e/libs/keywords/workload_keywords.py (1 hunks)
e2e/libs/node/node.py (1 hunks)
e2e/libs/volume/crd.py (0 hunks)
e2e/tests/negative/network_disconnect.robot (1 hunks)
e2e/tests/negative/stress_cpu.robot (1 hunks)
e2e/tests/negative/stress_filesystem.robot (1 hunks)
e2e/tests/negative/stress_memory.robot (1 hunks)
e2e/tests/regression/test_backing_image.robot (1 hunks)
e2e/tests/regression/test_backup.robot (1 hunks)
e2e/tests/regression/test_basic.robot (2 hunks)
e2e/tests/regression/test_settings.robot (1 hunks)
e2e/tests/regression/test_v2.robot (1 hunks)
e2e/tests/regression/test_volume.robot (2 hunks)

💤 Files with no reviewable changes (2)

e2e/libs/keywords/statefulset_keywords.py
e2e/libs/volume/crd.py

🚧 Files skipped from review as they are similar to previous changes (29)

e2e/keywords/node.resource
e2e/keywords/volume.resource
e2e/libs/keywords/backing_image_keywords.py
e2e/libs/keywords/backup_keywords.py
e2e/libs/keywords/backupstore_keywords.py
e2e/libs/keywords/common_keywords.py
e2e/libs/keywords/engine_image_keywords.py
e2e/libs/keywords/engine_keywords.py
e2e/libs/keywords/host_keywords.py
e2e/libs/keywords/k8s_keywords.py
e2e/libs/keywords/longhorn_deploy_keywords.py
e2e/libs/keywords/network_keywords.py
e2e/libs/keywords/node_keywords.py
e2e/libs/keywords/persistentvolume_keywords.py
e2e/libs/keywords/replica_keywords.py
e2e/libs/keywords/setting_keywords.py
e2e/libs/keywords/sharemanager_keywords.py
e2e/libs/keywords/storageclass_keywords.py
e2e/libs/node/node.py
e2e/tests/negative/network_disconnect.robot
e2e/tests/negative/stress_cpu.robot
e2e/tests/negative/stress_filesystem.robot
e2e/tests/negative/stress_memory.robot
e2e/tests/regression/test_backing_image.robot
e2e/tests/regression/test_backup.robot
e2e/tests/regression/test_basic.robot
e2e/tests/regression/test_settings.robot
e2e/tests/regression/test_v2.robot
e2e/tests/regression/test_volume.robot

🧰 Additional context used

🪛 Ruff

e2e/libs/keywords/workload_keywords.py

5-5: Redefinition of unused Node from line 4

(F811)

e2e/libs/keywords/workload_keywords.py

Signed-off-by: Chin-Ya Huang <[email protected]>

longhorn/longhorn-9781 Signed-off-by: Chin-Ya Huang <[email protected]>

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 3d0a732 and a132a4e.

📒 Files selected for processing (32)

e2e/keywords/node.resource (1 hunks)
e2e/keywords/volume.resource (1 hunks)
e2e/libs/keywords/backing_image_keywords.py (1 hunks)
e2e/libs/keywords/backup_keywords.py (1 hunks)
e2e/libs/keywords/backupstore_keywords.py (1 hunks)
e2e/libs/keywords/common_keywords.py (1 hunks)
e2e/libs/keywords/engine_image_keywords.py (1 hunks)
e2e/libs/keywords/engine_keywords.py (1 hunks)
e2e/libs/keywords/host_keywords.py (1 hunks)
e2e/libs/keywords/k8s_keywords.py (2 hunks)
e2e/libs/keywords/longhorn_deploy_keywords.py (1 hunks)
e2e/libs/keywords/network_keywords.py (1 hunks)
e2e/libs/keywords/node_keywords.py (2 hunks)
e2e/libs/keywords/persistentvolume_keywords.py (1 hunks)
e2e/libs/keywords/replica_keywords.py (1 hunks)
e2e/libs/keywords/setting_keywords.py (1 hunks)
e2e/libs/keywords/sharemanager_keywords.py (1 hunks)
e2e/libs/keywords/statefulset_keywords.py (0 hunks)
e2e/libs/keywords/storageclass_keywords.py (1 hunks)
e2e/libs/keywords/workload_keywords.py (0 hunks)
e2e/libs/node/node.py (1 hunks)
e2e/libs/volume/crd.py (0 hunks)
e2e/tests/negative/network_disconnect.robot (1 hunks)
e2e/tests/negative/stress_cpu.robot (1 hunks)
e2e/tests/negative/stress_filesystem.robot (1 hunks)
e2e/tests/negative/stress_memory.robot (1 hunks)
e2e/tests/regression/test_backing_image.robot (1 hunks)
e2e/tests/regression/test_backup.robot (1 hunks)
e2e/tests/regression/test_basic.robot (2 hunks)
e2e/tests/regression/test_settings.robot (1 hunks)
e2e/tests/regression/test_v2.robot (1 hunks)
e2e/tests/regression/test_volume.robot (2 hunks)

💤 Files with no reviewable changes (3)

e2e/libs/keywords/statefulset_keywords.py
e2e/libs/keywords/workload_keywords.py
e2e/libs/volume/crd.py

🚧 Files skipped from review as they are similar to previous changes (28)

e2e/keywords/node.resource
e2e/keywords/volume.resource
e2e/libs/keywords/backing_image_keywords.py
e2e/libs/keywords/backup_keywords.py
e2e/libs/keywords/backupstore_keywords.py
e2e/libs/keywords/common_keywords.py
e2e/libs/keywords/engine_image_keywords.py
e2e/libs/keywords/engine_keywords.py
e2e/libs/keywords/host_keywords.py
e2e/libs/keywords/k8s_keywords.py
e2e/libs/keywords/longhorn_deploy_keywords.py
e2e/libs/keywords/network_keywords.py
e2e/libs/keywords/node_keywords.py
e2e/libs/keywords/persistentvolume_keywords.py
e2e/libs/keywords/replica_keywords.py
e2e/libs/keywords/setting_keywords.py
e2e/libs/keywords/sharemanager_keywords.py
e2e/libs/keywords/storageclass_keywords.py
e2e/tests/negative/network_disconnect.robot
e2e/tests/negative/stress_cpu.robot
e2e/tests/negative/stress_filesystem.robot
e2e/tests/negative/stress_memory.robot
e2e/tests/regression/test_backing_image.robot
e2e/tests/regression/test_backup.robot
e2e/tests/regression/test_basic.robot
e2e/tests/regression/test_settings.robot
e2e/tests/regression/test_v2.robot
e2e/tests/regression/test_volume.robot

🔇 Additional comments (1)

e2e/libs/node/node.py (1)

131-153: Implementation aligns well with PR objectives.

The method provides reliable functionality for setting node eviction state, which is crucial for testing volume detachment during node eviction scenarios. The retry mechanism and validation checks ensure test reliability.

e2e/libs/node/node.py

longhorn/longhorn-9781 Signed-off-by: Chin-Ya Huang <[email protected]>

chriscchien

LGTM

c3y1huang self-assigned this Nov 14, 2024

c3y1huang force-pushed the 9781-volume-stuck-attached-during-node-eviction branch from eeadfdb to de9835c Compare November 14, 2024 09:11

c3y1huang marked this pull request as ready for review November 15, 2024 00:44

c3y1huang requested a review from a team as a code owner November 15, 2024 00:45

c3y1huang requested a review from chriscchien November 15, 2024 00:45

c3y1huang mentioned this pull request Nov 15, 2024

[BUG] Detached Volume Stuck in Attached State During Node Eviction longhorn/longhorn#9781

Closed

coderabbitai bot reviewed Nov 15, 2024

View reviewed changes

e2e/libs/node/node.py Outdated Show resolved Hide resolved

c3y1huang force-pushed the 9781-volume-stuck-attached-during-node-eviction branch from de9835c to a436709 Compare November 15, 2024 01:06

coderabbitai bot reviewed Nov 15, 2024

View reviewed changes

e2e/libs/node/node.py Outdated Show resolved Hide resolved

c3y1huang force-pushed the 9781-volume-stuck-attached-during-node-eviction branch 2 times, most recently from f9a5a3f to 3d0a732 Compare November 15, 2024 01:41

coderabbitai bot reviewed Nov 15, 2024

View reviewed changes

e2e/libs/keywords/workload_keywords.py Outdated Show resolved Hide resolved

c3y1huang added 2 commits November 15, 2024 09:46

chore(robot): cleanup

0475456

Signed-off-by: Chin-Ya Huang <[email protected]>

refactor(robot): abstract state validation from volume detachement

da2ade3

longhorn/longhorn-9781 Signed-off-by: Chin-Ya Huang <[email protected]>

c3y1huang force-pushed the 9781-volume-stuck-attached-during-node-eviction branch from 3d0a732 to a132a4e Compare November 15, 2024 01:46

coderabbitai bot reviewed Nov 15, 2024

View reviewed changes

e2e/libs/node/node.py Outdated Show resolved Hide resolved

test(robot): detached volume should not reattach after node eviction

a6b63ea

longhorn/longhorn-9781 Signed-off-by: Chin-Ya Huang <[email protected]>

c3y1huang force-pushed the 9781-volume-stuck-attached-during-node-eviction branch from a132a4e to a6b63ea Compare November 15, 2024 02:04

chriscchien approved these changes Nov 15, 2024

View reviewed changes

c3y1huang merged commit 1e777ba into longhorn:master Nov 15, 2024
9 checks passed

c3y1huang deleted the 9781-volume-stuck-attached-during-node-eviction branch November 15, 2024 06:57

This was referenced Nov 15, 2024

[BACKPORT][v1.7.3][BUG] Detached Volume Stuck in Attached State During Node Eviction longhorn/longhorn#9810

Closed

[BACKPORT][v1.6.4][BUG] Detached Volume Stuck in Attached State During Node Eviction longhorn/longhorn#9809

Closed

This was referenced Nov 15, 2024

test(robot): migrate test_replica_auto_balance_node_least_effort #2163

Merged

test(robot): Add case DR Volume Node Reboot During Restoration #2165

Merged

test(robot): migrate test_replica_auto_balance_disk_in_pressure #2167

Merged

coderabbitai bot mentioned this pull request Nov 26, 2024

test(robot): pull backup from another longhorn #2151

Merged

This was referenced Nov 26, 2024

ci: support cifs and azurite on pytest and e2e #2155

Merged

test(robot): add test case Shutdown Volume Node And Test Auto Reattach To A New Node #2169

Merged

This was referenced Dec 15, 2024

test(robot): retrieve asyncio task exception #2192

Merged

chore(robot): cleanup #2197

Merged

test(robot): v2 volume should cleanup resources when instance manager is deleted #2199

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(robot): volume should not reattach after node eviction #2164

test(robot): volume should not reattach after node eviction #2164

c3y1huang commented Nov 14, 2024 •

edited

Loading

coderabbitai bot commented Nov 14, 2024 •

edited

Loading

Rate limit exceeded

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

c3y1huang commented Nov 14, 2024

coderabbitai bot commented Nov 14, 2024

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

chriscchien left a comment

test(robot): volume should not reattach after node eviction #2164

test(robot): volume should not reattach after node eviction #2164

Conversation

c3y1huang commented Nov 14, 2024 • edited Loading

Which issue(s) this PR fixes:

What this PR does / why we need it:

Special notes for your reviewer:

Additional documentation or context

Summary by CodeRabbit

Release Notes

coderabbitai bot commented Nov 14, 2024 • edited Loading

Walkthrough

Changes

Assessment against linked issues

Possibly related PRs

Suggested reviewers

Poem

Rate limit exceeded

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

c3y1huang commented Nov 14, 2024

coderabbitai bot commented Nov 14, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

chriscchien left a comment

Choose a reason for hiding this comment

c3y1huang commented Nov 14, 2024 •

edited

Loading

coderabbitai bot commented Nov 14, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)