
test(robot): migrate test_replica_rebuild_per_volume_limit #2159

Merged

Conversation

Member

@yangchiu yangchiu commented Nov 8, 2024

Which issue(s) this PR fixes:

Issue longhorn/longhorn#9774

What this PR does / why we need it:

Migrate test_replica_rebuild_per_volume_limit from pytest to Robot Framework.

Special notes for your reviewer:

Additional documentation or context

Summary by CodeRabbit

  • New Features

    • Introduced functionality to delete a specified number of replicas for a volume.
    • Added a constraint ensuring only one replica can be rebuilt at a time for a volume.
  • Bug Fixes

    • Enhanced error handling and logging for replica deletion processes.
  • Tests

    • Added a regression test case to verify the behavior of volume replicas, ensuring only one replica is rebuilt at a time.

@yangchiu yangchiu self-assigned this Nov 8, 2024
@yangchiu yangchiu requested a review from a team as a code owner November 8, 2024 03:19

coderabbitai bot commented Nov 8, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

The changes introduce new functionalities for managing volume replicas in the system. Two new keywords are added for deleting a specified number of replicas and enforcing a constraint on replica rebuilding. Additionally, several methods in the volume_keywords, CRD, Rest, and Volume classes are updated to enhance replica management capabilities. A new regression test file is also created to validate the behavior of volume replicas, ensuring that only one replica rebuilding occurs at a time.
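
As context for the async keyword mentioned in the walkthrough, here is a minimal, self-contained sketch of the asyncio wait pattern it appears to rely on; the waiter function and the task name below are illustrative stand-ins, not the library's actual code:

import asyncio

# Stand-in for the real wait_for_replica_rebuilding_start poll.
async def fake_wait_for_rebuilding_start(volume_name):
    await asyncio.sleep(0.1)  # pretend we polled until rebuilding started

async def main():
    # Name the task after the volume so the completion log can identify it.
    task = asyncio.create_task(fake_wait_for_rebuilding_start("vol-0"), name="vol-0")
    done, _pending = await asyncio.wait({task}, return_when=asyncio.ALL_COMPLETED)
    print(f"Observed {done.pop().get_name()} started replica rebuilding")

asyncio.run(main())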

Changes

  • e2e/keywords/volume.resource: Added keywords Delete ${count} replicas of volume ${volume_id} and Only one replica rebuilding will start at a time for volume ${volume_id}.
  • e2e/libs/keywords/volume_keywords.py: Added delete_replicas(volume_name, count) and the async only_one_replica_rebuilding_will_start_at_a_time(volume_name); simplified wait_for_replica_rebuilding_to_complete.
  • e2e/libs/volume/crd.py: Added delete_replica_by_name(volume_name, replica_name); updated wait_for_replica_rebuilding_complete to accept an optional node_name.
  • e2e/libs/volume/rest.py: Updated wait_for_replica_rebuilding_start, is_replica_rebuilding_in_progress, and wait_for_replica_rebuilding_complete to accept an optional node_name.
  • e2e/libs/volume/volume.py: Added delete_replica_by_name(volume_name, replica_name); updated several methods to accept node_name=None.
  • e2e/tests/regression/test_replica.robot: Introduced the test case Test Replica Rebuilding Per Volume Limit to verify replica rebuilding behavior.

Assessment against linked issues

  • Migrate existing pytest test cases to robot (9774): addressed ✅

Possibly related PRs

Suggested reviewers

  • chriscchien

🐇 In the meadow, where replicas play,
New keywords hop in, brightening the day.
Deleting and rebuilding, all in a line,
With tests to ensure everything's fine.
So let’s cheer for the changes, a joyful delight,
For managing volumes just got more bright! 🌼



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (14)
e2e/tests/regression/test_replica.robot (4)

1-15: Consider enhancing the documentation.

While the basic documentation is present, it would be beneficial to add more details about:

  • The purpose of these replica test cases
  • Prerequisites for running these tests
  • Expected test environment requirements
-Documentation    Replica Test Cases
+Documentation    Test cases validating Longhorn volume replica behavior,
+...             including replica rebuilding limits and scheduling.
+...             Prerequisites: Longhorn cluster with sufficient nodes
+...             for replica distribution.

16-20: Add documentation for test variables.

The variables lack documentation explaining their purpose and chosen values. This information is crucial for maintainability.

 *** Variables ***
+# Number of test iterations
 ${LOOP_COUNT}    1
+# Maximum number of retry attempts for operations
 ${RETRY_COUNT}    300
+# Time interval between retries in seconds
 ${RETRY_INTERVAL}    1
+# Data engine version to use for volume creation
 ${DATA_ENGINE}    v1

35-49: Add soft anti-affinity validation.

While soft anti-affinity is enabled, the test doesn't verify that replicas are actually scheduled on different nodes.

Add a verification step after replica creation:

 And Wait for volume 0 healthy
+And Verify volume 0 replicas are on different nodes
 And Write data to volume 0

23-25: Consider adding error scenarios.

While the happy path is well tested, consider adding test cases for:

  1. Node failure during replica rebuilding
  2. Network partition scenarios
  3. Disk full conditions

Would you like me to help create additional test cases for these scenarios?

e2e/libs/volume/volume.py (2)

107-108: Add docstring to clarify node_name parameter usage.

Consider adding a docstring to explain the behavior when node_name is provided vs. when it's None. This will help other developers understand how to use this method effectively.

     def wait_for_replica_rebuilding_start(self, volume_name, node_name=None):
+        """Wait for replica rebuilding to start.
+        
+        Args:
+            volume_name: Name of the volume
+            node_name: Optional. If provided, wait for rebuilding on specific node.
+                      If None, wait for any replica rebuilding.
+        """
         return self.volume.wait_for_replica_rebuilding_start(volume_name, node_name)

110-111: Add docstring to clarify node_name parameter usage.

Similar to the previous method, add a docstring to explain the behavior with and without node_name parameter.

     def is_replica_rebuilding_in_progress(self, volume_name, node_name=None):
+        """Check if replica rebuilding is in progress.
+        
+        Args:
+            volume_name: Name of the volume
+            node_name: Optional. If provided, check rebuilding on specific node.
+                      If None, check for any replica rebuilding.
+        Returns:
+            bool: True if rebuilding is in progress, False otherwise.
+        """
         return self.volume.is_replica_rebuilding_in_progress(volume_name, node_name)
e2e/keywords/volume.resource (2)

102-104: Add documentation for the new keyword.

The implementation looks good and follows the established patterns. Consider adding documentation to describe the purpose and behavior of this keyword, similar to other keywords in the file.

 Delete ${count} replicas of volume ${volume_id}
+    [Documentation]    Delete the specified number of replicas from a volume.
     ${volume_name} =    generate_name_with_suffix    volume    ${volume_id}
     delete_replicas    ${volume_name}    ${count}

199-201: Add documentation for the new keyword.

The implementation looks good and follows the established patterns. Consider adding documentation to describe the purpose and behavior of this keyword, similar to other keywords in the file.

 Only one replica rebuilding will start at a time for volume ${volume_id}
+    [Documentation]    Verify that only one replica rebuilding process occurs at a time for the specified volume.
     ${volume_name} =    generate_name_with_suffix    volume    ${volume_id}
     only_one_replica_rebuilding_will_start_at_a_time    ${volume_name}
e2e/libs/keywords/volume_keywords.py (1)

209-223: Improve clarity and add timeout handling.

The implementation could be enhanced for better clarity and robustness:

Consider these improvements:

 async def only_one_replica_rebuilding_will_start_at_a_time(self, volume_name):
+    timeout = 300  # Add appropriate timeout value
 
-    async def wait_for_replica_rebuilding():
-        tasks = [
-            asyncio.create_task(self.volume.wait_for_replica_rebuilding_start(volume_name), name=volume_name),
-        ]
-
-        done, pending = await asyncio.wait(tasks, return_when=asyncio.ALL_COMPLETED)
-        logging(f"Observed {done.pop().get_name()} started replica rebuilding")
-
-    await wait_for_replica_rebuilding()
+    try:
+        logging(f"Waiting for replica rebuilding to start for volume {volume_name}")
+        await asyncio.wait_for(
+            self.volume.wait_for_replica_rebuilding_start(volume_name),
+            timeout=timeout
+        )
+        logging(f"Replica rebuilding started for volume {volume_name}")
 
-    assert self.volume.is_replica_rebuilding_in_progress(volume_name), \
-        f"Expect {volume_name} replica rebuilding in progress"
+        assert self.volume.is_replica_rebuilding_in_progress(volume_name), \
+            f"Expected volume {volume_name} to have replica rebuilding in progress, but no rebuilding was detected"
+    except asyncio.TimeoutError:
+        raise TimeoutError(f"Timeout waiting for replica rebuilding to start for volume {volume_name} after {timeout} seconds")
e2e/libs/volume/rest.py (5)

140-140: Use is not None for None comparison

In Python, it's recommended to use is not None when comparing to None for clarity and adherence to PEP 8 guidelines.

Apply this diff to correct the comparison:

-            assert rebuilding_replica_name != None, f"Waiting for replica rebuilding start for volume {volume_name} on node {node_name} failed: replicas = {v.replicas}"
+            assert rebuilding_replica_name is not None, f"Waiting for replica rebuilding start for volume {volume_name} on node {node_name} failed: replicas = {v.replicas}"


144-144: Replace unused loop variable i with underscore _

The loop variable i is not used within the loop body. Using _ indicates that the variable is intentionally unused.

Apply this diff to improve readability:

-            for i in range(self.retry_count):
+            for _ in range(self.retry_count):


183-183: Simplify conditional assignment of node_name

You can enhance readability by rearranging the conditional expression.

Apply this diff to simplify the assignment:

-                node_name = replica.hostId if not node_name else node_name
+                node_name = node_name if node_name else replica.hostId

Alternatively, you can use the more concise:

+                node_name = node_name or replica.hostId


237-237: Simplify conditional expression in logging statement

For improved readability, rearrange the conditional expression in the log message.

Apply this diff:

-            logging(f"wait for {volume_name} replica rebuilding completed on {'all nodes' if not node_name else node_name} ... ({i})")
+            logging(f"wait for {volume_name} replica rebuilding completed on {node_name if node_name else 'all nodes'} ... ({i})")


274-275: Consistent conditional expressions in assertions and logging

To maintain consistency and enhance readability, adjust the conditional expressions in both the logging and assertion statements.

Apply this diff:

-        logging(f"Completed volume {volume_name} replica rebuilding on {'all nodes' if not node_name else node_name}")
-        assert completed, f"Expect volume {volume_name} replica rebuilding completed on {'all nodes' if not node_name else node_name}"
+        logging(f"Completed volume {volume_name} replica rebuilding on {node_name if node_name else 'all nodes'}")
+        assert completed, f"Expect volume {volume_name} replica rebuilding completed on {node_name if node_name else 'all nodes'}"

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 78eca51 and 77fab67.

📒 Files selected for processing (6)
  • e2e/keywords/volume.resource (2 hunks)
  • e2e/libs/keywords/volume_keywords.py (3 hunks)
  • e2e/libs/volume/crd.py (2 hunks)
  • e2e/libs/volume/rest.py (4 hunks)
  • e2e/libs/volume/volume.py (2 hunks)
  • e2e/tests/regression/test_replica.robot (1 hunks)
🧰 Additional context used
🪛 Ruff
e2e/libs/volume/rest.py

140-140: Comparison to None should be cond is not None

Replace with cond is not None

(E711)


144-144: Loop control variable i not used within loop body

(B007)


183-183: Use node_name if node_name else replica.hostId instead of replica.hostId if not node_name else node_name

Replace with node_name if node_name else replica.hostId

(SIM212)


237-237: Use node_name if node_name else 'all nodes' instead of 'all nodes' if not node_name else node_name

Replace with node_name if node_name else 'all nodes'

(SIM212)


274-274: Use node_name if node_name else 'all nodes' instead of 'all nodes' if not node_name else node_name

Replace with node_name if node_name else 'all nodes'

(SIM212)


275-275: Use node_name if node_name else 'all nodes' instead of 'all nodes' if not node_name else node_name

Replace with node_name if node_name else 'all nodes'

(SIM212)

🔇 Additional comments (4)
e2e/libs/volume/volume.py (1)

104-106: LGTM: Well-structured addition of replica deletion by name.

The new method enhances the API by providing more granular control over replica deletion, which is beneficial for testing scenarios requiring precise replica management.

e2e/libs/keywords/volume_keywords.py (1)

162-162: LGTM! Code simplification improves maintainability.

The simplified implementation correctly delegates the wait operation to the volume class.

e2e/libs/volume/crd.py (1)

452-452: LGTM! Good flexibility improvement.

Making the node_name parameter optional improves the method's flexibility while maintaining backward compatibility.

e2e/libs/volume/rest.py (1)

149-156: Good practice: Checking for concurrent replica rebuilding

The addition of checks to ensure that only one replica is rebuilding at a time enhances data consistency and prevents potential conflicts.
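
As a concrete illustration of that constraint, here is a minimal, self-contained sketch of such a check. Treating mode "WO" (write-only) as the rebuilding marker is an assumption for illustration, not taken verbatim from this PR's diff:

def assert_single_rebuild(replicas):
    # At most one replica may be in the rebuilding ("WO") state at any moment.
    rebuilding = [r["name"] for r in replicas if r.get("mode") == "WO"]
    assert len(rebuilding) <= 1, \
        f"Expected at most one rebuilding replica, found: {rebuilding}"

# Usage: passes with a single rebuilding replica among healthy ("RW") ones.
assert_single_rebuild([{"name": "r1", "mode": "RW"}, {"name": "r2", "mode": "WO"}])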

Comment on lines +27 to +29
... 1. Set soft anti-affinity to `true`.
... 2. Create a volume with 1 replica.
... 3. Attach the volume and write a few hundreds MB data to it.

@coderabbitai coderabbitai bot Nov 8, 2024


⚠️ Potential issue

Documentation inconsistency with implementation.

The documentation states "Create a volume with 1 replica" but the implementation creates a volume with 3 replicas.

Either update the documentation or adjust the implementation to match:

-    ...    2. Create a volume with 1 replica.
+    ...    2. Create a volume with 3 replicas.

Contributor


Agree with this comment; it makes the test description and the test code more consistent.



@chriscchien, glad to hear that you agree. Updating the documentation step to match the implementation will ensure consistency. Here's the updated step:

    ...    2. Create a volume with 3 replicas.

Let me know if you need any further assistance.

Comment on lines +39 to +40
And Write data to volume 0



🛠️ Refactor suggestion

Add validation for data write operation.

The test writes data but doesn't verify the write operation's success before proceeding with replica operations.

 And Write data to volume 0
+And Verify data write succeeded on volume 0

Comment on lines +128 to 129
def wait_for_replica_rebuilding_complete(self, volume_name, node_name=None):
    return self.volume.wait_for_replica_rebuilding_complete(volume_name, node_name)


💡 Codebase verification

Inconsistency found in node_name parameter handling

The verification reveals inconsistencies in the node_name parameter handling across implementations:

  • base.py: All methods declare node_name as required parameter
  • rest.py: All methods correctly implement node_name=None as optional
  • crd.py: Only wait_for_replica_rebuilding_complete has node_name=None, while other rebuilding methods have required node_name
  • volume.py: All methods correctly implement node_name=None as optional

The base class and CRD implementation need to be updated to maintain consistency with the optional node_name parameter pattern.
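
A hedged sketch of that alignment, using the method names from the review (the base-class body here is illustrative, not the repository's actual base.py):

from abc import ABC, abstractmethod

class Base(ABC):
    # Give the abstract signatures the same optional node_name default that
    # rest.py and volume.py already use, so all implementations stay consistent.
    @abstractmethod
    def wait_for_replica_rebuilding_start(self, volume_name, node_name=None):
        raise NotImplementedError

    @abstractmethod
    def is_replica_rebuilding_in_progress(self, volume_name, node_name=None):
        raise NotImplementedError

    @abstractmethod
    def wait_for_replica_rebuilding_complete(self, volume_name, node_name=None):
        raise NotImplementedError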

🔗 Analysis chain

Add docstring and verify implementation consistency.

Add a docstring to explain the behavior with and without node_name parameter.

     def wait_for_replica_rebuilding_complete(self, volume_name, node_name=None):
+        """Wait for replica rebuilding to complete.
+        
+        Args:
+            volume_name: Name of the volume
+            node_name: Optional. If provided, wait for rebuilding completion on specific node.
+                      If None, wait for any replica rebuilding completion.
+        """
         return self.volume.wait_for_replica_rebuilding_complete(volume_name, node_name)

Let's verify that the underlying implementations in CRD and Rest classes handle the optional node_name consistently:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if CRD and Rest implementations handle optional node_name consistently

# Test: Search for the implementation of these methods in CRD and Rest classes
echo "Checking implementations in CRD and Rest classes..."
rg -A 10 "def (wait_for_replica_rebuilding_start|is_replica_rebuilding_in_progress|wait_for_replica_rebuilding_complete)" "e2e/libs/volume/"

Length of output: 7703

Comment on lines +131 to +137
def delete_replicas(self, volume_name, count):
    replica_list = self.replica.get(volume_name, node_name="")
    replica_names = [replica['metadata']['name'] for replica in replica_list]
    for i in range(int(count)):
        logging(f"Deleting volume {volume_name} replica volume {replica_names[i]}")
        self.volume.delete_replica_by_name(volume_name, replica_names[i])



🛠️ Refactor suggestion

Add input validation and error handling.

The method needs additional safeguards:

  1. Validate that count is a positive integer and doesn't exceed available replicas
  2. Add error handling for the delete operation

Consider this implementation:

 def delete_replicas(self, volume_name, count):
+    try:
+        count = int(count)
+        if count <= 0:
+            raise ValueError("Count must be positive")
+
         replica_list = self.replica.get(volume_name, node_name="")
         replica_names = [replica['metadata']['name'] for replica in replica_list]
+        if count > len(replica_names):
+            raise ValueError(f"Cannot delete {count} replicas, only {len(replica_names)} available")
+
         for i in range(int(count)):
             logging(f"Deleting volume {volume_name} replica volume {replica_names[i]}")
-            self.volume.delete_replica_by_name(volume_name, replica_names[i])
+            try:
+                self.volume.delete_replica_by_name(volume_name, replica_names[i])
+            except Exception as e:
+                logging(f"Failed to delete replica {replica_names[i]}: {str(e)}")
+                raise
+    except ValueError as e:
+        logging(f"Invalid input: {str(e)}")
+        raise
+    except Exception as e:
+        logging(f"Failed to delete replicas: {str(e)}")
+        raise

Comment on lines +423 to +439
def delete_replica_by_name(self, volume_name, replica_name):
    replica = self.obj_api.get_namespaced_custom_object(
        group="longhorn.io",
        version="v1beta2",
        namespace="longhorn-system",
        plural="replicas",
        name=replica_name
    )
    logging(f"Deleting replica {replica['metadata']['name']}")
    self.obj_api.delete_namespaced_custom_object(
        group="longhorn.io",
        version="v1beta2",
        namespace="longhorn-system",
        plural="replicas",
        name=replica['metadata']['name']
    )



🛠️ Refactor suggestion

Add error handling and optimize replica deletion.

The implementation could be improved in the following ways:

  1. Add error handling for non-existent replicas
  2. Remove redundant metadata access since we already have the replica name

Consider this implementation:

 def delete_replica_by_name(self, volume_name, replica_name):
-    replica = self.obj_api.get_namespaced_custom_object(
-        group="longhorn.io",
-        version="v1beta2",
-        namespace="longhorn-system",
-        plural="replicas",
-        name=replica_name
-    )
-    logging(f"Deleting replica {replica['metadata']['name']}")
-    self.obj_api.delete_namespaced_custom_object(
-        group="longhorn.io",
-        version="v1beta2",
-        namespace="longhorn-system",
-        plural="replicas",
-        name=replica['metadata']['name']
-    )
+    try:
+        logging(f"Deleting replica {replica_name}")
+        self.obj_api.delete_namespaced_custom_object(
+            group="longhorn.io",
+            version="v1beta2",
+            namespace="longhorn-system",
+            plural="replicas",
+            name=replica_name
+        )
+    except ApiException as e:
+        if e.status == 404:
+            logging(f"Replica {replica_name} not found")
+        else:
+            raise e


When Update volume 0 replica count to 5
Then Only one replica rebuilding will start at a time for volume 0
And Wait until volume 0 replicas rebuilding completed
Contributor

@chriscchien chriscchien Nov 8, 2024


Is it possible to rename the keyword Wait until volume 0 replicas rebuilding completed to emphasize that it ensures only one replica stays in the rebuilding (WO) state until the rebuild is completed, to better match the test case name Test Replica Rebuilding Per Volume Limit? Besides this, no further questions. Thank you.

Member Author


@chriscchien Could you suggest an appropriate name?

Contributor


How about Only one replica for volume 0 rebuilds at a time until the rebuild is completed for the two steps below?

 Then Only one replica rebuilding will start at a time for volume 0
 And Wait until volume 0 replicas rebuilding completed

Member Author


@chriscchien How about adding another step Monitor only one replica rebuilding will start at a time for volume 0:

Then Only one replica rebuilding will start at a time for volume 0
And Monitor only one replica rebuilding will start at a time for volume 0
And Wait until volume 0 replicas rebuilding completed

This aligns with the description in the test skeleton.

@yangchiu yangchiu requested review from chriscchien and a team November 11, 2024 00:56
@yangchiu yangchiu force-pushed the migrate-replica-rebuild-per-volume-limit branch from 77fab67 to cf7e52d Compare November 11, 2024 06:10

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (7)
e2e/tests/regression/test_replica.robot (2)

16-21: Add documentation for test variables.

Consider adding comments to explain the significance of these values, particularly:

  • Why LOOP_COUNT is set to 1
  • The reasoning behind 300 retries with 1-second intervals
  • What DATA_ENGINE=v1 represents
 *** Variables ***
+# Number of test iterations
 ${LOOP_COUNT}    1
+# Maximum attempts to wait for volume operations (5 minutes total)
 ${RETRY_COUNT}    300
+# Interval between retry attempts in seconds
 ${RETRY_INTERVAL}    1
+# Storage engine version to use
 ${DATA_ENGINE}    v1

23-34: Documentation needs additional clarity.

While the test steps are well-documented, consider adding:

  1. Expected outcomes for each step
  2. Explanation of why 5 replicas are chosen in step 4
  3. Clarification on the data size mentioned in step 3 ("few hundreds MB")
e2e/keywords/volume.resource (2)

102-104: Add documentation for the new keyword.

The implementation looks good, but adding documentation would improve maintainability.

Add documentation to describe the keyword's purpose and parameters:

 Delete ${count} replicas of volume ${volume_id}
+    [Documentation]    Deletes the specified number of replicas from a volume
+    ...    ${count} - Number of replicas to delete
+    ...    ${volume_id} - ID of the volume
     ${volume_name} =    generate_name_with_suffix    volume    ${volume_id}
     delete_replicas    ${volume_name}    ${count}

203-206: Add documentation for the replica rebuilding constraint keyword.

The implementation correctly validates the "one at a time" rebuilding constraint.

Add documentation to clarify the keyword's purpose:

 Only one replica rebuilding will start at a time for volume ${volume_id}
+    [Documentation]    Validates that only one replica rebuilding process occurs at a time for the specified volume
+    ...    ${volume_id} - ID of the volume to monitor
     ${volume_name} =    generate_name_with_suffix    volume    ${volume_id}
     only_one_replica_rebuilding_will_start_at_a_time    ${volume_name}
e2e/libs/volume/rest.py (3)

140-140: Fix comparison with None.

Use is not None instead of != None for Python identity comparison.

-        assert rebuilding_replica_name != None, f"Waiting for replica rebuilding start for volume {volume_name} on node {node_name} failed: replicas = {v.replicas}"
+        assert rebuilding_replica_name is not None, f"Waiting for replica rebuilding start for volume {volume_name} on node {node_name} failed: replicas = {v.replicas}"


183-183: Simplify node_name assignment.

The current expression can be simplified for better readability.

-            node_name = replica.hostId if not node_name else node_name
+            node_name = node_name if node_name else replica.hostId


237-237: Simplify string interpolation in log messages.

Multiple instances of string interpolation can be simplified for better readability.

-            logging(f"wait for {volume_name} replica rebuilding completed on {'all nodes' if not node_name else node_name} ... ({i})")
+            logging(f"wait for {volume_name} replica rebuilding completed on {node_name if node_name else 'all nodes'} ... ({i})")

-        logging(f"Completed volume {volume_name} replica rebuilding on {'all nodes' if not node_name else node_name}")
+        logging(f"Completed volume {volume_name} replica rebuilding on {node_name if node_name else 'all nodes'}")

-        assert completed, f"Expect volume {volume_name} replica rebuilding completed on {'all nodes' if not node_name else node_name}"
+        assert completed, f"Expect volume {volume_name} replica rebuilding completed on {node_name if node_name else 'all nodes'}"

Also applies to: 274-274, 275-275


📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 77fab67 and cf7e52d.

📒 Files selected for processing (6)
  • e2e/keywords/volume.resource (3 hunks)
  • e2e/libs/keywords/volume_keywords.py (3 hunks)
  • e2e/libs/volume/crd.py (2 hunks)
  • e2e/libs/volume/rest.py (4 hunks)
  • e2e/libs/volume/volume.py (2 hunks)
  • e2e/tests/regression/test_replica.robot (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • e2e/libs/keywords/volume_keywords.py
  • e2e/libs/volume/crd.py
  • e2e/libs/volume/volume.py

🔇 Additional comments (5)
e2e/tests/regression/test_replica.robot (3)

1-15: LGTM: Well-structured test setup with comprehensive resources.

The test framework setup includes all necessary resources and proper test lifecycle hooks.


35-51: Enhance test robustness with additional validations.

  1. Consider adding volume state validation after each major operation:
 When Update volume 0 replica count to 5
+And Verify volume 0 is attached
 Then Only one replica rebuilding will start at a time for volume 0
  2. As suggested in the previous review, consider renaming the keywords to better reflect their purpose:
-Then Only one replica rebuilding will start at a time for volume 0
-And Wait until volume 0 replicas rebuilding completed
+Then Only one replica for volume 0 rebuilds at a time until the rebuild is completed
+And Wait for volume 0 replica rebuild completion

1-51: Verify complete migration from pytest.

Let's ensure all functionality from the original pytest test has been properly migrated.

✅ Verification successful

Migration verification completed - test functionality matches

The Robot Framework test implementation successfully covers all the key functionality from the original pytest version:

  1. Both tests verify the same core scenario of one-replica-at-a-time rebuilding:

    • Setting soft anti-affinity
    • Creating volume with initial replicas
    • Writing test data
    • Scaling to 5 replicas
    • Monitoring rebuild states
    • Deleting 4 replicas
    • Re-verifying rebuild behavior
    • Data integrity check
  2. The Robot test actually improves the verification by adding explicit steps to:

    • Monitor the single-replica rebuilding constraint
    • Wait for rebuild completion
    • Verify volume health

The migration has maintained full test coverage while making the test steps more explicit and readable in Robot Framework's format.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Compare pytest and robot test implementations
# to ensure complete migration

# Find the original pytest file
echo "Looking for original pytest file..."
rg -l "test_replica_rebuild_per_volume_limit" --type py

# Search for any remaining pytest references to this test
echo "Checking for remaining pytest references..."
rg "test_replica_rebuild_per_volume_limit" --type py -C 5

Length of output: 1172


Script:

#!/bin/bash
# Get the full implementation of the pytest test for comparison
echo "Getting pytest implementation..."
rg -A 50 "def test_replica_rebuild_per_volume_limit" manager/integration/tests/test_scheduling.py

# Check for any helper functions used in the pytest implementation
echo "Checking for helper functions..."
rg "def (check_volume_data|cleanup_volume|wait_for_rebuild_complete)" manager/integration/tests/test_scheduling.py -A 5

Length of output: 2396

e2e/libs/volume/rest.py (2)

Line range hint 121-171: LGTM! Good addition of concurrent rebuilding check.

The changes to make node_name optional and the addition of monitoring for concurrent rebuilding (lines 149-155) improve the robustness of the test framework.

🧰 Tools
🪛 Ruff

140-140: Comparison to None should be cond is not None

Replace with cond is not None

(E711)


144-144: Loop control variable i not used within loop body

(B007)


234-270: LGTM! Well-documented rebuild status handling.

The changes effectively handle both node-specific and all-node rebuilding scenarios. The comments about rebuildStatus reliability (lines 251-258) provide valuable context for future maintainers.
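
To make the node-filtering idea concrete, here is a minimal, self-contained sketch of such a completion check. The entry fields used below (replica name, state, and a replica-to-node mapping) are illustrative assumptions, not field names verified against the Longhorn API:

def rebuilding_complete(rebuild_statuses, node_of, node_name=None):
    # Completion holds when no matching rebuild entry is still in progress.
    relevant = [s for s in rebuild_statuses
                if node_name is None or node_of.get(s["replica"]) == node_name]
    return all(s["state"] == "complete" for s in relevant)

# Usage: an in-progress rebuild on node-1 blocks the all-nodes check,
# while the node-2 check already passes.
statuses = [{"replica": "r1", "state": "in_progress"},
            {"replica": "r2", "state": "complete"}]
nodes = {"r1": "node-1", "r2": "node-2"}
print(rebuilding_complete(statuses, nodes))            # False
print(rebuilding_complete(statuses, nodes, "node-2"))  # True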


Comment on lines +185 to +188
Monitor only one replica rebuilding will start at a time for volume ${volume_id}
    ${volume_name} =    generate_name_with_suffix    volume    ${volume_id}
    wait_for_replica_rebuilding_to_complete    ${volume_name}



⚠️ Potential issue

Keyword implementation may not match its intended purpose.

The keyword name suggests monitoring that only one replica rebuilds at a time, but the implementation only waits for rebuilding completion. This might not effectively validate the "one at a time" constraint.

Consider implementing proper monitoring logic:

 Monitor only one replica rebuilding will start at a time for volume ${volume_id}
     ${volume_name} =    generate_name_with_suffix    volume    ${volume_id}
-    wait_for_replica_rebuilding_to_complete   ${volume_name}
+    only_one_replica_rebuilding_will_start_at_a_time    ${volume_name}


@yangchiu yangchiu requested review from chriscchien and removed request for chriscchien November 11, 2024 06:33
Contributor

@chriscchien chriscchien left a comment


LGTM
