
feat(backup): fix test cases for multiple backup stores #2202

Merged
merged 1 commit on Dec 19, 2024

Conversation

mantissahz
Contributor

@mantissahz mantissahz commented Dec 18, 2024

Which issue(s) this PR fixes:

Issue: longhorn/longhorn#5411

What this PR does / why we need it:

Special notes for your reviewer:

Additional documentation or context

Summary by CodeRabbit

  • New Features

    • Introduced new test cases for managing volume snapshots and backing images, including support for specific selectors.
  • Bug Fixes

    • Improved robustness of backup and volume management logic, particularly during restoration processes.
  • Documentation

    • Updated comments and docstrings for clarity in test functions.
  • Refactor

    • Streamlined function signatures for better clarity regarding expected input types.
    • Standardized backup volume retrieval methods across multiple tests.
  • Tests

    • Adjusted tests to ensure accurate handling of backup operations and states.
    • Increased sleep duration for recurring backup job checks to improve timing accuracy.
    • Enhanced validation of backup states during backup and restore operations.

@mantissahz mantissahz requested a review from derekbit December 18, 2024 02:56
@mantissahz mantissahz self-assigned this Dec 18, 2024
@mantissahz mantissahz requested a review from a team as a code owner December 18, 2024 02:56

coderabbitai bot commented Dec 18, 2024


📥 Commits

Reviewing files that changed from the base of the PR and between 8599ce7 and ccf3cdd.

📒 Files selected for processing (10)
  • manager/integration/tests/backupstore.py (4 hunks)
  • manager/integration/tests/common.py (4 hunks)
  • manager/integration/tests/test_basic.py (28 hunks)
  • manager/integration/tests/test_csi.py (1 hunks)
  • manager/integration/tests/test_csi_snapshotter.py (2 hunks)
  • manager/integration/tests/test_kubernetes.py (2 hunks)
  • manager/integration/tests/test_metric.py (4 hunks)
  • manager/integration/tests/test_recurring_job.py (11 hunks)
  • manager/integration/tests/test_statefulset.py (1 hunks)
  • manager/integration/tests/test_system_backup_restore.py (4 hunks)

Walkthrough

The pull request introduces modifications to several test integration files within the Longhorn project, focusing on backup and volume management operations. The changes primarily involve refining function signatures, improving backup volume retrieval mechanisms, and enhancing test robustness. A key addition is the find_backup_volume function in the common module, which standardizes backup volume retrieval across multiple test files. The modifications aim to improve code consistency, error handling, and test reliability in backup and restore scenarios.
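The summary names a new retry-capable `find_backup_volume` helper in `common.py` but does not show it. As a rough illustration of the pattern, here is a minimal sketch of what such a helper might look like; the constant values and the client API (`list_backup_volume`, `volumeName`) are assumptions for illustration, and a fake client is included so the sketch runs standalone:

```python
import time

# Assumed defaults; the real constants live in common.py.
RETRY_COUNTS = 30
RETRY_INTERVAL = 0.1


def find_backup_volume(client, volume_name, retry=RETRY_COUNTS):
    """Poll the backup volume list until an entry for volume_name appears.

    Returns the backup volume object, or None if nothing shows up
    within `retry` polls.
    """
    for _ in range(retry):
        for bv in client.list_backup_volume():
            if bv.volumeName == volume_name:
                return bv
        time.sleep(RETRY_INTERVAL)
    return None


class FakeBV:
    """Stand-in for a Longhorn backup volume object (illustration only)."""
    def __init__(self, name):
        self.volumeName = name


class FakeClient:
    """Fake client whose backup volume only becomes visible on the third poll."""
    def __init__(self):
        self.calls = 0

    def list_backup_volume(self):
        self.calls += 1
        return [FakeBV("vol-1")] if self.calls >= 3 else []
```

The retry loop is what lets tests tolerate the delay between requesting a backup and the backup volume CR appearing in the store.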

Changes

File Changes
manager/integration/tests/backupstore.py Enhanced error handling and retry logic in backup store setting functions; added constants DEFAULT_BACKUPTARGET and object_has_been_modified.
manager/integration/tests/common.py Updated function signatures for delete_backup and delete_backup_volume; added retry mechanism to find_backup_volume; modified parameter names in various functions.
manager/integration/tests/test_basic.py Updated backup deletion and creation function signatures; enhanced error handling and validation for backup operations.
manager/integration/tests/test_csi.py Modified delete_backup function call in backupstore_test.
manager/integration/tests/test_kubernetes.py Updated backup volume handling and method calls; refined logic for managing backup volumes.
manager/integration/tests/test_metric.py Added find_backup_volume import and adjusted backup creation process; increased sleep duration for backup job checks.
manager/integration/tests/test_recurring_job.py Imported find_backup_volume and standardized backup volume retrieval; updated calls to wait_for_backup_count.
manager/integration/tests/test_statefulset.py Changed backup volume attribute access from bv.name to bv.volumeName.
manager/integration/tests/test_system_backup_restore.py Added find_backup_volume import and standardized backup volume retrieval.
manager/integration/tests/test_csi_snapshotter.py Introduced new tests for backing images with selectors; enhanced existing tests for volume snapshots.

Suggested reviewers

  • yangchiu
  • chriscchien

Possibly related PRs

Poem

🐰 Backup Bunnies Hop and Test
In Longhorn's meadow, code's finessed
Functions dance with grace and might
Volumes backed up, tests shine bright
Reliability hops with zest! 🚀


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (17)
manager/integration/tests/test_basic.py (14)

Line range hint 259-259: Ensure consistent usage of helper function.
Using find_backup_volume keeps the implementation uniform, but confirm that all references to backup volumes in this function now use the same helper for clarity.


Line range hint 315-315: Double-check variable scoping.
system_backup_name might also be needed outside this block. Consider placing it closer to where it’s first used if necessary.


628-628: Validate error handling during backup deletion.
When calling delete_backup, consider checking if the backup still exists or if an exception is raised so that errors won't be silently ignored.


697-697: Potential repeated code for snapshot creation.
If possible, encapsulate repeated snapshot creation logic to reduce duplication.


719-719: Backup volume object usage.
Ensure the correct backup volume object is being referenced if multiple backups are present in the system.


731-731: Prevent magic numbers.
RETRY_BACKUP_COUNTS is used directly. Document or clarify its usage to make the code self-explanatory.


877-877: Check for snapshot infiltration.
When combining backups and snapshots, confirm that older snapshots aren’t inadvertently re-used.


978-978: Update docstring for backup_status_for_unavailable_replicas_test.
Current docstring might need to mention concurrency checks around node scheduling for clarity.


1073-1073: Add clarifying comment on newly introduced logic.
The new logic for discounting partial data might be unclear to new contributors. A short comment helps future maintainers.


1094-1094: Ensure consistent naming convention.
The variable dr_vol_name2 diverges from the naming style used in other places. Consider using a uniform naming pattern.


1164-1164: Enhance error messaging.
If found is not set to True after this loop, add a more descriptive error or debug statement for quicker diagnosis.


1311-1311: Ensure consistent naming for function arguments.
The function’s argument might differ from the internal naming used in the data object. Consider aligning them for clarity.


1349-1349: Enhance documentation for post-deletion verification.
If post-deletion checks are essential, add inline doc or logging to ensure a future dev grasps why they’re needed.


1854-1854: Consider more robust error handling.
When dealing with corrupted backup files, ensure an exception triggers a user-friendly error or retry logic.

manager/integration/tests/test_system_backup_restore.py (1)

315-315: Clarify naming.
Consider naming the variable more specifically, for example backup_name_for_validation, to reflect its usage.

manager/integration/tests/test_statefulset.py (1)

60-60: Rename attribute for clarity.
Instead of volumeName, consider a more descriptive attribute if this field is used for verifying the PV name, e.g. “associatedVolumeName”.

manager/integration/tests/test_metric.py (1)

768-769: Consider adding a comment explaining the sleep duration

The sleep duration has been increased from 60s to 90s. Consider adding a comment explaining why this increase was necessary to help future maintainers understand the timing requirements.

-    time.sleep(90)
+    # Wait 90 seconds for the recurring backup job to complete
+    # Previous value of 60 seconds was insufficient in some cases
+    time.sleep(90)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c5dfb2 and 3202cb2.

📒 Files selected for processing (9)
  • manager/integration/tests/backupstore.py (1 hunks)
  • manager/integration/tests/common.py (4 hunks)
  • manager/integration/tests/test_basic.py (28 hunks)
  • manager/integration/tests/test_csi.py (1 hunks)
  • manager/integration/tests/test_kubernetes.py (2 hunks)
  • manager/integration/tests/test_metric.py (4 hunks)
  • manager/integration/tests/test_recurring_job.py (10 hunks)
  • manager/integration/tests/test_statefulset.py (1 hunks)
  • manager/integration/tests/test_system_backup_restore.py (4 hunks)
🔇 Additional comments (35)
manager/integration/tests/test_basic.py (17)

Line range hint 12-12: Use of new helper function looks good.
Importing find_backup_volume centralizes the logic of retrieving the backup volume, improving consistency.


Line range hint 325-325: Frequent repeated calls to find_backup_volume.
If performance is a concern, try storing the result in a local variable unless there's a specific need to refresh it each time.


637-637: Check concurrency in backup volume usage.
If multiple backups are triggered simultaneously, confirm the new helper function handles concurrency gracefully without partial updates.


703-703: Ensure robust error handling for backup creation.
After create_backup, consider verifying backup status is InProgress, then transitions to Completed, to catch any intermediate failures.


709-709: Double-check indexing during data writes.
Make sure different data sets aren’t overwriting each other if offset or length remains constant. This helps avoid potential data collisions.


724-724: Consider edge cases for indefinite wait.
In loops waiting for backup state, add a timeout or maximum retry mechanism to avoid infinite loops if something goes wrong.
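One way to implement the suggestion above is a generic bounded-wait helper with a deadline — a hypothetical sketch, not code from this PR:

```python
import time


def wait_for(predicate, timeout=300, interval=2, desc="condition"):
    """Poll predicate() until it returns a truthy value or the deadline passes.

    Raises TimeoutError instead of looping forever, so a stuck backup fails
    the test with a clear message rather than hanging CI.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = predicate()
        if result:
            return result
        time.sleep(interval)
    raise TimeoutError(f"timed out after {timeout}s waiting for {desc}")
```

A call site waiting for a backup to complete would then pass a closure such as `lambda: get_backup_state(client, name) == "Completed"` with a descriptive `desc` for the failure message.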


741-741: Consider concurrency in backup volume removal.
If other processes are reading the backup volume, ensure they handle the volume object’s removal gracefully.


861-861: Validate logic for offset overlap.
When writing data at overlapping offsets, confirm that the final data matches the expected content after multiple writes.


1088-1088: Potential performance risk.
Frequent read/write calls in tight loops might degrade performance. Consider batching or exponential backoff where possible.


1159-1159: Avoid overshadowing variables.
In nested loops, be sure that backing up the same variable name does not overshadow variables declared in outer scopes.


1162-1162: Consider verifying backup states thoroughly.
The logic checks for the presence of backup(1), but might not confirm other attributes, such as CreatedTime or URL correctness.


1281-1281: Double-check concurrency issues in delete_backup.
If multiple tests run in parallel, ensure the backup reference is unique and not shared with another test.


1282-1282: Confirm leftover references after removal.
After the backups are deleted, confirm that references in data structures are also removed to avoid memory usage growth.


1443-1443: Validate cleanup logic.
After a restore operation gets canceled or fails, confirm that leftover volumes or stale references are cleared.


1739-1739: Test coverage for volume with no backups.
Line 1739 references volume lastBackup; ensure there’s coverage for volumes that never had backups.


1890-1890: Potential leftover references.
Deleting a backup volume may leave behind references if the code doesn’t handle ongoing backup tasks. Confirm that’s not an issue.


1891-1891: Edge case: leftover block files.
After removing the backup volume, confirm no block files remain that might mislead future backups or cause disk usage issues.

manager/integration/tests/test_system_backup_restore.py (3)

12-12: Use of find_backup_volume import.
This import aligns with the approach in other files, ensuring the backup volume retrieval logic is unified.


259-259: Double-check concurrency with find_backup_volume.
If the system backup interacts with multiple volumes, ensure calls to find_backup_volume do not overshadow each other.


325-325: Validate consistent usage of backup volume across steps.
Ensure that the same backup volume object is passed to subsequent calls, preventing mismatch if multiple backups exist.

manager/integration/tests/backupstore.py (1)

151-152: Early-return optimization.
Skipping an unnecessary update is efficient. Ensure that no side effects are missed when the backup target remains unchanged.

manager/integration/tests/test_metric.py (3)

43-43: LGTM: Import aligns with PR objective

The addition of find_backup_volume import aligns with the PR's objective to fix test cases for multiple backup stores.



721-721: LGTM: Improved backup volume handling

The change to unpack backup volume from create_backup improves clarity and consistency in backup handling.


750-750: LGTM: Consistent backup deletion

Using delete_backup_volume directly with the backup volume object improves code consistency.

manager/integration/tests/test_kubernetes.py (3)

588-588: LGTM: Improved backup volume retrieval

Using volbv to store the backup volume improves code clarity and maintainability.


Line range hint 594-599: LGTM: Added retry logic for backup volume labels

The addition of retry logic for backup volume labels handles potential race conditions when the backup volume or its labels are not immediately available.


632-632: LGTM: Consistent backup deletion

Updated delete_backup call to pass the backup volume object directly, maintaining consistency with other changes in the PR.

manager/integration/tests/test_csi.py (1)

249-249: LGTM: Consistent backup deletion

Updated delete_backup call to pass the backup volume object directly, maintaining consistency with the backup volume handling changes across the test suite.

manager/integration/tests/test_recurring_job.py (5)

Line range hint 515-527: Improve backup volume handling with retry mechanism

The code now uses the enhanced find_backup_volume() function with retry capability and proper error handling for backup volume operations.


Line range hint 588-603: Enhance backup volume status verification

The code improves the verification of Kubernetes status labels on backup volumes with better error handling and assertions.


785-788: Add retry mechanism for backup volume count verification

The code adds proper retry logic when waiting for backup volume count, improving test reliability.


2001-2013: Improve backup volume verification with retry mechanism

The code enhances backup volume verification by using the retry-capable find_backup_volume() function and proper error handling.


1074-1077: Enhance backup volume handling in multiple volume tests

The code improves backup volume operations in multiple volume test scenarios with better error handling and retry mechanisms.

Also applies to: 1591-1591, 1602-1603

manager/integration/tests/common.py (2)

3221-3227: Add retry-capable backup volume lookup function

The new find_backup_volume() function provides a robust way to find backup volumes with retry capability, improving test reliability.


3956-3966: Enhance backup volume wait function

The modified wait_for_backup_volume() function improves error handling and adds proper assertions for backup volume state verification.

assert backupstore_count_backup_block_files(client,
core_api,
volume_name) == 2

-delete_backup(client, volume_name, backup2.name)
+delete_backup(client, backup_volume, backup2.name)

⚠️ Potential issue

Check for partial deletions.
When deleting backups, ensure partial or incomplete states are cleaned up. This can prevent leftover references.


-delete_backup_volume(client, volume_name)
+delete_backup_volume(client, bv.name)

⚠️ Potential issue

Potential race condition.
Deleting a backup volume while a backup is in progress might cause a race condition if not handled. Use robust checks.

Comment on lines +1883 to +1887
bv3, b3 = find_backup(client, volume3_name, snap3.name)
delete_backup(client, bv3, b3.name)
bv4, b4 = find_backup(client, volume3_name, snap4.name)
delete_backup(client, bv3, b4.name)
bv5, b5 = find_backup(client, volume3_name, snap5.name)

🛠️ Refactor suggestion

Bulk deletion of multiple backups.
Deleting multiple backups in a quick succession can be error-prone. Possibly handle them in a single loop with consistent checks.
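The snippet above deletes b4 via bv3, which is easy to get wrong when deletions are written out one by one. A loop that re-resolves the backup volume on every iteration avoids that class of mistake — a hypothetical sketch with a fake client, not code from this PR:

```python
class FakeClient:
    """Minimal stand-in for the Longhorn test client (illustration only)."""
    def __init__(self, backups):
        # backups: {snapshot_name: (backup_volume, backup_name)}
        self.backups = dict(backups)
        self.deleted = []

    def find_backup(self, volume_name, snap_name):
        return self.backups.get(snap_name)

    def delete_backup(self, backup_volume, backup_name):
        self.deleted.append((backup_volume, backup_name))


def delete_backups(client, volume_name, snapshot_names):
    """Delete one backup per snapshot, re-resolving the backup volume on
    every iteration so a stale bv object from a previous step is never
    reused."""
    for snap_name in snapshot_names:
        found = client.find_backup(volume_name, snap_name)
        assert found is not None, f"no backup for snapshot {snap_name}"
        bv, backup_name = found
        client.delete_backup(bv, backup_name)
```

With this shape, adding a fifth or sixth snapshot to delete is one list entry rather than two more lines that can drift out of sync.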

@roger-ryao roger-ryao requested a review from a team December 18, 2024 07:23
@@ -148,6 +148,8 @@ def set_backupstore_azurite(client):

def set_backupstore_url(client, url):
backup_target_setting = client.by_id_setting(SETTING_BACKUP_TARGET)
if backup_target_setting.value == url:
Member

Why can't the same backup target be set twice?

Contributor Author

I will remove this at next update.

Member

The REST API itself rejects the request if the backup target is the same, but why?

(ApiError(...), "500 : failed to set the setting backup-target with invalid value s3://backupbucket@us-east-1/backupstore: url s3://backupbucket@us-east-1/backupstore is the same to backup target default\n{'code': 500, 
'detail': '', 'message': 'failed to set the setting backup-target with invalid value s3://backupbucket@us-east-1/backupstore: url s3://backupbucket@us-east-1/backupstore is the same to backup target default', 'status': 500}")

Contributor Author

It's a limitation from the multiple backup stores PR (it's meaningless to update a backup target with the same URL), but I would like to allow it in PR longhorn/longhorn-manager#3360

@yangchiu
Member

When I tried to list backups from a backup volume by calling the REST API like:

$ curl -X POST http://44.208.160.164:30000/v1/backupvolumes/test-1-9b3f1a0e?action=backupList

The backups from other volumes were also listed:

{
	"data": [{
		"actions": {},
		"backupMode": "incremental",
		"backupTargetName": "default",
		"compressionMethod": "lz4",
		"created": "2024-12-18T07:48:25Z",
		"error": "",
		"id": "backup-28eb283fc8394e17",
		"labels": {
			"longhorn.io/volume-access-mode": "rwo"
		},
		"links": {
			"self": "http://10.42.4.66:9500/v1/backups/backup-28eb283fc8394e17"
		},
		"messages": null,
		"name": "backup-28eb283fc8394e17",
		"newlyUploadDataSize": "0",
		"progress": 100,
		"reUploadedDataSize": "0",
		"size": "0",
		"snapshotCreated": "2024-12-18T07:48:23Z",
		"snapshotName": "8656077f-82d9-4f28-b7b0-9c84539df7bb",
		"state": "Completed",
		"type": "backup",
		"url": "s3://backupbucket@us-east-1/backupstore?backup=backup-28eb283fc8394e17&volume=test-1",
		"volumeBackingImageName": "",
		"volumeCreated": "2024-12-18T07:48:23Z",
		"volumeName": "test-1",
		"volumeSize": "21474836480"
	}, {
		"actions": {},
		"backupMode": "incremental",
		"backupTargetName": "default",
		"compressionMethod": "lz4",
		"created": "2024-12-18T07:54:26Z",
		"error": "",
		"id": "backup-e0fd1581dd3e4aba",
		"labels": {
			"longhorn.io/volume-access-mode": "rwo"
		},
		"links": {
			"self": "http://10.42.4.66:9500/v1/backups/backup-e0fd1581dd3e4aba"
		},
		"messages": null,
		"name": "backup-e0fd1581dd3e4aba",
		"newlyUploadDataSize": "0",
		"progress": 100,
		"reUploadedDataSize": "0",
		"size": "0",
		"snapshotCreated": "2024-12-18T07:54:23Z",
		"snapshotName": "296b20f0-3f6c-4a68-ac5d-39e31fcb2349",
		"state": "Completed",
		"type": "backup",
		"url": "s3://backupbucket@us-east-1/backupstore?backup=backup-e0fd1581dd3e4aba&volume=test-2",
		"volumeBackingImageName": "",
		"volumeCreated": "2024-12-18T07:54:23Z",
		"volumeName": "test-2",
		"volumeSize": "21474836480"
	}],
	"links": {
		"self": "http://10.42.4.66:9500/v1/backupvolumes/test-1-9b3f1a0e"
	},
	"resourceType": "backup",
	"type": "collection"
}

Is this expected behavior?

@mantissahz
Contributor Author

When I tried to list backups from a backup volume by calling the REST API like:

$ curl -X POST http://44.208.160.164:30000/v1/backupvolumes/test-1-9b3f1a0e?action=backupList

The backups from other volumes were also listed:
...
Is this expected behavior?

It will be fixed after longhorn/longhorn-manager#3360.
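Until that fix lands, a test that needs per-volume results could filter the over-broad backupList response client-side by the volumeName field shown in the payload above — a hypothetical stopgap, not part of this PR:

```python
def backups_for_volume(backup_list, volume_name):
    """Keep only the entries belonging to volume_name, since backupList
    currently returns backups from every volume on the target."""
    return [b for b in backup_list if b.get("volumeName") == volume_name]
```

Applied to the response above, this would keep only the backup-28eb283fc8394e17 entry when asking for test-1.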

@chriscchien
Contributor

chriscchien commented Dec 18, 2024

Ran the backup test cases with the PR code (-k test_backup): 2 failures, but no errors related to bv, b = find_backup(client, volname, snap.name). No concern from me, thank you.

@mantissahz
Contributor Author

mantissahz commented Dec 18, 2024

Ran the backup test cases with the PR code (-k test_backup): 2 failures, but no errors related to bv, b = find_backup(client, volname, snap.name). No concern from me, thank you.

tests.test_basic.test_backup_metadata_deletion[s3] is related to the issue @yangchiu mentioned above, and the test case backup_target_invalid passed with the longhorn-manager PR: https://ci.longhorn.io/job/private/job/longhorn-tests-regression/7801/console

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (12)
manager/integration/tests/backupstore.py (4)

36-36: Clarify semantic meaning of default backup target.
The constant DEFAULT_BACKUPTARGET = "default" may mix two concepts: a default backup target reference vs. the string "default" implying system defaults. If there's more than one default, consider clarifying the name.


37-38: Improve maintainability of error message.
object_has_been_modified is a custom message matching a K8s concurrency error. Consider referencing or linking to the underlying K8s error type to make it easier for future contributors to find the root cause.


164-175: Confirm atomicity of backup target updates.
The code handles ResourceVersion conflicts by catching object_has_been_modified. If the setting is updated frequently, ensure that no other concurrently updated settings are lost in each retry.


182-193: Consistent concurrency handling.
Similar concurrency retry block as above. Extract into a helper function to reduce duplication and ensure consistent error handling between backup_target_credential_secret and backupstore_url updates.

manager/integration/tests/test_metric.py (2)

721-721: Verify data consistency before backup creation.
When calling create_backup, confirm that the data written to the volume is fully synced or flushed for consistent backups. Just a best practice reminder.


750-750: Consider re-check for volume presence before deletion.
delete_backup_volume might raise an error if the backup volume was already removed. A quick check in tests can improve clarity of test logs.

manager/integration/tests/test_recurring_job.py (4)

1075-1075: Check the logic for waiting on backup completion.
wait_for_backup_count looks good. Evaluate the chance of partial backups or concurrency issues when test-job-1 runs multiple backups in close intervals.


1078-1078: Avoid silent failures with retry=60.
The explicit retry of 60 is quite large. Consider adding an error or debug message each time the code is waiting, to ease troubleshooting.


1592-1592: Check for concurrency & slow environment.
Same remark: if the environment lags, 30 short tries might not be enough to discover the backup volume. Keep this in mind for CI reliability.


2002-2005: Restructure repeated call pattern.
The calls to wait_for_backup_count with find_backup_volume are repeated. Possibly unify them into a single helper function with optional arguments for the volume & expected counts.

manager/integration/tests/common.py (1)

3956-3966: Prevent infinite loop & clarify assertion.
wait_for_backup_volume loops for up to RETRY_BACKUP_COUNTS. If the volume is never found, it only asserts at the end. Consider extra logs for partial failures or repeated attempts for debugging.

manager/integration/tests/test_basic.py (1)

3241-3247: Consider adding timeout handling for backup restoration

While the backup deletion during restoration is handled correctly, consider adding timeout handling for the restoration wait to prevent potential hangs.

    bv, b = find_backup(client, std_volume_name, snap1.name)
+   try:
        client.create_volume(name=restore_volume_name, fromBackup=b.url,
                          numberOfReplicas=3,
                          dataEngine=DATA_ENGINE)
        wait_for_volume_restoration_start(client, restore_volume_name, b.name)
+   except Exception as e:
+       logging.error("Volume restoration failed: %s", str(e))
+       raise
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1263a63 and eaa1b4d.

📒 Files selected for processing (9)
  • manager/integration/tests/backupstore.py (4 hunks)
  • manager/integration/tests/common.py (4 hunks)
  • manager/integration/tests/test_basic.py (28 hunks)
  • manager/integration/tests/test_csi.py (1 hunks)
  • manager/integration/tests/test_kubernetes.py (2 hunks)
  • manager/integration/tests/test_metric.py (4 hunks)
  • manager/integration/tests/test_recurring_job.py (11 hunks)
  • manager/integration/tests/test_statefulset.py (1 hunks)
  • manager/integration/tests/test_system_backup_restore.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • manager/integration/tests/test_system_backup_restore.py
  • manager/integration/tests/test_csi.py
  • manager/integration/tests/test_statefulset.py
  • manager/integration/tests/test_kubernetes.py
🔇 Additional comments (22)
manager/integration/tests/backupstore.py (2)

17-18: Ensure constants are well-defined.
Importing NODE_UPDATE_RETRY_COUNT and NODE_UPDATE_RETRY_INTERVAL from "common" helps unify the global retry logic. Confirm that these constants match their use case to avoid unintended infinite loops or too-long sleeps.


99-106: Recheck repeated logic in loop.
The for-loop repeatedly checks backupTargetURL, credentialSecret, and pollInterval. If this condition is tied to concurrency logic, ensure it’s not missing error handling or a fallback mechanism in case the settings do not reset to empty.

manager/integration/tests/test_metric.py (2)

43-43: Centralize function usage.
Importing find_backup_volume for test_metric indicates cross-file usage. Ensure that usage is consistent with any existing find_backup method calls in other test modules to avoid duplication.


768-769: Provide a rationale for the 90-second wait.
time.sleep(90) is fairly long. If it exists to make test timing consistent, explain why in a code comment, or replace the fixed sleep with a wait driven by the poll interval.

manager/integration/tests/test_recurring_job.py (4)

56-56: Consolidate usage of backup utilities.
find_backup_volume is imported in test_recurring_job. This function is repeated in many tests. Confirm it’s the canonical approach to retrieving a backup volume.


79-79: Use RETRY_COUNTS_SHORT carefully.
Ensure RETRY_COUNTS_SHORT is not too small to cause flaky test failures, especially under heavier load or slower storage environments.


1580-1580: Review usage of "retry=RETRY_COUNTS_SHORT".
Similar to line 79 mention, double-check if short retries risk missing the backup volume in slower test setups.


1603-1604: Distinct line usage for waiting backup counts.
If the logic is the same, consider combining these into a single wait_for_backup_count call, provided the test scenario stays straightforward.

manager/integration/tests/common.py (3)

519-521: Adapt to standard signature for backup deletions.
Switching delete_backup to accept bv and backup_name clarifies usage. Consider adding a code-level docstring that clarifies the difference between a volume name and a backup volume object.


524-525: Confirm backward compatibility.
Renaming the parameter from volume_name to backup_volume_name is good but ensure no old references remain (e.g., "delete_backup_volume(volume_name, ...)" calls).


527-527: Call wait_for_backup_volume_delete.
This ensures a synchronous test flow. Validate that the code handles any unexpected concurrency errors gracefully if the backup volume was removed in prior steps.

manager/integration/tests/test_basic.py (11)

Line range hint 628-637: LGTM: Backup deletion and status verification looks correct

The backup deletion and volume status verification logic is properly implemented with appropriate wait conditions.


697-709: LGTM: Backup creation sequence is well structured

The backup creation sequence with proper data writing and verification is implemented correctly.


719-724: LGTM: Backup volume validation and cleanup

Good validation of backup volume data and proper cleanup with in-progress backup handling.


731-741: LGTM: Backup deletion and block count verification

The backup deletion and block count verification is implemented correctly with proper wait conditions.


Line range hint 1281-1311: LGTM: Backup metadata deletion sequence

The backup metadata deletion sequence is well structured with proper verification of backup states and cleanup.

🧰 Tools
🪛 Ruff (0.8.2)

1284-1284: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


Line range hint 1349-1359: LGTM: Additional backup cleanup logic

Good implementation of backup cleanup with proper block count verification.

🧰 Tools
🪛 Ruff (0.8.2)

1350-1350: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


Line range hint 1443-1449: LGTM: Simple backup deletion in backupstore test

Clean implementation of backup deletion with status verification.


Line range hint 3314-3324: LGTM: Backup deletion during backup creation

Good implementation of backup deletion during creation with proper verification of backup states.

Also applies to: 3331-3335


3458-3466: LGTM: Backup deletion during restore with verification

The backup deletion during restore and subsequent verification is implemented correctly.

Also applies to: 3478-3482


4662-4662: LGTM: Simple backup deletion

Clean implementation of single backup deletion.


5422-5424: LGTM: Backup deletion during volume restore

Good implementation of backup deletion during volume restore with proper status verification.

Comment on lines +200 to +210
for _ in range(NODE_UPDATE_RETRY_COUNT):
try:
backup_target_poll_interal_setting = client.update(
backup_store_poll_interval_setting, value=poll_interval)
except Exception as e:
if object_has_been_modified in str(e.error.message):
time.sleep(NODE_UPDATE_RETRY_INTERVAL)
continue
print(e)
raise
else:
break

🛠️ Refactor suggestion

Check poll interval usage & clarity.
set_backupstore_poll_interval is also wrapped in concurrency logic. If the code attempts repeated updates triggered by “object_has_been_modified” errors, consider adding logs to ensure debugging clarity in production.

Comment on lines +2011 to +2024
wait_for_backup_count(
find_backup_volume(client, volume1_name, retry=RETRY_COUNTS_SHORT), 2)
wait_for_backup_count(
find_backup_volume(client, volume2_name, retry=RETRY_COUNTS_SHORT), 2)

🛠️ Refactor suggestion

Check consistency of the final backup checks.
The second round verifies 2 backups. Ensure the final validations match the test objectives so that in-progress backups are not counted.

Comment on lines +3221 to +3227
def find_backup_volume(client, volume_name, retry=1):
for _ in range(retry):
bvs = client.list_backupVolume()
for bv in bvs:
if bv.volumeName == volume_name and bv.created != "":
return bv
time.sleep(RETRY_BACKUP_INTERVAL)

⚠️ Potential issue

Consider returning partial result or raising error.
When find_backup_volume cannot find the volume, it returns None after retries. Consider logging a warning or raising a custom exception to help test debugging.

+raise Exception(f"No backup volume found for volume {volume_name} after {retry} retries")

Committable suggestion skipped: line range outside the PR's diff.
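A self-contained version of that idea, using a stand-in client so the control flow can be exercised without a Longhorn cluster (`FakeClient` and the dict-based backup volumes are illustrative; the real client returns objects with attribute access):

```python
import time

RETRY_BACKUP_INTERVAL = 0  # the real suite uses a longer interval

def find_backup_volume(client, volume_name, retry=1):
    """Return the matching backup volume, raising instead of
    silently returning None so test failures point at the cause."""
    for _ in range(retry):
        for bv in client.list_backupVolume():
            if bv["volumeName"] == volume_name and bv["created"] != "":
                return bv
        time.sleep(RETRY_BACKUP_INTERVAL)
    raise LookupError(
        f"no backup volume found for {volume_name} after {retry} retries")

class FakeClient:
    def list_backupVolume(self):
        return [{"volumeName": "vol-1", "created": "2024-12-18"}]

bv = find_backup_volume(FakeClient(), "vol-1")
assert bv["volumeName"] == "vol-1"
```

A missing volume now surfaces as a `LookupError` with the retry count in the message, instead of a later `AttributeError` on `None`.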

@chriscchien chriscchien self-requested a review December 18, 2024 09:15
@mantissahz mantissahz force-pushed the issue5411v2 branch 2 times, most recently from 63b4f29 to 0284c79 Compare December 18, 2024 09:39

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (2)
manager/integration/tests/test_basic.py (2)

Line range hint 1847-1891: Improve backup deletion validation

The backup deletion sequence has been improved but needs better validation.

Add validation checks and error handling for backup deletion operations:

     bv1, b1 = find_backup(client, volume1_name, snap1.name)
+    assert b1 is not None, "Backup b1 not found"
     delete_backup(client, bv1, b1.name)

     bv2, b2 = find_backup(client, volume2_name, snap2.name)
+    assert b2 is not None, "Backup b2 not found"
     delete_backup(client, bv2, b2.name)

     delete_backup(client, bv3, b3.name)
+    # Verify backup was deleted
+    assert wait_for_backup_delete(client, volume_name, b3.name)

     delete_backup_volume(client, bv3.name)
+    # Verify backup volume was deleted
+    assert wait_for_backup_volume_delete(client, bv3.name)

Line range hint 3314-3324: Enhance backup deletion error handling

The backup deletion error handling could be improved.

Add error handling and validation for backup deletion:

+    # Validate backup exists before deletion
+    assert b1 is not None, "Backup b1 not found"
     backup_volume = client.by_id_backupVolume(bv.name)
+    try:
         backup_volume.backupDelete(name=b1.name)
+    except Exception as e:
+        logging.error(f"Failed to delete backup {b1.name}: {str(e)}")
+        raise

Also applies to: 3331-3335

🧹 Nitpick comments (6)
manager/integration/tests/test_metric.py (1)

768-769: Consider adding a comment explaining the sleep duration

The sleep duration was increased from 60s to 90s. Consider adding a comment explaining why this longer duration is necessary for the recurring backup job to complete.

+    # Wait 90 seconds to ensure the recurring backup job completes
     time.sleep(90)
manager/integration/tests/test_basic.py (4)

Line range hint 628-637: Improve backup deletion error handling

The backup deletion code has been simplified but could benefit from additional error handling for edge cases.

Consider adding error handling and retries for backup deletion failures:

-    delete_backup(client, bv, b.name)
+    try:
+        delete_backup(client, bv, b.name)
+    except Exception as e:
+        logging.error(f"Failed to delete backup {b.name}: {str(e)}")
+        raise

697-709: Improve backup creation and deletion flow

The backup creation and deletion sequence has been improved but could be more robust.

Consider adding validation checks before backup operations:

     _, backup0, _, _ = create_backup(client, volume_name, data0)
+    # Validate backup exists before deletion
+    assert backup0 is not None, "Backup0 not found"

     _, backup1, _, _ = create_backup(client, volume_name, data1)
+    # Validate backup exists before deletion
+    assert backup1 is not None, "Backup1 not found"

     backup_volume, backup2, _, _ = create_backup(client, volume_name, data2)
+    # Validate backup exists before deletion
+    assert backup2 is not None, "Backup2 not found"

3241-3247: Improve backup volume retrieval and deletion

The backup volume retrieval and deletion flow could be more robust.

Add validation for backup volume existence:

     bv, b = find_backup(client, std_volume_name, snap1.name)
+    # Validate backup volume exists before operations
+    assert bv is not None, "Backup volume not found"

5422-5422: Add validation for backup deletion

The backup deletion could use additional validation.

Add validation check before backup deletion:

+    assert b is not None, "Backup not found"
     delete_backup(client, bv, b.name)
manager/integration/tests/common.py (1)

3221-3227: Consider adding error logging for debugging.

While the retry mechanism is good, consider adding debug logging when retries occur to help troubleshoot backup volume lookup issues.

 def find_backup_volume(client, volume_name, retry=1):
     for _ in range(retry):
         bvs = client.list_backupVolume()
         for bv in bvs:
             if bv.volumeName == volume_name and bv.created != "":
                 return bv
+        if retry > 1:
+            print(f"Retrying backup volume lookup for {volume_name}")
         time.sleep(RETRY_BACKUP_INTERVAL)
     return None
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eaa1b4d and 0284c79.

📒 Files selected for processing (9)
  • manager/integration/tests/backupstore.py (4 hunks)
  • manager/integration/tests/common.py (4 hunks)
  • manager/integration/tests/test_basic.py (28 hunks)
  • manager/integration/tests/test_csi.py (1 hunks)
  • manager/integration/tests/test_kubernetes.py (2 hunks)
  • manager/integration/tests/test_metric.py (4 hunks)
  • manager/integration/tests/test_recurring_job.py (11 hunks)
  • manager/integration/tests/test_statefulset.py (1 hunks)
  • manager/integration/tests/test_system_backup_restore.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • manager/integration/tests/test_system_backup_restore.py
  • manager/integration/tests/test_statefulset.py
  • manager/integration/tests/test_csi.py
  • manager/integration/tests/test_kubernetes.py
  • manager/integration/tests/test_recurring_job.py
🔇 Additional comments (8)
manager/integration/tests/test_metric.py (1)

721-721: LGTM: Improved backup deletion with direct backup volume reference

The code now uses the backup volume object directly instead of just the name, making the deletion more robust.

manager/integration/tests/backupstore.py (5)

36-38: LGTM: Clear and descriptive constant definitions.

The new constants improve code readability and maintainability by centralizing common values.


99-105: LGTM: Robust retry mechanism for resetting backup store settings.

The retry mechanism ensures reliable backup store settings reset by:

  1. Checking backup target URL, credential secret and poll interval
  2. Using configurable retry count and interval

163-174: LGTM: Proper error handling for backup store URL updates.

The code properly handles concurrent modification errors by:

  1. Retrying on "object has been modified" errors
  2. Using configurable retry count and interval
  3. Preserving other error types for debugging

181-192: LGTM: Consistent error handling for credential updates.

The error handling matches the pattern used for URL updates, maintaining consistency across the codebase.


199-210: LGTM: Consistent error handling for poll interval updates.

The error handling follows the same robust pattern used for other backup store settings.

manager/integration/tests/common.py (2)

519-521: LGTM: Simplified backup deletion API usage.

The code now accepts the backup volume object directly, eliminating redundant volume name lookup.


524-527: LGTM: Simplified backup volume deletion API usage.

The code now accepts the backup volume name directly, improving clarity and reducing indirection.

Comment on lines +719 to +741
if bv['name'] == backup_volume.name:
assert bv['dataStored'] == \
str(backup_blocks_count * BACKUP_BLOCK_SIZE)

backupstore_create_dummy_in_progress_backup(client, core_api, volume_name)
delete_backup(client, volume_name, backup1.name)
delete_backup(client, backup_volume, backup1.name)
assert backupstore_count_backup_block_files(client,
core_api,
volume_name) == 4

backupstore_delete_dummy_in_progress_backup(client, core_api, volume_name)

delete_backup(client, volume_name, backup0.name)
delete_backup(client, backup_volume, backup0.name)
assert backupstore_count_backup_block_files(client,
core_api,
volume_name) == 2

delete_backup(client, volume_name, backup2.name)
delete_backup(client, backup_volume, backup2.name)
assert backupstore_count_backup_block_files(client,
core_api,
volume_name) == 0

delete_backup_volume(client, volume_name)
delete_backup_volume(client, backup_volume.name)

🛠️ Refactor suggestion

Enhance backup volume deletion flow

The backup volume deletion logic has been improved but could use additional validation.

Add validation for backup volume existence and status before deletion:

+    # Validate backup volume exists
+    assert backup_volume is not None, "Backup volume not found"
+    
     if bv['name'] == backup_volume.name:
         assert bv['dataStored'] == \
             str(backup_blocks_count * BACKUP_BLOCK_SIZE)

     delete_backup(client, backup_volume, backup1.name)
+    # Verify backup was deleted
+    assert wait_for_backup_delete(client, volume_name, backup1.name)

     delete_backup(client, backup_volume, backup0.name)
+    # Verify backup was deleted
+    assert wait_for_backup_delete(client, volume_name, backup0.name)

     delete_backup(client, backup_volume, backup2.name)
+    # Verify backup was deleted
+    assert wait_for_backup_delete(client, volume_name, backup2.name)

     delete_backup_volume(client, backup_volume.name)

Committable suggestion skipped: line range outside the PR's diff.

@mantissahz mantissahz requested a review from yangchiu December 18, 2024 11:03
@yangchiu
Member

Does this PR include all the test cases that need to be refined?

@mantissahz mantissahz force-pushed the issue5411v2 branch 2 times, most recently from 3b083d0 to 8599ce7 Compare December 18, 2024 14:12

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
manager/integration/tests/test_csi_snapshotter.py (1)

1368-1368: Validate usage of new “storage_class” parameter
It seems you’ve introduced a new "storage_class" parameter to this test function’s signature. The parameter’s usage in subsequent restore logic appears valid. However, ensure that you consistently handle or test different storage classes (e.g., beyond the default) to confirm that no additional modifications are needed elsewhere.

manager/integration/tests/backupstore.py (2)

99-105: LGTM! Robust retry mechanism for backup store reset.

The retry logic properly verifies that the backup target settings are completely reset before proceeding. Good use of constants for retry count and interval.

Consider adding a debug log when retrying to help with troubleshooting:

+        print(f"Waiting for backup target reset, attempt {_+1}")
         time.sleep(NODE_UPDATE_RETRY_INTERVAL)

163-174: LGTM! Good error handling and retry logic.

The retry mechanism properly handles object modification conflicts and validates the URL was set correctly.

Consider extracting the error message check into a helper function since it's used in multiple places:

def is_object_modified_error(error):
    return object_has_been_modified in str(error.error.message)

This would improve code reuse and make the error handling more maintainable.

manager/integration/tests/test_basic.py (1)

741-741: LGTM! Consistent backup volume deletion handling.

The changes maintain a consistent pattern for backup volume deletion and proper cleanup.

Consider adding a helper function to encapsulate the common pattern of deleting a backup and waiting for completion:

def delete_backup_and_wait(client, backup_volume, backup_name, volume_name):
    """Delete a backup and wait for it to be removed."""
    delete_backup(client, backup_volume, backup_name)
    wait_for_backup_delete(client, volume_name, backup_name)
    volume = wait_for_volume_status(client, volume_name, "lastBackup", "")
    assert volume.lastBackupAt == ""

This would reduce code duplication and make the cleanup flow more maintainable.

Also applies to: 1094-1094, 1311-1349, 1890-1891

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b083d0 and 8599ce7.

📒 Files selected for processing (10)
  • manager/integration/tests/backupstore.py (4 hunks)
  • manager/integration/tests/common.py (4 hunks)
  • manager/integration/tests/test_basic.py (28 hunks)
  • manager/integration/tests/test_csi.py (1 hunks)
  • manager/integration/tests/test_csi_snapshotter.py (2 hunks)
  • manager/integration/tests/test_kubernetes.py (2 hunks)
  • manager/integration/tests/test_metric.py (4 hunks)
  • manager/integration/tests/test_recurring_job.py (11 hunks)
  • manager/integration/tests/test_statefulset.py (1 hunks)
  • manager/integration/tests/test_system_backup_restore.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • manager/integration/tests/test_statefulset.py
  • manager/integration/tests/test_system_backup_restore.py
  • manager/integration/tests/test_kubernetes.py
  • manager/integration/tests/test_csi.py
  • manager/integration/tests/test_recurring_job.py
  • manager/integration/tests/test_metric.py
  • manager/integration/tests/common.py
🔇 Additional comments (3)
manager/integration/tests/test_csi_snapshotter.py (1)

1492-1492: Potential race condition with “wait_for_restore=False”
Disabling the wait for volume restoration may lead to a scenario where the test proceeds before the volume is fully restored, potentially causing unreliable test outcomes. Consider confirming that the volume is ready before the next steps to avoid intermittent failures.

manager/integration/tests/backupstore.py (1)

36-38: LGTM! Well-defined constants.

The new constants are well-named and follow Python naming conventions. The error message string is properly formatted for readability.

manager/integration/tests/test_basic.py (1)

628-628: LGTM! Improved type safety in delete_backup() calls.

The changes consistently pass backup volume objects instead of names to delete_backup(), which:

  1. Improves type safety
  2. Reduces potential errors from incorrect string names
  3. Makes the code more maintainable

Also applies to: 637-637, 1281-1282, 1311-1311, 1324-1324, 1349-1349, 1443-1443, 4662-4662

Comment on lines +181 to +192
for _ in range(NODE_UPDATE_RETRY_COUNT):
try:
backup_target_credential_setting = client.update(
backup_target_credential_setting, value=credential_secret)
except Exception as e:
if object_has_been_modified in str(e.error.message):
time.sleep(NODE_UPDATE_RETRY_INTERVAL)
continue
print(e)
raise
else:
break

🛠️ Refactor suggestion

Consider extracting common retry logic.

The retry mechanism is duplicated across set_backupstore_url(), set_backupstore_credential_secret() and set_backupstore_poll_interval().

Consider extracting the retry logic into a decorator or helper function:

def retry_on_conflict(func):
    def wrapper(*args, **kwargs):
        for _ in range(NODE_UPDATE_RETRY_COUNT):
            try:
                return func(*args, **kwargs)
            except Exception as e:
                if object_has_been_modified in str(e.error.message):
                    time.sleep(NODE_UPDATE_RETRY_INTERVAL)
                    continue
                print(e)
                raise
    return wrapper

@retry_on_conflict
def set_backupstore_url(client, url):
    backup_target_setting = client.by_id_setting(SETTING_BACKUP_TARGET)
    backup_target_setting = client.update(backup_target_setting, value=url)
    assert backup_target_setting.value == url

This would:

  1. Eliminate code duplication
  2. Make the retry logic more maintainable
  3. Make it easier to add retry to other functions

Also applies to: 199-210
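The decorator can be exercised without a live cluster; this sketch fakes the conflict error twice before succeeding (`ConflictError` and the `e.error.message` shape are assumptions standing in for the real client's error type):

```python
import time

NODE_UPDATE_RETRY_COUNT = 5
NODE_UPDATE_RETRY_INTERVAL = 0  # zero for the demo; the suite waits longer
object_has_been_modified = "the object has been modified"

class ConflictError(Exception):
    """Mimics the client error: exposes e.error.message."""
    def __init__(self, message):
        super().__init__(message)
        self.error = type("E", (), {"message": message})()

def retry_on_conflict(func):
    def wrapper(*args, **kwargs):
        # Note: if all retries hit conflicts, this returns None silently,
        # matching the original suggestion's behavior.
        for _ in range(NODE_UPDATE_RETRY_COUNT):
            try:
                return func(*args, **kwargs)
            except Exception as e:
                if object_has_been_modified in str(e.error.message):
                    time.sleep(NODE_UPDATE_RETRY_INTERVAL)
                    continue
                raise
    return wrapper

attempts = {"n": 0}

@retry_on_conflict
def flaky_update():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConflictError(object_has_been_modified)
    return "updated"

assert flaky_update() == "updated"
```

One design note: the wrapper assumes every exception carries `.error.message`; the real helpers should probably guard that with `getattr` before matching.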

@mantissahz
Contributor Author

mantissahz commented Dec 18, 2024

Does this PR include all the test cases that need to be refined?

It should be, but I might lose some. I have some failed test cases that were not re-run.
I will re-run them tonight.

"test_basic.py::test_dr_volume_activated_with_failed_replica",
"test_basic.py::test_allow_volume_creation_with_degraded_availability_dr",
"test_engine_upgrade.py::test_auto_upgrade_engine_to_default_version_dr_volume",
"test_ha.py::test_engine_crash_for_dr_volume",
"test_ha.py::test_engine_image_miss_scheduled_perform_volume_operations",
"test_ha.py::test_engine_image_not_fully_deployed_perform_volume_operations",
"test_ha.py::test_engine_image_not_fully_deployed_perform_dr_restoring_expanding_volume",

Update:

============================= test session starts ==============================
platform linux -- Python 3.11.10, pytest-6.2.4, py-1.11.0, pluggy-0.13.1 -- /usr/bin/python3.11
cachedir: .pytest_cache
rootdir: /integration, configfile: pytest.ini
plugins: order-1.0.1, repeat-0.9.1
collecting ... collected 7 items

test_basic.py::test_dr_volume_activated_with_failed_replica[s3] PASSED   [ 14%]
test_basic.py::test_allow_volume_creation_with_degraded_availability_dr[s3] PASSED [ 28%]
test_engine_upgrade.py::test_auto_upgrade_engine_to_default_version_dr_volume[s3] PASSED [ 42%]
test_ha.py::test_engine_crash_for_dr_volume[s3] PASSED                   [ 57%]
test_ha.py::test_engine_image_miss_scheduled_perform_volume_operations[s3] PASSED [ 71%]
test_ha.py::test_engine_image_not_fully_deployed_perform_volume_operations[s3] PASSED [ 85%]
test_ha.py::test_engine_image_not_fully_deployed_perform_dr_restoring_expanding_volume[s3] PASSED [100%]

================== 7 passed, 8 warnings in 1586.42s (0:26:26) ==================

Looks fine in my local cluster.

@yangchiu
Member

Does this PR include all the test cases that need to be refined?

It should be, but I might lose some. I have some failed test cases that were not re-run. I will re-run them tonight.

I'm not sure if these failures have been fixed: https://ci.longhorn.io/job/private/job/longhorn-tests-regression/7817/testReport/.

Could you please help confirm?

@mantissahz
Contributor Author

I'm not sure if these failures have been fixed: https://ci.longhorn.io/job/private/job/longhorn-tests-regression/7817/testReport/.
Could you please help confirm?

tests.test_recurring_job.test_recurring_job_groups
tests.test_recurring_job.test_recurring_job_multiple_volume

Yes, I think these are the only two left, and I'm still looking into them.

ref: longhorn/longhorn 5411

Signed-off-by: James Lu <[email protected]>
@derekbit
Member

@mantissahz build / flake8 check (pull_request) failed

@derekbit
Member

@coderabbitai pause


coderabbitai bot commented Dec 19, 2024

✅ Actions performed

Reviews paused.

Member

@yangchiu yangchiu left a comment


LGTM

@yangchiu yangchiu merged commit d4e1c41 into longhorn:master Dec 19, 2024
7 of 9 checks passed