using rwsem lock as inode_lock to avoid deadlock #13736

Donny9 · 2024-09-29T13:50:08Z

Summary

Use an rwsem (reader-writer semaphore) as inode_lock to avoid a deadlock.
Deadlock example:
When df -h runs on Core A to view mount information, it traverses the inode nodes and therefore holds inode_lock. Because the inode type of a mount point may be rpmsgfs, it fetches statfs information from Core B.
Meanwhile, rcS on Core B needs file information from Core A, which it obtains by fetching stat information through rpmsgfs. When that request arrives at Core A, a deadlock can occur between Core A's rptun ap task and its nsh task.
Both paths only perform read operations, so a reader-writer lock can be used to prevent this deadlock.
Simplify the inode_lock code and remove the return value check.
Support recursive write locking by the same process in the sem_rw lock.

Impact

Bug fix

Testing

Vela

nuttxpr · 2024-09-29T14:05:00Z

[Experimental Bot, please feedback here]

Let's analyze if this Pull Request (PR) description fulfills the NuttX requirements:

Strengths:

Addresses a Deadlock: The PR tackles a critical issue—a deadlock scenario. This is clearly a significant bug fix, directly enhancing the stability of NuttX.
Provides a Root Cause Explanation: The description explains the deadlock's origin, involving interactions between inode traversal, rpmsgfs, and Core A/B communication. This clarity is valuable for reviewers.
Solution Rationale: Justifies the use of a reader-writer (rwsem) lock as a solution. It correctly points out that the deadlock arises from concurrent read operations, making rwsem a suitable mitigation.
Mentions Additional Changes: Briefly notes code simplification (removing return value checks) and recursive write support. While needing more detail, it shows awareness of broader impacts.

Weaknesses:

Impact Section Lacking: The "Impact" section is overly brief. It correctly identifies the change as a bug fix but needs to elaborate on the potential ramifications as the PR description template requests.
Testing Details Insufficient: "Testing: Vela" is insufficient. The requirements demand:
- Specifics about the testing environment: Which build hosts (OS, CPU, compiler) and targets (architecture, board, configuration) were used?
- Testing logs: Provide "before" and "after" logs to demonstrate the deadlock's presence initially and its resolution after the change.
Code Simplification and Recursion Under-explained: The description mentions simplifying inode_lock code and adding recursive write support but lacks detail. These changes deserve elaboration:
- Code Simplification: What specific code was simplified? Why was the return value check unnecessary?
- Recursive Writes: Why was this support added? Does it introduce any potential risks?

Verdict & Recommendations:

This PR description is a good starting point but does not fully meet NuttX requirements.

Here's how to improve it:

Expand the Impact Section: Address all the points outlined in the template, even if the answer is "NO" (with a brief explanation). For example, "Impact on build (will build process change)? NO – The changes are localized to locking mechanisms and do not affect the build process."
Provide Thorough Testing Information:
- List all build host and target environments used for validation.
- Include relevant "before" and "after" testing logs. These logs are crucial for demonstrating the deadlock and its successful resolution.
Explain Code Simplification: Briefly describe the removed code and why its return value check was redundant.
Justify Recursive Write Support: Explain the rationale behind adding this feature and address any potential downsides or complexities it might introduce.

By addressing these points, you'll create a more robust and informative PR description, increasing the likelihood of a smoother review and acceptance.

Signed-off-by: dongjiuzhu1 <[email protected]>

fs/vfs/fs_pseudofile.c

fs/shm/shmfs.c

xiaoxiang781216 · 2024-10-01T04:02:05Z

fs/inode/fs_inodefind.c

-      return ret;
-    }
-
+  inode_lock();


inode_rlock

In some scenarios, inode_find and inode_remove are used in combination, such as in shm_open and shm_unlink. The remove operation requires a write lock. If we were to switch to a read lock in inode_find, then a lock would need to be added in inode_remove, and additional checks would be required elsewhere.

Let's create a patch that changes rwsem to promote an rlock to a wlock if the same thread already holds the wlock.

#13797 Done~
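
The idea discussed in this thread (a read lock requested by the thread that already holds the write lock is treated as a nested write lock) can be sketched on a host with pthreads. This is an illustrative sketch only, not the NuttX sem_rw implementation or the code from #13797; `demo_rwlock`, `demo_rlock`, `demo_wlock`, `demo_unlock`, `demo_find`, and `demo_unlink` are hypothetical names. `demo_unlink` models a remove path that takes the write lock and then calls a find path that takes the read lock.

```c
#include <assert.h>
#include <pthread.h>

struct demo_rwlock
{
  pthread_mutex_t mtx;
  pthread_cond_t cond;
  int readers;        /* number of plain read holders */
  int wdepth;         /* nesting depth of the write holder, 0 if none */
  pthread_t writer;   /* meaningful only while wdepth > 0 */
};

static struct demo_rwlock g_lock =
{
  .mtx = PTHREAD_MUTEX_INITIALIZER,
  .cond = PTHREAD_COND_INITIALIZER
};

/* Must be called with l->mtx held. */

static int held_by_self(struct demo_rwlock *l)
{
  return l->wdepth > 0 && pthread_equal(l->writer, pthread_self());
}

void demo_wlock(struct demo_rwlock *l)
{
  pthread_mutex_lock(&l->mtx);
  if (held_by_self(l))
    {
      l->wdepth++;                /* recursive write by the same thread */
    }
  else
    {
      while (l->readers > 0 || l->wdepth > 0)
        {
          pthread_cond_wait(&l->cond, &l->mtx);
        }

      l->writer = pthread_self();
      l->wdepth = 1;
    }

  pthread_mutex_unlock(&l->mtx);
}

void demo_rlock(struct demo_rwlock *l)
{
  pthread_mutex_lock(&l->mtx);
  if (held_by_self(l))
    {
      l->wdepth++;                /* read request promoted to nested write */
    }
  else
    {
      while (l->wdepth > 0)
        {
          pthread_cond_wait(&l->cond, &l->mtx);
        }

      l->readers++;
    }

  pthread_mutex_unlock(&l->mtx);
}

void demo_unlock(struct demo_rwlock *l)
{
  pthread_mutex_lock(&l->mtx);
  if (held_by_self(l))
    {
      l->wdepth--;
    }
  else
    {
      l->readers--;
    }

  if (l->wdepth == 0 && l->readers == 0)
    {
      pthread_cond_broadcast(&l->cond);
    }

  pthread_mutex_unlock(&l->mtx);
}

/* Find path: read lock only. Returns the write depth seen while holding
 * the lock; that value cannot change under us while we hold it.
 */

int demo_find(void)
{
  int depth;

  demo_rlock(&g_lock);
  depth = g_lock.wdepth;
  demo_unlock(&g_lock);
  return depth;
}

/* Remove path: write lock, then reuse the find path without deadlocking. */

int demo_unlink(void)
{
  int depth;

  demo_wlock(&g_lock);
  depth = demo_find();
  demo_unlock(&g_lock);
  return depth;
}
```

In this sketch the nested read inside `demo_unlink` sees a write depth of 2 and does not block, while a standalone `demo_find` takes an ordinary read lock. The sketch does not handle the opposite direction: a thread that holds only a read lock and then asks for the write lock would still block on itself.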

fs/inode/fs_inodeaddref.c

Signed-off-by: dongjiuzhu1 <[email protected]>

github-actions bot added the labels Area: File System, Area: OS Components, and Size: M on Sep 29, 2024

Donny9 force-pushed the inode_rw branch from 4878d21 to adf7dd3 Compare September 29, 2024 14:00

Donny9 force-pushed the inode_rw branch from adf7dd3 to 6c4d366 Compare September 30, 2024 06:22

sched/semaphore: add sem_rw source file to CMakeLists

2f7383d

Signed-off-by: dongjiuzhu1 <[email protected]>

Donny9 force-pushed the inode_rw branch from 6c4d366 to 661a393 Compare September 30, 2024 09:17

sched/semaphore: support recursive write for same process in sem_rw lock

947ee41

Signed-off-by: dongjiuzhu1 <[email protected]>

Donny9 force-pushed the inode_rw branch from 661a393 to 26ec205 Compare September 30, 2024 13:38

xiaoxiang781216 reviewed Oct 1, 2024

Donny9 added 2 commits October 1, 2024 22:17

fs/inode: remove unnecessary return value for inode_addrefs

792ff4d

Signed-off-by: dongjiuzhu1 <[email protected]>

Donny9 force-pushed the inode_rw branch from 26ec205 to 792ff4d Compare October 1, 2024 14:38

xiaoxiang781216 merged commit b2e69b8 into apache:master Oct 1, 2024
29 checks passed

Donny9 mentioned this pull request Oct 2, 2024

sched/sem_rw: convert read-lock to write-lock when self already holds write-lock #13797

Merged
