
Refactor: Implement Custom Async Mutex Using Oneshot Channel #1208

Closed
wants to merge 1 commit

Conversation

@drmingdrmer (Member) commented Jul 29, 2024

Changelog

Refactor: Implement Custom Async Mutex Using Oneshot Channel

This commit introduces a custom implementation of an asynchronous Mutex
using the AsyncRuntime::Oneshot functions, tailored specifically for
Openraft's limited use of asynchronous locks. This custom mutex replaces
the previously used Tokio mutex.

  • Refactor of RaftInner::tx_shutdown: The tx_shutdown member of
    RaftInner has been changed to use a standard (synchronous) Mutex
    instead of an asynchronous one. This change is made because
    tx_shutdown does not span across .await points, making the
    asynchronous capabilities unnecessary.

  • OneshotSender Drop Implementation: It is now documented that the
    OneshotSender should implement the Drop trait to ensure that when
    a sender is dropped, the receiver is notified and yields an error.
    This behavior is crucial for maintaining robust error handling in
    asynchronous communication patterns.

Refactor: RaftInner::tx_shutdown should be a std Mutex

This change is Reviewable
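
To make the mechanism concrete, below is a minimal illustrative sketch of the idea, not the code in this PR: it uses `tokio::sync::oneshot` directly in place of the `AsyncRuntime::Oneshot` abstraction, the `AsyncLock`/`Guard` names are made up, and access to the protected data is omitted. Waiters are chained through oneshot channels; "unlock" is simply dropping the guard's sender, which makes the next waiter's `rx.await` resolve with an error.

```rust
use std::sync::Mutex as StdMutex;
use tokio::sync::oneshot;

/// Holding this guard means holding the lock; dropping it releases the lock,
/// because dropping the sender makes the next waiter's `rx.await` return Err.
pub struct Guard {
    _holder: oneshot::Sender<()>,
}

pub struct AsyncLock {
    /// The receiver the *next* caller of `lock()` must await.
    next: StdMutex<oneshot::Receiver<()>>,
}

impl AsyncLock {
    pub fn new() -> Self {
        // Start with a channel whose sender is already dropped, so the first
        // `lock()` call completes immediately.
        let (tx, rx) = oneshot::channel();
        drop(tx);
        Self { next: StdMutex::new(rx) }
    }

    pub async fn lock(&self) -> Guard {
        let (tx, rx) = oneshot::channel::<()>();
        // Swap in our receiver so the next caller waits on us, and take the
        // receiver that belongs to the current holder. The std lock is held
        // only for this swap, never across an await point.
        let prev = std::mem::replace(&mut *self.next.lock().unwrap(), rx);
        // Wait until the previous holder's sender is dropped. No value is
        // ever sent; the Err from the closed channel is the wake-up signal.
        let _ = prev.await;
        Guard { _holder: tx }
    }
}
```

Note that this is exactly the allocation-per-lock pattern raised later in the review: every `lock()` call creates a fresh channel.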

@drmingdrmer marked this pull request as draft on July 29, 2024 02:51
@drmingdrmer (Member, Author) commented Jul 29, 2024

@SteveLauC

@SteveLauC (Collaborator) left a comment

Reviewed 4 of 5 files at r1, all commit messages.
Reviewable status: 4 of 5 files reviewed, 1 unresolved discussion (waiting on @drmingdrmer and @schreter)


openraft/src/sync/mutex.rs line 22 at r1 (raw file):

    /// The current lock holder.
    ///
    /// When the acquired `MutexGuard` is dropped, it will notify the next waiting task via this

Noticed that we don't explicitly send a () in a Drop implementation (there is no manual Drop impl); instead, we rely on rx.await returning an error when the tx (guard.holder) is dropped. Is this something we can rely on? Or should we document this requirement in the doc of trait Oneshot, so that users know that we rely on it and have it implemented? 🤔

@drmingdrmer (Member, Author) left a comment

Reviewable status: 4 of 5 files reviewed, 1 unresolved discussion (waiting on @schreter and @SteveLauC)


openraft/src/sync/mutex.rs line 22 at r1 (raw file):

Previously, SteveLauC (SteveLauC) wrote…

Noticed that we don't explicitly send a () in a Drop implementation (there is no manual Drop impl); instead, we rely on rx.await returning an error when the tx (guard.holder) is dropped. Is this something we can rely on? Or should we document this requirement in the doc of trait Oneshot, so that users know that we rely on it and have it implemented? 🤔

Correct. The functionality relies on the Drop implementation to notify the waiting task. However, there is currently insufficient documentation describing the Drop behavior for OneshotSender.

@SteveLauC (Collaborator) left a comment

Reviewable status: 4 of 5 files reviewed, 1 unresolved discussion (waiting on @drmingdrmer and @schreter)


openraft/src/sync/mutex.rs line 22 at r1 (raw file):

However, there is currently insufficient documentation describing the Drop behavior for OneshotSender.

Then we should document this and probably add a test for it:)
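
For reference, such a test could look roughly like the following (written against Tokio's oneshot for concreteness; an Openraft test would go through the `AsyncRuntime::Oneshot` types instead):

```rust
#[tokio::test]
async fn oneshot_receiver_errors_when_sender_is_dropped() {
    let (tx, rx) = tokio::sync::oneshot::channel::<()>();
    // Dropping the sender without sending anything must wake the receiver
    // with an Err; the oneshot-based mutex relies on exactly this behavior.
    drop(tx);
    assert!(rx.await.is_err());
}
```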

@drmingdrmer force-pushed the 113-mutex branch 2 times, most recently from c6b1b80 to 9a5574f on July 29, 2024 06:45
@drmingdrmer changed the title from "Refactor: mock async Mutex with oneshot channel" to "Refactor: Implement Custom Async Mutex Using Oneshot Channel" on Jul 29, 2024
@drmingdrmer marked this pull request as ready for review on July 29, 2024 06:46
@drmingdrmer (Member, Author) left a comment

Reviewed 5 of 6 files at r2, all commit messages.
Reviewable status: 6 of 7 files reviewed, 1 unresolved discussion (waiting on @schreter and @SteveLauC)


openraft/src/sync/mutex.rs line 22 at r1 (raw file):

Previously, SteveLauC (SteveLauC) wrote…

However, there is currently insufficient documentation describing the Drop behavior for OneshotSender.

Then we should document this and probably add a test for it:)

Add a TODO:

@schreter (Collaborator) left a comment

Reviewed 1 of 5 files at r1, 6 of 6 files at r2, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @drmingdrmer)


openraft/src/raft/mod.rs line 321 at r2 (raw file):

            rx_data_metrics,
            rx_server_metrics,
            tx_shutdown: std::sync::Mutex::new(Some(tx_shutdown)),

This is a good idea.
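
For illustration, the reason a std Mutex suffices for `tx_shutdown` is that the sender is taken out and the guard is dropped before anything is awaited. A rough sketch (the `trigger_shutdown` method name is hypothetical, Tokio's sender stands in for the runtime's oneshot type, and `RaftInner` is cut down to the one relevant field):

```rust
use std::sync::Mutex;
use tokio::sync::oneshot;

struct RaftInner {
    tx_shutdown: Mutex<Option<oneshot::Sender<()>>>,
}

impl RaftInner {
    fn trigger_shutdown(&self) {
        // Lock, take the sender, and let the guard drop at the end of the
        // statement; no `.await` point is crossed while the lock is held.
        let tx = self.tx_shutdown.lock().unwrap().take();
        if let Some(tx) = tx {
            let _ = tx.send(());
        }
    }
}
```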


openraft/src/raft/mod.rs line 322 at r2 (raw file):

            rx_server_metrics,
            tx_shutdown: std::sync::Mutex::new(Some(tx_shutdown)),
            core_state: Mutex::new(CoreState::Running(core_handle)),

BTW, there is only a single place where this requires an async mutex. You could also rewrite it with a synchronous one by adding an additional state Joining(watch::Receiver<()>) to await task completion: first change the state to Joining, and then, after the task is fully joined, change it to Done. With this, there is no need for an async mutex here.
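
Roughly sketched below (only `CoreState` and its `Running`/`Joining`/`Done` variants come from this discussion; the other names and the use of Tokio types are illustrative), the suggestion could look like this:

```rust
use std::sync::Mutex;
use tokio::sync::watch;
use tokio::task::JoinHandle;

enum CoreState {
    Running(JoinHandle<()>),
    Joining(watch::Receiver<()>),
    Done,
}

struct Inner {
    core_state: Mutex<CoreState>,
}

/// What to do after inspecting the state under the (briefly held) std lock.
enum Action {
    Join(JoinHandle<()>, watch::Sender<()>),
    Wait(watch::Receiver<()>),
    Nothing,
}

impl Inner {
    async fn join_core(&self) {
        let action = {
            let mut state = self.core_state.lock().unwrap();
            match std::mem::replace(&mut *state, CoreState::Done) {
                CoreState::Running(handle) => {
                    // First flip the state to Joining, then join outside the lock.
                    let (tx, rx) = watch::channel(());
                    *state = CoreState::Joining(rx);
                    Action::Join(handle, tx)
                }
                CoreState::Joining(rx) => {
                    *state = CoreState::Joining(rx.clone());
                    Action::Wait(rx)
                }
                CoreState::Done => Action::Nothing,
            }
        };

        match action {
            Action::Join(handle, tx) => {
                // No lock is held across these await points.
                let _ = handle.await;
                let _ = tx.send(());
                *self.core_state.lock().unwrap() = CoreState::Done;
            }
            Action::Wait(mut rx) => {
                // Resolves when the joining task sends, or when the sender is dropped.
                let _ = rx.changed().await;
            }
            Action::Nothing => {}
        }
    }
}
```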


openraft/src/raft/mod.rs line 324 at r2 (raw file):

            core_state: Mutex::new(CoreState::Running(core_handle)),

            snapshot: Mutex::new(None),

This mutex is also a bit strange (and it is the last remaining async mutex). It seems to be used in only a single place, to protect against receiving multiple snapshots in parallel? Strange.

@drmingdrmer Is this lock needed at all? It will call install_full_snapshot afterwards anyway.


openraft/src/sync/mutex.rs line 17 at r2 (raw file):

/// Since oneshot channel is already required by AsyncRuntime implementation,
/// there is no need for the application to implement Mutex.
pub(crate) struct Mutex<C, T>

Well, though it's an interesting implementation, it has some serious drawbacks. In particular, it requires a memory allocation for each lock operation, which is a no-go for a Mutex: it not only makes locking slow, it also introduces unexpected failure potential.

What I meant was to add a trait for Mutex, which would be default-implemented by TokioRuntime with tokio::sync::Mutex. Then an optimized mutex would be used.
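
For context, the suggested direction could be sketched roughly as follows (hypothetical trait and method names, not the API that was eventually added): the runtime exposes its own mutex type, with `tokio::sync::Mutex` as the default, so locking needs no per-call allocation.

```rust
use std::ops::DerefMut;

/// The shape of an async mutex a runtime can provide.
/// (Send bounds on the returned futures are omitted for brevity.)
pub trait RuntimeMutex<T>: Send + Sync {
    type Guard<'a>: DerefMut<Target = T>
    where
        Self: 'a;

    fn new(value: T) -> Self;
    async fn lock(&self) -> Self::Guard<'_>;
}

/// The runtime trait would gain an associated mutex type.
pub trait RuntimeWithMutex {
    type Mutex<T: Send>: RuntimeMutex<T>;
}

/// Tokio-backed default: `tokio::sync::Mutex` already fits the shape.
impl<T: Send> RuntimeMutex<T> for tokio::sync::Mutex<T> {
    type Guard<'a> = tokio::sync::MutexGuard<'a, T> where Self: 'a;

    fn new(value: T) -> Self {
        tokio::sync::Mutex::new(value)
    }

    async fn lock(&self) -> Self::Guard<'_> {
        tokio::sync::Mutex::lock(self).await
    }
}

pub struct TokioRuntime;

impl RuntimeWithMutex for TokioRuntime {
    type Mutex<T: Send> = tokio::sync::Mutex<T>;
}
```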

@drmingdrmer (Member, Author) left a comment

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @schreter)


openraft/src/raft/mod.rs line 322 at r2 (raw file):

Previously, schreter wrote…

BTW, there is only a single place where this requires an async mutex. You could also rewrite it with a synchronous one by adding an additional state Joining(watch::Receiver<()>) to await task completion: first change the state to Joining, and then, after the task is fully joined, change it to Done. With this, there is no need for an async mutex here.

Good idea. I'm going to fix this in another PR.


openraft/src/raft/mod.rs line 324 at r2 (raw file):

Previously, schreter wrote…

This mutex is also a bit strange (and it is the last remaining async mutex). It seems to be used in only a single place, to protect against receiving multiple snapshots in parallel? Strange.

@drmingdrmer Is this lock needed at all? It will call install_full_snapshot afterwards anyway.

Yes, there might be multiple parallel chunked snapshot requests received.


openraft/src/sync/mutex.rs line 17 at r2 (raw file):

Previously, schreter wrote…

Well, though it's an interesting implementation, it has some serious drawbacks. In particular, it requires a memory allocation for each lock operation, which is a no-go for a Mutex: it not only makes locking slow, it also introduces unexpected failure potential.

What I meant was to add a trait for Mutex, which would be default-implemented by TokioRuntime with tokio::sync::Mutex. Then an optimized mutex would be used.

True. I'll revert this PR.

@SteveLauC (Collaborator) commented

True. I'll revert this PR.

Let me add the Mutex primitive to the AsyncRuntime trait this afternoon:)

@SteveLauC mentioned this pull request on Jul 31, 2024