Skip dnode handles use when not needed #16374

Closed

amotin commented Jul 22, 2024

Neither FreeBSD nor Linux currently implements kmem_cache_set_move(), which means dnode_move() is never called. In that situation, using dnode handles with the associated locking to access a dnode from a dbuf is wasted time with no benefit.

This patch implements optional simplified code for such platforms, saving at least 3 dnode lock/dereference/unlock sequences per dbuf life cycle. Originally I hoped to drop the handles completely to save memory, but they are still used in the dnode allocation code, so they are left for now.

Before this change, CPU profiles of some workloads showed 4-20% of CPU time spent in zrl_add_impl()/zrl_remove(); that overhead is gone now.
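
To illustrate the idea described above, here is a minimal, self-contained sketch of compile-time selection between handle-based and direct dnode access. It is not the code from this patch: every `toy_`/`TOY_` name, the `TOY_USE_DNODE_HANDLE` switch, and the plain counter standing in for a zrlock are assumptions made for illustration only.

```c
#include <assert.h>

/* Toy stand-ins; none of these are the real OpenZFS definitions. */
typedef struct toy_dnode {
	int dn_object;
} toy_dnode_t;

typedef struct toy_handle {
	int dnh_refs;			/* stand-in for a zrlock reference count */
	toy_dnode_t *dnh_dnode;
} toy_handle_t;

typedef struct toy_dbuf {
#ifdef TOY_USE_DNODE_HANDLE
	toy_handle_t *db_handle;	/* indirect: the dnode may be moved */
#else
	toy_dnode_t *db_dnode;		/* direct: the dnode never moves */
#endif
} toy_dbuf_t;

/* Counts simulated lock and unlock operations. */
static int toy_lock_ops;

#ifdef TOY_USE_DNODE_HANDLE
#define	TOY_DB_DNODE_ENTER(db)	\
	((void)((db)->db_handle->dnh_refs++, toy_lock_ops++))
#define	TOY_DB_DNODE_EXIT(db)	\
	((void)((db)->db_handle->dnh_refs--, toy_lock_ops++))
#define	TOY_DB_DNODE(db)	((db)->db_handle->dnh_dnode)
#else
/* Without a movable dnode, enter/exit compile to nothing. */
#define	TOY_DB_DNODE_ENTER(db)	((void)(db))
#define	TOY_DB_DNODE_EXIT(db)	((void)(db))
#define	TOY_DB_DNODE(db)	((db)->db_dnode)
#endif

/* Callers look the same in both configurations. */
static int
toy_dbuf_object(toy_dbuf_t *db)
{
	int obj;

	TOY_DB_DNODE_ENTER(db);
	obj = TOY_DB_DNODE(db)->dn_object;
	TOY_DB_DNODE_EXIT(db);
	return (obj);
}
```

With `TOY_USE_DNODE_HANDLE` undefined, `toy_dbuf_object()` reduces to a single pointer dereference and `toy_lock_ops` is never touched; defining it restores the two simulated lock operations per access.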

 - Use the macros in the few places where they were missed.
 - Reduce the scope of DB_DNODE_ENTER/EXIT() and inline some DB_DNODE()
uses to make it more obvious what exactly is protected there and
to make accidental unprotected accesses more difficult.
 - Make use of zrl_owner().

Signed-off-by:	Alexander Motin <[email protected]>
Sponsored by:	iXsystems, Inc.

robn left a comment

Looks good. Impressive performance improvements! I wonder then if it's worth removing the support for dnode_move() entirely; leaving aside that we don't have support for any other platforms, is there ever going to be a scenario where the ongoing locking overhead is worth whatever we get back from being able to defragment memory?

amotin commented Jul 22, 2024

@robn I was thinking about removing it completely, since I honestly don't much believe in its usefulness, but it has been in ZFS forever, and killing it would be much easier than restoring it later, while giving no immediate benefit now. Maybe we could simplify the dnode allocation code by using some simpler and more standard primitive instead of zrl locks, but that is a much less busy code path that I haven't looked into much.

robn commented Jul 22, 2024

Makes sense, reasonable tradeoff. Good work :)

amotin added the "Status: Code Review Needed" label Jul 22, 2024
behlendorf added the "Status: Accepted" label and removed the "Status: Code Review Needed" label Jul 29, 2024
behlendorf pushed a commit that referenced this pull request Jul 29, 2024
Reviewed-by: Rob Wing <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by:  Alexander Motin <[email protected]>
Sponsored by:   iXsystems, Inc.
Closes #16374
@amotin amotin deleted the no_handle branch August 2, 2024 20:55
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Sep 4, 2024
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Sep 4, 2024