Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize rcutils_logging_get_logger_effective_level() #381

Merged
merged 7 commits into from
Sep 7, 2022

Conversation

clalancette
Copy link
Contributor

This function is the most expensive part of
rcutils_logging_logger_is_enabled_for(), which is called on every
RCUTILS_LOG_* invocation.

We notice a couple of things:

  1. The current name -> logger level lookup is using a "string map"
    structure. A "string map" isn't a map at all, but is really a linear
    array of strings. Thus searching it can be slow when there are
    a lot of strings in the map. Since this is a global map, this
    can happen when many ROS 2 nodes are loaded in to the same process.

  2. When looking up the severity of a particular name, we need to
    check the full name, plus any ancestors (separated by '.') to find
    the severity level. This requires a bunch of work, including
    copying a string, running strlen on it, searching through it for
    periods, etc. This can be expensive, and in the common case it isn't
    needed; the fully-qualified string is usually in the map already.

To fix both of these, we switch to a hash map, and look up the
current logger level via the hash map. If the full logger name
isn't in the hash map, we fall back to the slow path where we
do the additional work of finding the severity via the hierarchy.
Note that even in this slow path, once we've computed the severity
level we place that in the map as well so that subsequent lookups
will take the fast path.

The benchmarks I ran on this change give very good improvements
for most cases. For cases where there are multiple set logger
names, and those logger names have ancestors, this change is
~9x faster than the current implementation. For cases where there
are multiple set logger names and those logger names have no
ancestors (one of the most common cases in ROS 2), this change is ~3x
faster than the current implementation. For cases where we
have a single logger name (another common case), this change is
~2x faster than the current implementation.

Finally, note that rcutils_logging_set_logger_level() is
more expensive than before, due to having to traverse the hash map
to find other logger names in the hierarchy. Since we don't expect
users to change logger levels often, this tradeoff seems worth it
to speed up lookups (which are called for every log message).

Signed-off-by: Chris Lalancette [email protected]

This should solve #364 (at least, as much as it can be solved given the current design).

This makes the code shorter and much more consistent.

Signed-off-by: Chris Lalancette <[email protected]>
This function is the most expensive part of
rcutils_logging_logger_is_enabled_for(), which is called on every
RCUTILS_LOG_* invocation.

We notice a couple of things:

1. The current name -> logger level lookup is using a "string map"
structure.  A "string map" isn't a map at all, but is really a linear
array of strings.  Thus searching it can be slow when there are
a lot of strings in the map.  Since this is a global map, this
can happen when many ROS 2 nodes are loaded in to the same process.

2. When looking up the severity of a particular name, we need to
check the full name, plus any ancestors (separated by '.') to find
the severity level.  This requires a bunch of work, including
copying a string, running strlen on it, searching through it for
periods, etc.  This can be expensive, and in the common case it isn't
needed; the fully-qualified string is usually in the map already.

To fix both of these, we switch to a hash map, and look up the
current logger level via the hash map.  If the full logger name
isn't in the hash map, we fall back to the slow path where we
do the additional work of finding the severity via the hierarchy.
Note that even in this slow path, once we've computed the severity
level we place that in the map as well so that subsequent lookups
will take the fast path.

The benchmarks I ran on this change give very good improvements
for most cases.  For cases where there are multiple set logger
names, and those logger names have ancestors, this change is
~9x faster than the current implementation.  For cases where there
are multiple set logger names and those logger names have no
ancestors (one of the most common cases in ROS 2), this change is ~3x
faster than the current implementation.  For cases where we
have a single logger name (another common case), this change is
~2x faster than the current implementation.

Finally, note that rcutils_logging_set_logger_level() is
more expensive than before, due to having to traverse the hash map
to find other logger names in the hierarchy.  Since we don't expect
users to change logger levels often, this tradeoff seems worth it
to speed up lookups (which are called for every log message).

Signed-off-by: Chris Lalancette <[email protected]>
@clalancette clalancette linked an issue Aug 29, 2022 that may be closed by this pull request
@clalancette
Copy link
Contributor Author

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Also add in a test.

Signed-off-by: Chris Lalancette <[email protected]>
@clalancette
Copy link
Contributor Author

New CI after the latest fix:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Signed-off-by: Chris Lalancette <[email protected]>
@clalancette
Copy link
Contributor Author

Another CI with the latest Windows fix:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette
Copy link
Contributor Author

@ros-pull-request-builder retest this please

@ivanpauno
Copy link
Member

I have a few questions before reviewing

When looking up the severity of a particular name, we need to
check the full name, plus any ancestors (separated by '.') to find
the severity level. This requires a bunch of work, including
copying a string, running strlen on it, searching through it for
periods, etc. This can be expensive, and in the common case it isn't
needed; the fully-qualified string is usually in the map already.

The parent logging severity is only queried if there wasn't a severity directly set for this logger, right?
If looking up for a parent severity is needed, it would be nice to set the child logger severity it in the map, so the next time the only thing needed is a query.
The only tricky thing is that if someone changes the parent logger severity, the child should also change in that case (as the severity was actually inherited from the parent), so to make sure that happens we will need to store some extra information.


IMO, everything would be more efficient if we would have a logger object, instead of only using a name.
It may be a pain to change that now, but most people are using the rclcpp/rclpy logging functions instead of this, and those already have a logger object.

@clalancette
Copy link
Contributor Author

The parent logging severity is only queried if there wasn't a severity directly set for this logger, right?

That's correct.

If looking up for a parent severity is needed, it would be nice to set the child logger severity it in the map, so the next time the only thing needed is a query.

Yep! That's what this PR does now; if the parent lookup is needed, then once it is calculated the child is added into the map for subsequent lookups (see the code in rcutils_logging_get_logger_effective_level).

The only tricky thing is that if someone changes the parent logger severity, the child should also change in that case (as the severity was actually inherited from the parent), so to make sure that happens we will need to store some extra information.

Right. This PR also addresses that by walking the entire hash map when loggers change severity, looking for children of the current name being changed. For any ones that haven't been explicitly set, it drops them from the hash map until they are used again (see the code in rcutils_logging_set_logger_level).

IMO, everything would be more efficient if we would have a logger object, instead of only using a name.
It may be a pain to change that now, but most people are using the rclcpp/rclpy logging functions instead of this, and those already have a logger object.

Yes, agreed. The performance with this PR is the best we can do (I think) with a strictly string-based hierarchy like we currently have. I think it is valuable to get this in (as it will help performance today), and then something like ros2/rcl_logging#92 may make this obsolete.

Copy link
Member

@ivanpauno ivanpauno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I have some minor comments

src/logging.c Outdated Show resolved Hide resolved
src/logging.c Show resolved Hide resolved
src/logging.c Show resolved Hide resolved
@clalancette
Copy link
Contributor Author

@ivanpauno Thanks for the review! I think I've addressed all of your comments; please take another look.

Copy link
Member

@ivanpauno ivanpauno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great work!!

@clalancette
Copy link
Contributor Author

clalancette commented Sep 6, 2022

One more CI on the latest changes:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette
Copy link
Contributor Author

The Windows CI warning is unrelated to this change, and everything else is green.

Going ahead and merging this, thanks for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

High logging overhead
2 participants