Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Ray Core / Ray Data logging configuration leads to unexpected behavior #48958

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

kevin85421
Copy link
Member

@kevin85421 kevin85421 commented Nov 27, 2024

Why are these changes needed?

Issue

In the Ray codebase, logging.config.dictConfig may be called multiple times. However, we found that if a logger’s child loggers are set before the logger is set via dictConfig, it may cause issues.

  • Example1 (incremental: False): The logger Ray.data loses its original handler and uses the Ray logger’s handler after the Ray logger is set via dictConfig.

    2024-11-27 04:32:06,213 - Ray.data - INFO - This is an INFO log from Ray.data.
    2024-11-27 04:32:06,213 - Ray.data - WARNING - This is a WARNING log from Ray.data.
    2024-11-27 04:32:06,213 - Ray.data - INFO - Ray data propagate False
    abc Ray - DEBUG - This is a DEBUG log from Ray.
    abc Ray - ERROR - This is an ERROR log from Ray.
    abc Ray.data - INFO - Another INFO log from Ray.data.
    abc Ray.data - INFO - Ray data propagate True
    
  • Example2 (incremental: True): It looks like Ray.data’s handlers are removed after the Ray logger is set via dictConfig.

    2024-11-27 04:35:25,379 - Ray.data - INFO - This is an INFO log from Ray.data.
    2024-11-27 04:35:25,379 - Ray.data - WARNING - This is a WARNING log from Ray.data.
    2024-11-27 04:35:25,379 - Ray.data - INFO - Ray data propagate False
    This is an ERROR log from Ray.
    2024-11-27 04:35:25,379 - Ray.data - INFO - Another INFO log from Ray.data.
    2024-11-27 04:35:25,379 - Ray.data - INFO - Ray data propagate False
    
  • CPython implementation

  • Case 2: incremental is True

Solution

Instead of using dictConfig to set the root logger and the Ray logger, call other functions to set the loggers explicitly.

Related issue number

Closes #48732

Checks

  • Test 1

    import ray
    import logging
    import ray.data
    
    ray.init(logging_config=ray.LoggingConfig(encoding="TEXT", log_level="INFO"))
    
    
    root_logger = logging.getLogger()
    root_logger.info("root logger")
    
    ray_logger = logging.getLogger("ray")
    ray_logger.info("ray logger")
    
    ray_data_logger = logging.getLogger("ray.data")
    ray_data_logger.info("ray data logger")
    
    @ray.remote
    def f():
        root_logger = logging.getLogger()
        root_logger.info("root logger")
        ray_data_logger = logging.getLogger("ray.data")
        ray_data_logger.info("ray data logger")
    
    
    ray.get(f.remote())
    image
  • Test 2

    import ray
    import logging
    
    def report_logger(logger):
        # Collect this logger and its parents
        loggers = []
        current_logger = logger
        while current_logger:
            loggers.append(current_logger)
            if not current_logger.parent or current_logger.parent == current_logger:
                break
            current_logger = current_logger.parent
    
        # Report the configuration of each logger in the hierarchy
        print(f"Logging configuration for '{logger.name}' and its hierarchy:")
        for log in reversed(loggers):  # Start from the root and go down to the given logger
            print(f"\nLogger: {log.name or 'root'} (Level: {logging.getLevelName(log.level)})")
            if log.handlers:
                print("  Handlers:")
                for handler in log.handlers:
                    print(f"    - {handler.__class__.__name__} (Level: {logging.getLevelName(handler.level)})")
            else:
                print("  No handlers configured")
    
    print("BEFORE")
    report_logger(logging.getLogger("ray.data"))
    print()
    
    import ray.data
    ray.init(logging_config=ray.LoggingConfig(encoding="TEXT", log_level="INFO"))
    
    print("AFTER:")
    report_logger(logging.getLogger("ray.data"))
    image
  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
@kevin85421 kevin85421 marked this pull request as ready for review November 27, 2024 09:14
@kevin85421
Copy link
Member Author

I don't know the expected behavior for the ray.data logger inside a Ray task. @omatthew98, would you mind sharing the expected behavior and confirming whether "Test 1" aligns with it?

Copy link
Collaborator

@jjyao jjyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG.

Need to change python/ray/_private/log.py as well since there is old dictConfig call there.

python/ray/tests/test_logging_2.py Outdated Show resolved Hide resolved
@kevin85421
Copy link
Member Author

Need to change python/ray/_private/log.py as well since there is old dictConfig call there.

log.generate_logging_config() is called inside ray/__init__.py, which is executed before ray/data/__init__.py. I believe the order is guaranteed. If you prefer to minimize the use of dictConfig, it makes sense to rewrite it with explicit function calls.

@jjyao
Copy link
Collaborator

jjyao commented Nov 27, 2024

yea let's rewrite to avoid any surprises in the future.

kevin85421 and others added 3 commits November 27, 2024 18:59
Signed-off-by: kaihsun <[email protected]>
Co-authored-by: Jiajun Yao <[email protected]>
Signed-off-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: kaihsun <[email protected]>
python/ray/_private/log.py Show resolved Hide resolved
python/ray/tests/test_logging_2.py Outdated Show resolved Hide resolved
import threading

# TODO (kevin85421): It is not used, but a weird error occurs if it is removed.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upper one is without import logging.config, and the lower one is with import logging.config.

image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's spend some time figuring out why.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ray_logging/__init__.py contains a function setup_component_logger, which uses logging.handlers.RotatingFileHandler. However, the file doesn't import logging.handlers.

Instead, log.py imports logging.config, which implicitly imports logging.handlers.

When the runtime environment agent calls setup_component_logger, it fails, causing the Raylet to also terminate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solution: import logging.handlers in ray_logging/__init__.py directly.

import threading

# TODO (kevin85421): It is not used, but a weird error occurs if it is removed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's spend some time figuring out why.

python/ray/tests/test_logging_2.py Outdated Show resolved Hide resolved
ray.init(logging_config=ray.LoggingConfig(encoding="TEXT", log_level="INFO"))
new_test_logger = logging.getLogger("ray.test")
assert old_test_logger.getEffectiveLevel() == logging.DEBUG
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so with dictConfig, log level will also be cleared not just handlers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: kaihsun <[email protected]>
Signed-off-by: kaihsun <[email protected]>
@jjyao jjyao added the go add ONLY when ready to merge, run all tests label Nov 27, 2024
python/ray/_private/ray_logging/logging_config.py Outdated Show resolved Hide resolved
Signed-off-by: kaihsun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Core] Ray Core / Ray Data logging configuration leads to unexpected behavior
3 participants