-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lazy logging part 1 #16893
Draft
ksuderman
wants to merge
7
commits into
galaxyproject:dev
Choose a base branch
from
ksuderman:lazy-logging-part-1
base: dev
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+219
−165
Draft
Lazy logging part 1 #16893
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
6edf242
Patch logging statements in lib/galaxy/jobs/runners
ksuderman 6ef0a05
Re-ran patching
ksuderman 2b15ee2
Resolve conflicts
ksuderman c5aac75
Removed extraneous back slash character
ksuderman a5af51b
Format changes
ksuderman 043667e
Rerun logfix
ksuderman 7da584b
Merge changes from latest logfix run.
ksuderman File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to open a giant can of worms, and I know this is only adhering to what we've put in the style guide, but... Do we really want to make this choice for performance? Is it really significant? The former is more naturally readable and I think I'd weigh that more heavily than what I assume are barely significant performance gains in debug log emission.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance is orders of magnitude better if
log.debug
messages are disabled (i.e. the log level islogging.INFO
or higher). The logging framework defers formatting log message until it can't be avoided for this reason. It is also arguably safer if the string interpolation raises an exception, although that is less an issue with f-strings.Whether it is worth it is another issue. I would advocate strongly for f-string in all other contexts but logging, then I think performance and safety outweigh the minor readability improvements. The process is automated so the cost of doing this is minimal. Having said that, if done project wide this does have the potential to modify a hundred or so files. So we (I) need to ensure the code being generated is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a discussion on this this summer (here's the link). I see Dannon's point; however, it's an accepted best practice, and I don't think readability is significantly affected by printf syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance is not a valid argument, logging is never a bottleneck for Galaxy. Speeding up fast code by 10X (or any factor, really) brings you nothing overall, and f-strings are certainly more readable. The addition to the styleguide was pretty contested, and the link you posted @jdavcs was the start of the discussion, see also #16312 (comment). I don't think we should touch existing code, and if we do then we should add the corresponding pylint rules and do it all in one batch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed on adding this to our style guide. IMHO, there's nothing wrong with editing existing code (especially as trivial as logging messages) to adhere to our style guide. If we disagree with the style guide, I think we should change the style guide first.
If we can add pylint rules for this - great; although doing all in one batch would require significantly more upfront work, which is why I think fixing one step at a time is fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And @dannon I should also add that those number are for messages that are logged, nothing is going to speed those up. If the string needs to be formatted the string needs to be formatted. Any performance gains would be from messages that are not logged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't close it. I think the choice we made for the styleguide is contentious and I personally would have gone the other way, but it's really okay and I don't want to make us relitigate that here -- I just wanted to put a little context to the scale of the performance argument. You can feel safe to add as many log statements as you want and it won't make a dent, with or without fstrings :)
Again, thanks for the effort. I'll just second what Marius mentioned before -- we should also include a linting rule whether pylint, ruff, whatever, to enforce this moving forward if we're going to change it across the board, and we need to make sure this is well tested and closely reviewed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ksuderman please ping me when it's ready for review (I assume while in draft state it's not there yet).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to point out that there is a lot of work going into fstrings, in particular but not limited to 3.12.
https://realpython.com/python312-f-strings/
https://peps.python.org/pep-0701/
Migrating away from fstring feels wrong, especially mid-term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that if we do go ahead with this we need to be 100% confident the changes are correct. With this PR I am 99% confident I can prove the generated statements are correct, or if they are not correct that the original statement was not correct either. And I just had an idea on how to test each change to show that each one produces the same output as the original statement.
@jdavcs will do. This PR was just a proof of concept intended for discussion. I didn't think it would be such a hot topic.
@bgruening that should not be an issue. Any changes to f-strings will be handled by the Python parser. The latest version of my program simply rearranges nodes in the syntax tree for logging method calls. Only a handful of the logging method calls in the Galaxy code base do anything more that simple variable substitution. If the program does encounter a statement that does fancy formatting it prints a warning, skips that statement, and moves on. And this is just for logging methods, f-strings should be preferred for all other string formatting operations.