Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NullHandler added to all but git.repo.base logger #1806

Closed
EliahKagan opened this issue Jan 23, 2024 · 28 comments · Fixed by #1813
Closed

NullHandler added to all but git.repo.base logger #1806

EliahKagan opened this issue Jan 23, 2024 · 28 comments · Fixed by #1813

Comments

@EliahKagan
Copy link
Contributor

EliahKagan commented Jan 23, 2024

This issue should not be confused with what the change in #1807 is trying to improve. The outcome of that PR is unrelated to this issue, because GitPython itself uses logging.NullHandler not git.util.NullHandler.

Although the recommended default behavior for libraries is to allow messages of WARNING or higher severity to be displayed when the calling code does not configure logging, a library also has the option of causing its messages to be suppressed by default by registering an instance of NullHandler as a handler. (Logging is still happening; messages can still be subscribed to by the addition of other handlers.) GitPython takes this approach, though in a bit of a different way the Python documentation suggests for it. In GitPython, NullHandlers are registered for the individual modules' loggers rather than for a library-wide "top-level" logger.

This is with one exception. git.repo.base does not register a NullHandler:

(.venv) ek@Glub:~/repos-wsl/GitPython (main $=)$ git grep -Fn NullHandler -- git/
git/cmd.py:85:log.addHandler(logging.NullHandler())
git/config.py:65:log.addHandler(logging.NullHandler())
git/objects/commit.py:56:log.addHandler(logging.NullHandler())
git/objects/submodule/base.py:60:log.addHandler(logging.NullHandler())
git/objects/submodule/root.py:26:log.addHandler(logging.NullHandler())
git/remote.py:69:log.addHandler(logging.NullHandler())
git/util.py:1297:class NullHandler(logging.Handler):
(.venv) ek@Glub:~/repos-wsl/GitPython (main $=)$ git grep -Fn logging.getLogger -- git/
git/cmd.py:84:log = logging.getLogger(__name__)
git/config.py:64:log = logging.getLogger("git.config")
git/objects/commit.py:55:log = logging.getLogger("git.objects.commit")
git/objects/submodule/base.py:59:log = logging.getLogger("git.objects.submodule.base")
git/objects/submodule/root.py:25:log = logging.getLogger("git.objects.submodule.root")
git/remote.py:68:log = logging.getLogger("git.remote")
git/repo/base.py:92:log = logging.getLogger(__name__)
git/util.py:107:log = logging.getLogger(__name__)

It is unclear if this is intentional. It is easy to miss one (see #666) so it may well be an unintended omission.

I believe one of the following changes should be made, but I do not know which:

  • State in a comment (or docstring) in git.repo.base why its logger is special.
  • Register a NullHandler for the git.repo.base logger.
  • Stop registering NullHandlers for the other loggers.

I am not sure if causing messages from GitPython, even of WARNING and higher severity, not to appear by default is actually considered desirable. The NullHandlers were introduced in #300. At that time, this was not with the goal of suppressing messages, but instead to avoid "No handlers could be found for logger" errors. It is no longer necessary to do that since Python 3.2. So if the only reason is as given in #300 and suppressing messages is not itself a goal, then the NullHandlers should be removed, unless it is considered important to keep them for compatibility with previous expected behavior of GitPython.

Other than the first approach of adding a comment, any changes could be inconvenient to developers of programs or other libraries that use GitPython, and to users of such software. I am unsure if they could reasonably be considered breaking. I am inclined to think that most such changes would be non-breaking but still should not be undertaken without carefully considering the impact on users and developers who may expect the current behavior.

@Byron
Copy link
Member

Byron commented Jan 24, 2024

Thanks so much for bringing this up!

My intuition here is that the more common approach of having one top-level logger would be the least surprising to the downstream, as the effect stays the same for all modules except the ones that didn't override the logger. For now, I would consider these an accidental omission as well.

Maybe there are other ways to stick to the default way of handling logging within libraries that have a similar effect.

Personally I wouldn't mind adjusting this behaviour to show WARNINGs by default, like apparently suggested in the logging module, to help users to notice warnings that they might not have seen before.

All this is under the assumption that changing the handler configuration will not affect other subscribers, they should still receive all log events just like before.

Thanks again for your help with this.

@EliahKagan
Copy link
Contributor Author

EliahKagan commented Jan 24, 2024

My intuition here is that the more common approach of having one top-level logger would be the least surprising to the downstream, as the effect stays the same for all modules except the ones that didn't override the logger.

I may be misunderstanding what you're suggesting, but I think the existing design of having separate loggers for each module that performs logging is good. I believe this is the most common design. Changing it would break, or at least worsen, any code that subscribes to specific per-module loggers--handlers registered by such code would stop receiving messages. But even if breaking changes were okay, I would not recommend changing that aspect of the design.

Handlers also receive messages from loggers lower down (except when a propagate attribute is set to false). For this reason, it is easy to subscribe to messages from all the GitPython loggers. Also for this reason, adding a NullHandler instance to the library's top-level logger would be sufficient prevent messages from being handled by logging.lastResort, which only happens for messages not otherwise handled at all. A message that reaches logging.lastResort is printed at the level WARNING or higher, and otherwise ignored. Prior to Python 3.2, there was no logger of last resort, and "No handlers could be found for logger" errors were shown. (This behavior can be restored by setting logging.lastResort to None but that is rarely wanted.)

So before Python 3.2, adding a NullHandler at any logger that a message would propagate up to would prevent the message from causing a "No handlers could be found for logger" error. Since Python 3.2, adding such a NullHandler prevents logging.lastResort from being able to log the message. In both cases, a library adding "top level" NullHandler--"top level" in the sense of being named after the library's top-level module--has the same effect as adding them at all levels where such messages originate. A "top level" logger could be used for this purpose, without removing any of the individual modules' loggers that are currently used, and without actually logging any messages directly to the "top-level" logger (it would still receive them).

Personally I wouldn't mind adjusting this behaviour to show WARNINGs by default, like apparently suggested in the logging module, to help users to notice warnings that they might not have seen before.

That further simplifies things, and I do think it would be a good thing for its own sake, too. All we have to do is remove the statements that add NullHandler instances to loggers.

This seems reasonable to me, especially if comes in at a minor release bump, such as to 3.2.0. I think it should probably be seen as a feature change. It seems to me that it is a non-breaking change that adds a feature that is not just a bugfix but that produces an observable behavioral change. If the first release that contains the change ends up being the same as the first release that contains the changes from #1791, then I think that further strengthens the case for bumping the minor version number.

Something that may make sense to change at the same time is that the modules in GitPython that create per-module loggers bind them to module attributes called log. However, the log attribute in git.refs is of course not a logger, but a reference to the git.refs.log module. git.log is likewise a reference to the git.refs.log module (git/__init__.py has a * import from git.refs). I think it makes good sense that git.refs.log and git.log in GitPython are related to git logs. But that makes it confusing for other modules' log attributes to refer to loggers. Fortunately, those log attributes that refer to loggers are nonpublic; they don't start with _, but each appears in a module that defines __all__ and doesn't list it there. I think it's still confusing, but that they are nonpublic means it may be able to be changed.

So at the same time as making other changes to the loggers, such as removing the NullHandlers, I would like to also rename those attributes to _logger. This is not a breaking change, since any code outside GitPython that uses those attributes has a bug; code using GitPython should be calling logging.getLogger instead, e.g., logging.getLogger(git.objects.commit.__name__). Yet I worry that using those attributes is an easy mistake for code that uses the GitPython library to have made. On the other hand, when people directly add handlers to GitPython loggers, they are most likely adding them to the top-level logger. To do that, they already have to use logging.getLogger, since GitPython's doesn't create its top-level logger (and doesn't need to, as it is created as needed):

>>> import io, logging, git
>>> stream = io.StringIO()
>>> below = logging.getLogger(git.objects.commit.__name__)
>>> above = logging.getLogger(git.__name__)
>>> above.addHandler(logging.StreamHandler(stream))
>>> below.warning("hello, world")
>>> stream.getvalue()
'hello, world\n'

@Byron
Copy link
Member

Byron commented Jan 25, 2024

I may be misunderstanding what you're suggesting, but I think the existing design of having separate loggers for each module that performs logging is good. I believe this is the most common design. Changing it would break, or at least worsen, any code that subscribes to specific per-module loggers--handlers registered by such code would stop receiving messages. But even if breaking changes were okay, I would not recommend changing that aspect of the design.

Then I definitely misunderstood what was said in the initial comment.

This seems reasonable to me, especially if comes in at a minor release bump, such as to 3.2.0. I think it should probably be seen as a feature change. It seems to me that it is a non-breaking change that adds a feature that is not just a bugfix but that produces an observable behavioral change. If the first release that contains the change ends up being the same as the first release that contains the changes from #1791, then I think that further strengthens the case for bumping the minor version number.

I am definitely going to follow your guidance here. Maybe there are some specific notes for changes.rst or the release on GitHub itself that I should include then as well.

Thus, please feel free to go ahead and make the non-breaking (or least-breaking) changes as you see fit.
Thanks again for your help.

@EliahKagan
Copy link
Contributor Author

EliahKagan commented Jan 25, 2024

Then I definitely misunderstood what was said in the initial comment.

Sorry about the confusing initial description; I have just reread it, and I think it was not very clear. To clarify, there is a tree of loggers that corresponds to the tree of Python modules and their submodules, and handlers can be attached to any logger. The root of the tree is the root logger, which is not specific to any Python module; loggers for top-level modules like git are direct children of the root logger.

Logging a message to a logger also logs it to all the loggers in the path from that logger up to the root of the tree (unless a logger turns off propagation, which is uncommon). Loggers generate events, and handlers subscribe to these events; thus, actually displaying a message, writing it to a log file, etc., are done by handlers. Handlers don't suppress propagation. For example, logging to git.repo.base logger would notify handlers attached to the git.repo.base, git.repo, git, and root loggers. The presence of handler lower down does not prevent handlers higher up from being notified.

However, although handlers don't suppress propagation, a logging message that would not be seen by any handler--that is, a message logged to a logger that neither has any handlers nor has any ancestor in the tree with handlers--is treated specially. Such a message goes to logging.lastResort, the handler of last resort. In practice this only happens if logging is not configured, because otherwise there is typically a handler at the root logger.

Before logging.lastResort was added in Python 3.2, such a logging message would not be shown, and would result in a "No handlers could be found for logger" error. (One can bring back this behavior by setting logging.lastResort to None, but this is rarely done.) The "No handlers" error could be suppressed by attaching a handler that could receive the logging message. To do this without showing the logging message, NullHandler could be used. Before Python 3.2, this just suppressed the "No handlers" error, but since then, it instead keeps messages from going to the logger of last resort.

Because a message only goes to logging.lastResort if no handler could see it, NullHandler can be added anywhere that could see the message. A common approach is to add it at the lowest common ancestor of loggers whose messages one wished to be silently ignored when logging is not configured. If GitPython had taken his approach, it would've added NullHandler to the git logger. GitPython instead took the approach of adding NullHandler to each of the leaves--except git.repo.base, where it was omitted, seemingly by accident. The effect of this approach, if no leaves are omitted, is the same, assuming messages are only being logged to the leaves.

For simplicity, the above description completely ignores level, i.e., severity. This is important but doesn't affect propagation, so I don't think it introduces any inaccuracies. But since the default level is WARNING, the practical effect of NullHandler is to prevent messages of level WARNING and higher from being displayed when logging is not configured, since messages of level lower than WARNING are already not displayed when logging is not configured.

All this is under the assumption that changing the handler configuration will not affect other subscribers, they should still receive all log events just like before.

Yes. If a handler is attached to a logger, then what that handler sees is not affected by whether or not, or where, a NullHandler is also attached. (This is a key point, and I had meant to reply to this in my previous message, but I had not.)

I am definitely going to follow your guidance here. Maybe there are some specific notes for changes.rst or the release on GitHub itself that I should include then as well.

I'm not sure what the wording should be at this point, because it will also probably be affected by what is done for #1808. If I don't suggest wording, please feel free to remind me.

If it is only for removing the NullHandlers, then maybe:

NullHandler is no longer added automatically to GitPython's loggers, so if logging is not configured then messages from GitPython, of WARNING and higher severity, will now be displayed.

I feel like the wording can maybe be made simpler, though.

@EliahKagan
Copy link
Contributor Author

EliahKagan commented Jan 26, 2024

Sorry about my hesitation above in #1806 (comment). A short while after posting that comment, I had begun to have doubts, and added a note saying you may want to wait to read it until I checked something. That note is now removed. (This comment is so the change can generate another notification.)

I have checked that I was not confusing the effect of messages reaching logging.lastResort when logging is not configured, with the effect of logging being configured implicitly due to automatic calls to logging.basicConfig. I was fortunately not confusing them; it is only logging functions in the logging module, such as logging.warning, and not the same-named methods on actual logger objects, that ever automatically call logging.basicConfig.

EliahKagan added a commit to EliahKagan/GitPython that referenced this issue Jan 26, 2024
This stops adding `NullHandler` instances to GitPython's loggers.
As noted in gitpython-developers#1806, when they were added in gitpython-developers#300 this prevented
errors when GitPython logged messages and logging was not enabled,
but since Python 3.2 there is a logger of last resort providing a
nicer default behavior of showing the messages. (They are still
shown with better formatting if logging is configured, even if
just done with logging.basicConfig(), so applications should still
typically configure logging.)
@Byron
Copy link
Member

Byron commented Jan 26, 2024

Thanks a lot for elaborating, I believe I have understood how this is working now, and maybe I could have before if I had taken more time to read.

With that said, I am now taking a look at the corresponding PR.

@EliahKagan
Copy link
Contributor Author

Thanks a lot for elaborating, I believe I have understood how this is working now, and maybe I could have before if I had taken more time to read.

If that is true, then I am actually relieved you did not, because my explanations prior to #1806 (comment) were not especially good.

Anyway, as can be seen in #1806 (comment), Python's logging framework is complex and can be confusing (I guess this is true of logging frameworks in general).

@EliahKagan
Copy link
Contributor Author

Changes to logging, including #1813 (for this) and #1815 (for #1808), were included in 3.1.42, which was a patch release. That seems fine to me; it is reasonable to consider these bugfixes rather than feature changes, and #1791 and other improvements that might help justify a minor version bump have not come in yet.

However, I am not sure if you still intended to include information about the logging changes in the release notes and/or changelog (as mentioned in #1806 (comment) and #1806 (comment)). Both are editable, so it can be added, but I am unsure if it should. But I figured I should mention it just in case.

@Byron
Copy link
Member

Byron commented Feb 19, 2024

Apologies, I am so in the habit of creating patch releases, it simply wasn't on my radar. Neither is what should go into the changelog, I am quite oblivious.

This should be improved. Maybe in future, if you would like to see highlights in changes.rst, these could be added along with their PRs. It's something I'd definitely appreciate. For other changes, maybe to the GitHub release page, or a pre-amble that should be added as it's created, a 'next release' issue could be used to collect these notes, along with the designated version number.

With these two changes, I think future release wouldn't have to suffer the same quality issues as the previous one. What do you think?

@EliahKagan
Copy link
Contributor Author

Apologies, I am so in the habit of creating patch releases, it simply wasn't on my radar.

I think a patch release was reasonable anyway, so this is definitely no problem! The conditions these changes were made to address were bugs, and it seems to me the changes made this more like what users may already have been expecting.

Neither is what should go into the changelog, I am quite oblivious.

Should the 3.1.42 changelog entry be expanded for it at this point? If so, I can open a PR for changes to changes.rst and/or otherwise propose wording for the GitHub release's page. I am not sure if this is necessary or not.

Regarding the wording, because I know the important user-facing logging-related changes that made it into 3.1.42, which turned out only to be those from #1813, #1815, and related refinements, I can come up with a reasonable wording now.

However, I may not have been able to do so--or at least thought I could not do so--at the time I was opening those PRs. This is unimportant to the specific issue of the 3.1.42 changelog and release notes, but relevant to the larger release engineering question:

This should be improved. Maybe in future, if you would like to see highlights in changes.rst, these could be added along with their PRs. It's something I'd definitely appreciate.

I think there are three cases:

  1. A change doesn't require explicit mention in the changelog. This is usually the case, in the way GitPython uses its changelog, and given how GitPython is in maintenance mode. Because this is usually the case, I recommend against expanding CONTRIBUTING.md with anything about editing changes.rst.
  2. A change requires explicit mention in the changelog, which is specific to the change itself or otherwise known when the change is proposed. In this case, an unreleased section can be added at the top of changes.rst if not yet present and the information can be added there, then that section can be renamed and augmented--at least by adding the usual link to the GitHub release page--to turn it into the section for the released version. (Note that I am not recommending changes.rst become a "keep-a-changelog"-style changelog in general, only that this aspect of those guidelines seems like it may occasionally be useful for it.) If I understand you correctly, this is effectively the same as what you've suggested for that, except without the version number. I think it is sometimes hard to know what the version number ought to be ahead of time. But if you think it's best to include it then I can certainly do so in these cases.
  3. A change or possible combination of changes requires explicit mention in the changelog, but what should be said is not known at the time they are proposed, because of the way it depends on which of the proposed changes are to be accepted, what refinements are to be made, and what other relevant changes may also end up being included. I would consider what happened here to be a case of this, though perhaps less complex than some that might arrive in the future.

For the first case, nothing special has to be done. For the second case, the PR that proposes a change can propose wording for changes.rst. When the third case arises, maybe it would be sufficient to open an issue about the need to add something to changes.rst?

I've deliberately focused on changes.rst rather than the release notes on the GitHub release page, because changes.rst may be looked up in the commit tagged for the release and because it is present in packages uploaded to PyPI. I think it's still fine to edit changes.rst after a release, but it may be a small quality improvement to avoid having to do so. In contrast, the GitHub release page can be edited at any time with near-equally good results.

@Byron
Copy link
Member

Byron commented Feb 20, 2024

Thanks for sharing your thoughts!

Should the 3.1.42 changelog entry be expanded for it at this point? If so, I can open a PR for changes to changes.rst and/or otherwise propose wording for the GitHub release's page. I am not sure if this is necessary or not.

Since I am also not sure even though I thought you were, this probably means no change should be done.

When the third case arises, maybe it would be sufficient to open an issue about the need to add something to changes.rst?

I could open an issue advertising the new release, maybe CC'ing you on it or other major contributors, and give them some time to chime in with their changelog ideas. These could then make it both to changes.rst and as hand-written portion of the generated changelog on GitHub.
This has the greatest chances of filling in changes.rst before the release happens, which I'd like if new process is introduced.
Regarding the 'next release' section in changes.rst, I am all for it to be added if 2) is the case.

@EliahKagan
Copy link
Contributor Author

I could open an issue advertising the new release, maybe CC'ing you on it or other major contributors, and give them some time to chime in with their changelog ideas. These could then make it both to changes.rst and as hand-written portion of the generated changelog on GitHub.

It seems to me that most of the time there would be no changes that require this, unless you're envisioning adopting detailed changelogs along the lines of what https://keepachangelog.com/ suggests.

I am wondering how this compares to the approach of having prereleases. They seem to have a similar number and complexity of extra steps. Both could be omitted in most cases (if I understand you correctly). One disadvantage is that the prerelease release number is made based on the intended stable release number, so it would incorporate that decision. But maybe that is actually an advantage.

It seems to me that being able to quickly make patch releases without extra coordination and delays in situations where it seems one is merited, including as occurred in #1820 (comment) for 3.1.42, has value.

Since I am also not sure even though I thought you were, this probably means no change should be done.

Yes, I am not sure anything is really needed in this case. In my earlier thinking when I had suggested it, I am not sure I had adequately considered that the new behavior may be closer to what most users expect to happen. More importantly, some changes that I had been thinking about, such as #1791, were not part of that release.

Regarding the 'next release' section in changes.rst, I am all for it to be added if 2) is the case.

I will make sure to include that on the occasion that it seems useful, and also to mention on the rare occasion that it seems like it might be useful but I am not sure what should go there.

@EliahKagan
Copy link
Contributor Author

EliahKagan commented Feb 20, 2024

One other thought: Suppose the list of pull requests on GitHub release pages (which changes.rst links to) were divided into two lists, first the list of changes that affect the git module, and then the list of changes that affect only tests or CI checks.

Then the changes that people are most interested in, in terms of different behavior, would be immediately apparent. That might make cases where changes.rst and the GitHub releases page need extra descriptive text less common, and may also make it clearer that this is the case when it is.

I believe that can even be automated (I think GitHub supports a template for releases), though it might not be feasible to automate figuring out when a change to pyproject.toml makes a PR go in the first group.

Is this valuable? If it is, how much of the goals discussed here would this achieve? I am curious if this could either make a new process easier, or even to some extent avoid the need for a new process.

@Byron
Copy link
Member

Byron commented Feb 21, 2024

That's a fantastic idea, as it does indeed seem to remove the need for a new process entirely.

Since changes to the git module are rare, what would be enough for me is to prefix such PRs with an identifier. Then, upon generation of the release page, I can sort the list entries to their respective position. If a custom template was used, these top-level headlines, e.g. git and non-git, could be present already.

This would also mean that changes.rst will be left as is and the GitHub release page is going to have the most valuable release information. But if PRs that affect the git module also change changes.rst, this could probably be compensated for.

That, maybe with adjustments where needed, sounds like a viable plan for improving this.

Edit: I accidentally skipped over the comment above, but also think it's superseded by the comment below it, also in terms of how feasible/simple it is to put into action.

@EliahKagan
Copy link
Contributor Author

to prefix such PRs with an identifier

Although I have not done this in my new PRs (#1838 and #1839), I'd be pleased to edit their titles on request (or you can retitle them if you prefer).

The reason I haven't done it is not out of any opposition, but rather that I don't know what convention should be used, and also that I am inclined to consider this protocol as still in the process of being formed.

I believe a PR's title at the time a GitHub release is drafted is what gets placed automatically in the text of the release, rather than the title at the time the PR was merged. So I think this can be applied late to a PR, even after it has been merged.

I'll try to reply here more substantively, but I wanted to make note that the absence of a prefix on my recent PRs is not intended to express any view about what is being discussed here (and that I certainly have no objection to them ending up retitled).

@Byron
Copy link
Member

Byron commented Feb 23, 2024

Perfect, thanks for letting me know.

I wanted to try something visual, so I prefixed and suffixed #1838 with 📣 to indicate they should be amplified later.
Now I just have to remember this one thing when creating the next release: "look for the (in my case: 'golden') megaphones".

I am happy to rename it after merging if from the PR text I can understand your intentions, or from any other source of information really. So I will do the same to #1839 soon.

@EliahKagan
Copy link
Contributor Author

EliahKagan commented Feb 29, 2024

If I understand you correctly, you're willing to add these where you deem appropriate and there is no need for me to do so at this time. However, please let me know if that is not the case, or if you have other updated guidance for titling PRs.

(I am not saying this to advocate that #1850 be marked in this way; I don't know if that should happen or not; the actual behavioral changes there, as noted in the PR description, are very slight.)

As mentioned in #1806 (comment), I do still plan to respond more substantively, at least regarding ways to integrate this with automation that occurs in drafting release notes for GitHub releases. This comment isn't that more substantive reply, which is still forthcoming. 😄

@Byron
Copy link
Member

Byron commented Mar 1, 2024

If I understand you correctly, you're willing to add these where you deem appropriate and there is no need for me to do so at this time. However, please let me know if that is not the case, or if you have other updated guidance for titling PRs.

You can let me know in the PR description if you think it should be highlighted in the release. Otherwise it's left to my judgement which probably should be avoided. #1850 for instance, I just marked because I think better API docs are visible enough to be highlighted, but once I know you will always tell me about your choice, then I would certainly stop marking these on my own. It's definitely a workflow that is still 'in development' :D.

@EliahKagan
Copy link
Contributor Author

I considered suggesting in the description of #1859 that it may deserve to be marked in this way, but I am not sure and it may depend on what other changes have to be made (as detailed there, at least one part of it seems likely to need further revision). I mention this here, since this is where we've been discussing the question of what causes a PR to receive this designation and how to coordinate it. However, if you would prefer I mention this in PR descriptions for PRs where I am not sure as well, then I can do so in the future.

@Byron
Copy link
Member

Byron commented Mar 5, 2024

However, if you would prefer I mention this in PR descriptions for PRs where I am not sure as well, then I can do so in the future.

I would interpret 'not sure' as 'maybe', and since it's not explicitly 'no' it seems fair that I can upgrade such PRs to be highlighted on my own account. This is how I would do it these days, and we can see how this works if it's OK with you.

@EliahKagan
Copy link
Contributor Author

These are my more substantive thoughts, though I fear they may not be all that useful, especially since I did not end up finding ways of automating this easily that I had hoped existed.

As noted in #1859 (comment), that PR has a status similar to, though less significant than, this PR here that gave rise to the discussion about keeping track of information for inclusion in release notes. Specifically, there are some people who use GitPython who may be affected in such a way that they may want to be informed of the change.

I have been thinking of this as the goal of highlighting, which is why I have been rather unsure of which pull requests should be highlighted. For example, #1850, though it is a significant and worthwhile change that should make the documentation a bit nicer and the code a bit more convenient to work with, does not need to be highlighted under this standard. Likewise, #1862 may not need highlighting either.

Then again, highlighting might serve the more general purpose to identify pull requests that (a) are of general interest and deserve to be identified, or (b) you want to be reminded of when drafting a release so you can decide if you want to say something about them. This seems completely reasonable to me.

I am unsure, however, if it will be sufficient to identify pull requests where users/developers could be inconvenienced when upgrading without being aware of them. I wonder if the use of labels would be helpful, with two or more separate labels, to identify pull requests that, while non-breaking, affect users/developers, or downstream maintainers/packagers. I worry more, though, that this is overcomplicating things. That a PR is likely to be worth giving special mention can often be figured out during or soon after it is reviewed and merged, while the more specific impact will only sometimes be clear before the effect of it being integrated with other changes is examined.

I had vaguely heard something about how labels can be used to automate the way GitHub release notes are drafted. Unfortunately, unless you want to actually omit some pull requests from being mentioned, it doesn't look like that will help. I was able to find something that suggests a feature to group by label is available for Enterprise users--if you'd like to see that then I can probably find it again--but it looks like that's not supported for releases in a repository like this one. Of course, it is possible to code up an action or bot that performs arbitrary operations based on labels and other information, but that might not be worthwhile.

Finally, it might be useful to either enforce or add a prefix style like conventional commits for pull requests. I am willing to follow such a thing, but really I mention it mostly for completeness and not to advocate it. I tend to be skeptical of this style, outside of cases where it is actively used to near-fully automate something important that would otherwise be error-prone or laborious (gitoxide seems like a case where I would like it, because it is used to get right what is released based on pushed commits, at least if I understand correctly how it is being used).

@Byron Byron changed the title NullHandler added to all but git.repo.base logger 📣NullHandler added to all but git.repo.base logger📣 Mar 15, 2024
@Byron
Copy link
Member

Byron commented Mar 15, 2024

Thanks for sharing your thoughts on this topic.

First off, I removed the highlighting from #1850 and #1862 and added it here and agree that the reasoning for what to highlight should be based on how much users are affected. I will avoid adding these highlights in future and encourage you to do so when you see fit.

Indeed I'd not want to make matters more complex or even try to automate anything - for now it seems sufficient to use this simple system to improve the communication to users, and see how that goes first.

Indeed, gitoxide uses conventional commits for the purpose of determining the version of a crate based on the previous release, and to know which commit messages should be put into the changelog of the respective crate. This automates changelog generation and assures breaking changes are never accidentally published as non-breaking, which would break the previous release, sometimes in a way that isn't fixable other than yanking the new release entirely. From my experience, this still happens once or twice per year due to human error. However, I am incredibly thankful for cargo smart-release as without it, handling the publishing process of gitoxide would be impossible to get right. Just yesterday I published 53 crates automatically, and it just worked (even though it took a while to get there, the tool needed a lot of work).

@EliahKagan
Copy link
Contributor Author

EliahKagan commented Mar 15, 2024

I've found that GitHub doesn't seem to be able to search for 📣. Whether quoted or not, it never shows any pull requests (even when properly allowing it to search for closed pull requests). Maybe I'm doing something wrong, though, or maybe it relates to a temporary problem. I'll try again later and report back. If this is a long-standing limitation, then it might be a weakness of this notation, which I think is otherwise good because it does not clash with what people would write in pull request titles for other purposes.

Regarding the recent title edit here, should this issue really itself be highlighted? Everything else highlighted has been a pull request, as far as I can tell (and the pull request that fixed this issue, #1813, already came in as part of the 3.1.42 release). Unlike milestones, which contained both issues and pull requests, the GitHub release notes list only pull requests, and not issues.

(#1813 is an example of what could have received such highlighting had this been in use at that time, but my intention was not to advocate that it or this associated issue be highlighted at this point.)

Just yesterday I published 53 crates automatically, and it just worked

That rocks!

@Byron Byron changed the title 📣NullHandler added to all but git.repo.base logger📣 NullHandler added to all but git.repo.base logger Mar 15, 2024
@Byron
Copy link
Member

Byron commented Mar 15, 2024

It's true, nothing can be found when searching for 📣 or even "megaphone".

Sorry for the mindless edit of this issue, I am aware these aren't part of the generated GitHub changelog, but wasn't getting that this is not a PR 🤦‍♂️.

Maybe just labelling PRs would be a better fit. Then it's not obvious at a glance which one should be moved, but I can hover over each of them to see the label. This might save time overall as I won't have to remove the megaphones on both sides of the title.

Screenshot 2024-03-15 at 19 06 35

If you are also in favor of using a label, I will create one and adjust the markings on the 'megaphone' PRs accordingly.

@EliahKagan
Copy link
Contributor Author

Using a label sounds fine to me. What should it be called?

@Byron
Copy link
Member

Byron commented Mar 19, 2024

I called it 'highlight-in-changelog' and already applied it to the three PRs that there were.

@EliahKagan
Copy link
Contributor Author

I've noticed that the way GitHub presents the release in feeds may have the effect of exaggerating the significance of highlighted changes. For the recently released 3.1.43:

Screenshot of the GitPython 3.1.43 release as it appears in a user's GitHub feed, where due to only the beginning being shown, Particularly Important Changes appears at first glance to be a description of the release itself rather than a category of changes within it

Because of what is shown, it's possible for a cursory glance--of the kind common in looking at feeds--to give one the impression that "Particularly Important Changes" is a characterization of the release as a whole, with the items listed under it as being central to the release, even though "Particularly Important Changes" is really a subheading.

I don't know if there is a way to improve on this, but maybe "Particularly Important Changes" can be phrased differently, or phrased the same but the brief description under it phrased differently. Related to this is that they are not necessarily the most important changes for most users, but rather they are the ones that may require attention from a small handful of users who may have inadvertently depended on bugs. or otherwise be surprised by their absence or--in the case of downstream maintainers, sometimes, though I think not in this release--need to do something based on their absence.

In the case of 3.1.43, I think #1832 and #1862 are likely to effect more users in a directly noticeable way. The effect will be purely positive, though, so users don't need to be informed of them to prepare for their effects. (I don't mean that they need to be highlighted or otherwise treated specially; rather, I just mean this to illustrate the distinction between what is most likely to affect users and what some users may most benefit from being informed of when upgrading.)

@Byron
Copy link
Member

Byron commented Apr 2, 2024

Thanks for pointing this out! I wasn't aware how releases look in feeds, or even that they appear there in some form.

Maybe naming the section Most Notable Changes (possibly without 'Most') could help, without saying anything else underneath like was the case here. That way it's left open what a change means, without implying anything important that needs to be acted on. People would be left to read the bulletpoints and determine on their own what they mean in their context.

I will try to remember to use a different heading next time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants