-
-
Notifications
You must be signed in to change notification settings - Fork 906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Override bash executable, defaulting to Git for Windows git bash over WSL bash #1791
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for tackling this and for explaining the reasoning so very well!
@EliahKagan already prepared me for this case and I hope he can take a look at this PR as well to see if it fits into the big picture.
I will try to look at this in detail today or tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for bearing with my delay in reviewing this, and I apologize for any inconvenience. I strongly agree with the idea here, and I think this pull request will be suitable for merging, after some changes.
As you say, running hook scripts in WSL is not a good default, and it often fails or causes incorrect behavior. I think the problems with the way GitPython currently does it are severe enough to justify changing the default in a bugfix release. In my opinion, using WSL automatically even when a Git Bash shell is available is a bug, is usually not what users intend, and should be changed even though doing so may break some existing code that relies on the current behavior. (In #1745 and #1777 I had expressed hope that this could somehow be fixed in a way that does not break backward compatibility at all, for anyone, but I realize that is not realistic.) The ability to restore the old behavior by setting GIT_PYTHON_BASH_EXECUTABLE
to bash.exe
should be clearly documented.
The case for using the Git for Windows bash.exe
when available is actually even stronger than it seems. Unlike on other systems, path search on Windows differs substantially depending on whether a shell is used. That is, running a program in cmd.exe
resolves commands differently from a direct call to CreateProcessW
. One effect is that Popen
(and subprocess.run
, etc.) on Windows without shell=True
searches a few directories prior to those listed in the PATH
environment variable. One such directory is System32
, even if it is not in PATH
or no matter what is listed before it. So even when a "native" bash.exe
is in a PATH
directory listed before System32
, the WSL bash.exe
in System32
is usually what Popen
runs. (I say "usually" as there are other subtleties.) Furthermore, the WSL bash.exe
often exists in System32
even if no user has intentionally set up WSL and even if no WSL distributions are installed that would let it succeed.
Using the Git for Windows bash.exe
whenever it is available would solve those problems, as well as the ones you described, and the further problem that using the WSL bash
can hurt performance as it may need to start a WSL system. This is also much closer to the behavior that users are likely to expect, because Git for Windows itself runs hooks with a #!/bin/sh
or #!/bin/bash
shebang (and some others) with its bash.exe
.
I also agree with much of the strategy this takes. However, there are some changes that should be made, to avoid introducing new untrusted search path vulnerabilities, similar to CVE-2023-40590 or CVE-2024-22190, that result in arbitrary code execution; to cover more of the common ways Git for Windows may be installed and run, while also avoiding using a bash.exe
associated with a separate git
from the one GitPython will use; and to ensure no code that uses GitPython without using hooks can break as a result of changes made here. That is all covered among my comments below. There are also two considerations not covered in review comments:
A fix for CVE-2024-22190 (#1792) was recently merged. That was after this pull request was initially opened, so this will have to be updated to get that bugfix, and to help prevent that security bug from being inadvertently reintroduced. This is also needed to resolve conflicts, which fortunately are very minor, and to get the changes from #1803 without which subsequent CI runs would fail (#1802). Although the guidelines don't express a preference on how PRs should be updated, based on #1659 (comment) and how minor the conflicts are, I recommend rebasing (maybe @Byron can speak to this more definitively). Updating this PR won't automatically fix new untrusted search path vulnerabilities the current code in this PR would add, but I've left review comments on the specific affected code about how that can be done.
The tests should be modified to ensure these changes work as expected. I am unsure if any new tests are needed. The main thing I'd suggest doing is to remove all WSL-related xfail
mark decorators from test functions in test/test_index.py
, and remove the "Setup WSL" step from the pythonpackage.yml
workflow. Currently, the WSL bash.exe
is being used on CI, even with these changes. This is because, as written, the changes here make assumptions Git for Windows does not always satisfy, including as it is used on the Windows runners for GitHub Actions. As commented below, that can be fixed. I suggest removing marks and the "Setup WSL" step, so all hook tests fail on CI due to the WSL bash.exe
having no distribution to run bash
in (see 9717b8d), before fixing the bug in how the Git for Windows bash.exe
is found, and verifying that this makes the tests pass.
Thanks for all the comments! Been busy with the day job the last week or so, but planning to try to address these things this weekend so we can get another round of reviews going and get it merged. |
I believe I have addressed all the comments. I removed the xfails from the unit tests, and had assertion errors in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the revisions! I agree that this addresses everything from the first review, and I think this is very close to being ready to merge.
I think the the only tricky remaining issue is what to do in test_uses_shell_not_from_cwd
. I believe that one of the reasonable approaches is to leave it as it is, possibly removing the commented-out code altogether, and possibly adding a comment about future directions. But I'm also aware of two other reasonable approaches. I've left a review comment there that covers all three, and that also tries to answer the question of what that test is testing. Further information and context on what that test is checking for can be found in the "When hook scripts are run" subsection of CVE-2024-22190.
Aside from that, I recommend deleting the code that this PR currently comments out. You may have been intending to do that anyway, but I've posted review comments where applicable for it. It should not be necessary to install a WSL distribution on CI anymore, due to the improvements you've made in this PR. Furthermore, at least outside of tests related to security or the ability to signal a failed hook run, I don't think there is any need to cover WSL in the tests anymore (thus none of the WSL-related xfail decorations should be needed). If it turns out some such code will be needed again, it can be found in the commit history and restored as needed, with whatever modifications end up being required for it.
I've also noticed and included review comments about a few other small things.
# - name: Set up WSL (Windows) | ||
# if: startsWith(matrix.os, 'windows') | ||
# uses: Vampire/[email protected] | ||
# with: | ||
# distribution: Debian |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the improvement from this PR, we shouldn't need to install a WSL distribution on CI anymore, so this commented-out code can be removed altogether. It will still be in the history and can be brought back if needed in the future. (I'm guessing you may have been planning to do this anyway, but I figured I'd mention it where applicable.)
@@ -360,9 +361,107 @@ def __setstate__(self, d: Dict[str, Any]) -> None: | |||
the top level ``__init__``. | |||
""" | |||
|
|||
_bash_exec_env_var = "GIT_PYTHON_BASH_EXECUTABLE" | |||
|
|||
bash_exec_name = "bash" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems not to be used. I'm not sure it's needed. If it is kept, then _get_default_bash_path
should probably return it on non-Windows systems and possibly return f"{cla.bash_exec_name}.exe"
as the fallback on Windows systems. Although I believe it was for symmetry with git_exec_name
, the treatment of git
and bash
has become less similar in recent changes, so I'm not sure there's a need for this anymore.
|
||
GIT_PYTHON_BASH_EXECUTABLE = None | ||
""" | ||
Provides the path to the bash executable used for commit hooks. This is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend adding "on Windows" to the first sentence, both here and in the refresh_bash
docstring. It is set on other systems, and future software that uses GitPython may itself use it directly on other systems (though that would be unusual), but GitPython itself is only using it on Windows.
Refreshes the cached path to the bash executable used for executing | ||
commit hook scripts. This gets called by the top-level `refresh()` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above for GIT_PYTHON_BASH_EXECUTABLE
, I suggest adding "on Windows" to the first sentence, since the path this sets is not used by GitPython on other systems. Otherwise, this is saying that hook scripts are executed with this bash interpreter on all systems, which is not accurate, and the documentation of the behavior on non-Windows systems lower down in this same docstring--which is itself not a problem--will reinforce that wrong information.
this method may be invoked manually if the path changes after import. | ||
|
||
This method only checks one path for a valid bash executable at a time, | ||
using the first non-empty path provided in the following priority |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path
argument seems to be used if not None
, even if it is the empty string. Although people shouldn't do that, the behavior seems correct, since passing a string as path
should cause that to be used.
This description could be changed to just say "the first path provided..." because it is reasonable to interpret an empty environment variable as not providing a path, which is what I would suggest. But if you feel that's not accurate enough, then it could be expanded to be more specific, or something about the environment variable's value being nonempty could be added in item 2 of the numbered list.
# if type(_win_bash_status) in {WinBashStatus.Absent, WinBashStatus.WslNoDistro}: | ||
# # The real shell can't run, but the impostor should still not be used. | ||
# with self.assertRaises(HookExecutionError): | ||
# with maybe_chdir: | ||
# run_commit_hook("polyglot", repo.index) | ||
# self.assertFalse(payload.exists()) | ||
# else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the best thing is to do here right now. Some of the modifications to the test suite to take advantage of the improvements in this pull request could be done later.
The purpose of this test method is to verify that an untrusted repository (or, more realistically, an untrusted branch, possibly from a fork) with a malicious bash.exe
cannot cause that bash.exe
to run, which was part of CVE-2024-22190. There are two general scenarios that should be tested: when there is another bash.exe
that should run hooks, and when there isn't another one. In both cases, the impostor bash.exe
should not run.
It is a shortcoming of #1792, which fixed CVE-2024-22190 and added this test, that both scenarios would not be tested in a single run of the test suite. This would have been tricky to do prior to the changes in this pull request, but I probably could and maybe should have done it. Instead, this tested whichever of the two scenarios applied to the test system. Both scenarios were common prior to the changes in this pull request, which makes the broken-or-unavailable bash
scenario uncommon, though still possible.
Possible approaches:
- Assume a working
bash.exe
is available for the test. That's what this PR currently does, by commenting out the code for the broken-or-unavailablebash
scenario. That code could even be removed, like other commented-out code, since it is in the repository's history. It could be replaced with a comment “TODO: Test that an impostor bash doesn't run in a realistic case of the "real" bash being broken or absent.” (or maybe there is a better wording). - Modify
WinBashStatus.check
to work for checking the status of thebash
interpreter that GitPython uses now. One way is to have it take an argument for thebash
command and run that instead of hard-codingbash.exe
. That would be passed in the call used to bind_win_bash_status
. Then the existing check would work properly again. (It would even work in xfail conditions, but I think there is no good reason to cover it in those tests.) Further improvements could come later, including eventual removal of theWinBashStatus
class. This is the approach that I would lean toward taking at this time. - Produce both scenarios as separate test cases. This is ideal, but also seems like it may be outside this PR's scope. It may also be a bit tricky to do well, since the goal is for it to fail in a realistic regression of CVE-2024-22190. If such a bug were inadvertently introduced in the future, the exploit would probably be a
bash.exe
impostor in an untrusted repository/branch, and it would probably affect only the situation where the"bash.exe"
fallback is used forGIT_PYTHON_BASH_EXECUTABLE
. This can be hard to test reliably because Windows searches in some directories before usingPATH
, such asSystem32
. However, the changes in this PR allow this to be tested a different way, by temporarily settingGIT_PYTHON_BASH_EXECUTABLE
to a command name that is very unlikely to exist already, such asbash-7de052ca-c9e6-4e4c-abe8-4d168ecd74c6.exe
, and naming the impostor that as well. This could be done without excessive code duplication by expanding the way the test is parametrized, or by splitting it into two separate methods that call a shared helper (or use a shared fixture).
I think any of those approaches, and probably various others I have not thought of, are reasonable.
# @pytest.mark.xfail( | ||
# type(_win_bash_status) is WinBashStatus.Absent, | ||
# reason="Can't run a hook on Windows without bash.exe.", | ||
# rasies=HookExecutionError, | ||
# ) | ||
# @pytest.mark.xfail( | ||
# type(_win_bash_status) is WinBashStatus.WslNoDistro, | ||
# reason="Currently uses the bash.exe of WSL, even with no WSL distro installed", | ||
# raises=HookExecutionError, | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As on test_run_commit_hook
, this commented-out code can be removed.
# @pytest.mark.xfail( | ||
# type(_win_bash_status) is WinBashStatus.WslNoDistro, | ||
# reason="Currently uses the bash.exe of WSL, even with no WSL distro installed", | ||
# raises=AssertionError, | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As on test_run_commit_hook
and test_pre_commit_hook_success
, this commented-out code can be removed.
The situation with the if type(_win_bash_status) is WinBashStatus.Absent:
check in the method body is more subtle, because it would be useful to continue testing that we get a HookExecutionError
, with the expected attributes, when bash
is absent.
This is roughly analogous to the situation in test_hook_uses_shell_not_from_cwd
. But the stakes are much lower here, because this does not test for a security vulnerability. Also, this is passing on CI, because the WinBashStatus.Absent
case is fairly rare and not produced there: WinBashStatus.check
finds bash.exe
in System32
and concludes bash
is present, then the Git Bash bash.exe
is used due to the changes in this PR.
The possible approaches to test_hook_uses_shell_not_from_cwd
are also possible here, but because this test is not failing, there is the further option of just commenting to note that the check is not always accurate. As there, I think further improvements to the cases this covers should eventually be made, and could be made here, but that their absence should not get in the way of this PR being merged.
# @pytest.mark.xfail( | ||
# type(_win_bash_status) is WinBashStatus.Absent, | ||
# reason="Can't run a hook on Windows without bash.exe.", | ||
# rasies=HookExecutionError, | ||
# ) | ||
# @pytest.mark.xfail( | ||
# type(_win_bash_status) is WinBashStatus.Wsl, | ||
# reason="Specifically seems to fail on WSL bash (in spite of #1399)", | ||
# raises=AssertionError, | ||
# ) | ||
# @pytest.mark.xfail( | ||
# type(_win_bash_status) is WinBashStatus.WslNoDistro, | ||
# reason="Currently uses the bash.exe of WSL, even with no WSL distro installed", | ||
# raises=HookExecutionError, | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As on test_run_commit_hook
, test_pre_commit_hook_success
, and test_pre_commit_hook_fail
, this commented-out code can be removed.
I am not concerned about losing mention of the "Specifically seems to fail on WSL bash (in spite of #1399)" situation here, and I don't think even that commented-out decoration needs to be kept as comments here. After the improvements of this PR, I think most use of the WSL bash for hooks will be for people whose workflows were already working and who need it for backward compatibility. The solution to this problem, for people who experience it, is just to use the new default behavior.
If the #1399-related failure with WSL is considered important to keep track of, then a new issue can be opened about it (I would be pleased to do this on request), or a comment could briefly mention it, but it seems to me that neither is necessary at this time.
# @pytest.mark.xfail( | ||
# type(_win_bash_status) is WinBashStatus.WslNoDistro, | ||
# reason="Currently uses the bash.exe of WSL, even with no WSL distro installed", | ||
# raises=AssertionError, | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As on test_run_commit_hook
, test_pre_commit_hook_success
, test_pre_commit_hook_fail
, and test_commit_msg_hook_success
, this commented-out code can be removed.
+ Test a successful refresh with a relative path, which will be safer to do once the refresh tests restore changed state. This is incomplete, because while it is probably not necessary while running the test suite to preserve an old False value of git.GIT_OK (most tests can't work if that happens anyway), the value of Git.GIT_PYTHON_GIT_EXECUTABLE is not the only other global state that effects the behavior of subsequently run tests and that may be changed as a result of the refresh tests. 1. After the git.refresh function calls git.cmd.Git.refresh, it calls git.remote.FetchInfo.refresh, which rebinds the git.remote.FetchInfo._flag_map attribute. 2. Future changes to git.cmd.Git.refresh may mutate other state in the Git class, and ideally the coupling would be loose enough that the refresh tests wouldn't have to be updated for that if the behavior being tested does not change. 3. Future changes to git.refresh may perform other refreshing actions, and ideally it would be easy (and obvious) what has to be done to patch it back. In particular, it will likely call another Git method that mutates class-wide state due to gitpython-developers#1791, and for such state that is also of the Git class, ideally no further changes would have to be made to the code that restores state after the refresh tests. If we assume git.refresh is working at least in the case that it is called with no arguments, then the cleanup can just be a call to git.refresh(). Otherwise, sufficiently general cleanup may be more complicated.
+ Test a successful refresh with a relative path, which will be safer to do once the refresh tests restore changed state. See gitpython-developers#1811. This addresses it incompletely, because while it is probably not necessary while running the test suite to preserve an old False value of git.GIT_OK (most tests can't work if that happens anyway), the value of Git.GIT_PYTHON_GIT_EXECUTABLE is not the only other global state that effects the behavior of subsequently run tests and that may be changed as a result of the refresh tests. 1. After the git.refresh function calls git.cmd.Git.refresh, it calls git.remote.FetchInfo.refresh, which rebinds the git.remote.FetchInfo._flag_map attribute. 2. Future changes to git.cmd.Git.refresh may mutate other state in the Git class, and ideally the coupling would be loose enough that the refresh tests wouldn't have to be updated for that if the behavior being tested does not change. 3. Future changes to git.refresh may perform other refreshing actions, and ideally it would be easy (and obvious) what has to be done to patch it back. In particular, it will likely call another Git method that mutates class-wide state due to gitpython-developers#1791, and for such state that is also of the Git class, ideally no further changes would have to be made to the code that restores state after the refresh tests. If we assume git.refresh is working at least in the case that it is called with no arguments, then the cleanup can just be a call to git.refresh(). Otherwise, sufficiently general cleanup may be more complicated.
This comment was marked as outdated.
This comment was marked as outdated.
Nudge |
I believe this PR is waiting for the author, there is a lot of pending review comments and conflicts by now. |
Provides a mechanism to set a path to bash to use on Windows for executing commit hook commands. If none is set, the default behavior should be to prefer Git Bash from Git for Windows.
If a user is running in Windows, they probably are using Git for Windows, which includes Git Bash and should be associated with the configured Git command set in
refresh()
. Search first for Git Bash relative to the configured Git command, and, if it exists, use it. Otherwise, fail back to the hardcoded default of"bash.exe"
.The original "default" bash without a path assumes that bash.exe is on the PATH. If a user chose not to make MSYS2 tools available on the PATH on Git for Windows install, another method must be used to find that bash if it should be preferred over one found on the PATH. Newer versions of Windows include a bash.exe in System32 that is then found by default, but this bash.exe is intended for use with WSL. If a user intended to invoke git in WSL, presumably this package would have been installed there and not registered as "Windows" and reduces the ambiguity. In most cases where running the python in "native" Windows, then, it is undesirable to invoke the WSL Bash that is not associated with the Git for Windows install.
Furthermore, direct access to bash.exe in System32 is reportedly deprecated in lieu of commands that explicitly invoke WSL.
This addresses issues where git hooks are intended to run assuming the "native" Windows environment as seen by git.exe rather than inside the git sandbox of WSL, which is likely configured independently of the Windows Git as well. A noteworthy example are repos with Git LFS, where Git LFS may be installed in Windows but not in WSL, causing commit hooks for LFS to fail despite being installed alongside the configured Git command in this package.