-
-
Notifications
You must be signed in to change notification settings - Fork 302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tests on Windows must usually be run in Git Bash or a similar environment #1359
Comments
Thanks for the detailed report! It appears that |
This runs `cargo nextest ...` and `cargo test --doc` in separate steps in the `test-fast` job, so that the job fails when `cargo nextest ...` fails. Otherwise, with `pwsh` on Windows, test failures (other than doctests) are masked. Background: Since 89a0567 (Byron#1556), doctests are run on all three major platforms, and not only on the full test job done on Ubuntu. But the way this was done relied on a script failing as soon as (or, at least, whenever) any command in the script failed. That works on Ubuntu and macOS, where a `bash` shell is used by default, with `-e` passed. But on Windows, GitHub Actions uses `pwsh` as the default shell. `pwsh` is not run in a way that causes it to stop at the first failing command. So, on Windows, when the `cargo nextest` command failed but the `cargo test --doc` command that followed it in the same script step passed, the step passed, thus allowing the job and workflow to pass. This was observed in Byron#1429 after a rebase (see comments). Note that this is not related to the increased use of `nextest`. While that was also done in Byron#1556, it did not affect the `test-fast` job where the bug was introduced, which was already using `nextest`. This fixes the problem by putting the two commands in separate steps. This is simpler than doing anything in PowerShell to make the script stop, such as using `&&` or attempting to produce `-e`-like behavior. Another option could be to use `bash` as the shell, which is a Git Bash environment suitable for running the tests. The reason I didn't do that is that I think it is valuable to see the results when the tests are run from a PowerShell environment. In particular, continuing to use PowerShell here should help in investigating Byron#1359 (and shows that the claim I made is overly strong, since CI on Windows with `pwsh` not itself started from a Unix-style shell is not "Git Bash or a similar environment").
This comment was marked as off-topic.
This comment was marked as off-topic.
As noted in #1559, my claim in this issue is apparently overstated, since tests are able to run in PowerShell on Windows GHA runners, and a |
The reason this works on CI as noted in #1359 (comment), while not working locally, is that That the current implementation is ever able to use Git Bash on Windows, even when the tests are run from a Git Bash shell, also depends inadvertently on undocumented behavior of How the tests failConsider the following output, which is typical of the test failures:
There are three things going on here:
Why is it even running
|
let output = match configure_command(&mut cmd, &args, &script_result_directory).output() { |
On Windows, that always fails, because files that do not match a supported binary executable format must be batch files with a .cmd
or .bat
extension for std::process::Command
to run them automatically with an interpreter (which Command
supports because CreateProcessW
supports it). The fixture scripts here are bash
scripts and they have a .sh
suffix.
Check if the error was due to the inability to execute the file:
gitoxide/tests/tools/src/lib.rs
Lines 555 to 556 in 1cfe577
Err(err) | |
if err.kind() == std::io::ErrorKind::PermissionDenied || err.raw_os_error() == Some(193) /* windows */ => |
Windows error code 193 means this is a "not a valid Win32 application" error, so we do effectively detect that the script was not able to be executed.
Try again by running bash
, with the script path and the arguments for the script as its arguments:
gitoxide/tests/tools/src/lib.rs
Lines 558 to 559 in 1cfe577
cmd = std::process::Command::new("bash"); | |
configure_command(cmd.arg(script_absolute_path), &args, &script_result_directory).output()? |
What is bash
on Windows?
Today, there are two typical cases for a bash
shell on Windows:
-
The port of
bash
to Windows that is provided by MSYS2 and similar environments, including Git Bash, tolerates backslashes used as directory separators. -
The
bash
associated with WSL does not, since it delegates tobash
in an installed distribution for WSL, which either fails because there is no distribution or the distribution does not havebash
, or succeeds but runs that distribution's nativebash
executable that has no custom behavior for compatibility with Windows.In addition, even if the paths were normalized so a WSL
bash
would run the fixture scripts, we should expect the scripts, if run that way, to fail to produce correct test data. This is because they would be run inside a WSL distribution, but these are the scripts that have to be run because their generated archives are in.gitignore
, which should only be done if the scripts do not currently produce archives that work when used on another system, including when generated on a GNU/Linux system and used on Windows.So any fixture for which this would work reliably should just be omitted from all
.gitignore
files and have its generated archive committed. Fixing the paths is thus not a worthwhile solution, at least not if done by itself.
Of those, which bash
is chosen?
In practice, there seem to be three cases for how bash
is resolved on Windows:
- In an MSYS2-like environment including Git Bash, the
PATH
will almost always be set in such a way that this environment'sbash
precedes any otherbash
. - Outside such an environment, the
PATH
will usually be set so that thebash
associated with WSL, usually atC:\Windows\System32\bash.exe
, precedes any otherbash
. - Outside such an environment but on a GitHub Actions runner, the
PATH
is set so thatbash
is Git Bash.
So when the the tests are not run from Git Bash or a similar environment, they will usually find the wrong bash
, the major exception being when run on CI.
Here's a simplified demonstration on my machine, in PowerShell (pwsh
), where bash
is the WSL-associated bash
, and the scoop
-installed bash
is Git Bash (though the effect is in no way scoop
-specific):
C:\Users\ek\tmp> (Get-Command bash).path
C:\Windows\system32\bash.exe
C:\Users\ek\tmp> Get-Content 'a\b'
echo 'Hello, world!'
C:\Users\ek\tmp> bash 'a/b'
Hello, world!
C:\Users\ek\tmp> bash 'a\b'
/bin/bash: ab: No such file or directory
C:\Users\ek\tmp> C:\Users\ek\scoop\shims\bash.exe 'a/b'
Hello, world!
C:\Users\ek\tmp> C:\Users\ek\scoop\shims\bash.exe 'a\b'
Hello, world!
But that's far from the whole story
To run Git Bash, why is it sufficient for the directory that contains the Git Bash bash.exe
to precede the System32
directory (that has the WSL-associated bash.exe
) in PATH
? Why does it search in PATH
before other locations?
The above example feels convincing, but it shouldn't. I ran those commands in PowerShell, which searches PATH
first. On most OSes, a path search is a PATH
search, but that is not generally the case on Windows, nor is this really an accurate characterization of what std::process::Command
is doing.
- Other than shells, which are expected to perform their own path search with different rules from
CreateProcessW
, most programs on Windows do not do path search themselves and instead letCreateProcessW
do it. Most libraries and frameworks that offer process creation facilities delegate fully, including for path search, toCreateProcessW
. ButCreateProcessW
searches various locations beforePATH
, regardless of whether or where those locations are also listed inPATH
. These prioritized locations include theSystem32
directory, which usually contains the WSL-associatedbash.exe
. - As noted in #1432 (comment), for security, the Rust standard library performs its own
PATH
search whenstd::process::Command
is used. It passes the resolved absolute path toCreateProcessW
, rather than usingCreateProcessW
for the search. This is mainly to avoid searching in the current directory. It was implemented in rust-lang/rust#87704 and fixes the main concern of rust-lang/rust#87945. - But that is not intended to, and usually does not, prevent the
System32
directory from taking precedence overPATH
directories. As reported in rust-lang/rust#122660,std::process::Command
will often choose thebash.exe
inSystem32
rather than a Git Bashbash.exe
that has been placed earlier inPATH
, for this very reason. - But as noted there and elaborated in rust-lang/rust#122660 (comment) (see also rust-lang/rust#37519), this is averted when methods such as
env
are called on theCommand
instance to customize the child process environment. This happens because, when customizing its environment, the current (undocumented) behavior ofCommand
is to search thePATH
for the child before searching special locations and thePATH
of the parent. This is done even ifPATH
is not one of the environment variables being changed or unset.
The latter condition applies to the way gix-testtools
runs fixture scripts on Windows. Returning attention to the code shown above, it calls configure_command
, which makes a few env_remove
calls and numerous env
calls.
gitoxide/tests/tools/src/lib.rs
Lines 598 to 604 in 1cfe577
.env_remove("GIT_DIR") | |
.env_remove("GIT_ASKPASS") | |
.env_remove("SSH_ASKPASS") | |
.env("MSYS", msys_for_git_bash_on_windows) | |
.env("GIT_CONFIG_SYSTEM", never_path) | |
.env("GIT_CONFIG_GLOBAL", never_path) | |
.env("GIT_TERMINAL_PROMPT", "false") |
There are many more env
calls than shown in that quoted portion, but none of them set PATH
, nor is that variable ever unset or the environment ever cleared. So the child PATH
is the parent's PATH
, but since other customizations to the environment have been made, std::process::Command
searches this PATH
for bash
before it searches in any special locations.
A possible fix
Although I tend to think about improvements to how gix-testtools
runs fixture scripts as being closely connected to improvements to gix-command
and the machinery for running hooks, I think the key to identifying a fix for this bug is to notice that they are different in ways that allow changes to gix-testtools
that might not be suitable elsewhere.
However gix-testtools
runs fixture scripts, it:
- Does not need to be compatible with how
git
runs scripts (though it should use the same shell if possible). - Does not need to be compatible with any documented or intended behavior related to how any other gitoxide crates run scripts.
- Does not need to be able to run scripts that are not shell scripts.
- Does not need to be able to run binary executables.
- Can require
git
, because nearly all of the fixture scripts would fail without it anyway.
Therefore, on Windows, gix-testtools
can always safely attempt to use Git Bash to run the script first. If it can't find Git Bash based on information provided by git
, then it can either just fail, or fall back to something like what it is currently doing.
Git Bash can usually be located using a git --exec-path
-based method like what is done in gitpython-developers/GitPython#1791. Conceptually, this is along the lines of
realpath -- "$(git --exec-path)/../../../bin/bash.exe"
though of course it would be in Rust and not Bash, since the point is to find which bash
to use.
Thanks so much for researching this, and the very informative and interesting read! The special-behaviour in By the looks of it, this does affect Fixing this both in However, I wouldn't mind any fix to start with whatever is needed for |
See #1359 (comment) for updated information.
Current behavior 😯
In this project, tests are typically run on Windows in a Git Bash environment. On CI in Windows,
bash
is Git Bash. This is also the environment from which I have typically run tests. This is a native environment; it should not be confused with WSL or even something like Cygwin. Furthermore, tools maintained byrustup
are not specifically connected to this environment, though they are run from it.All tests are able to pass when run in Git Bash on Windows, and likely would pass if run in similar environments that provide a POSIX-compatible shell for Windows, set environment variables accordingly, and provide access to POSIX versions of common tools.
But running the tests from PowerShell instead produces 627 test failures. This presumably relates to the associated environment rather than the shell itself, since however the tests are run, a test runner subprocess is controlling the run.
The main problem seems to be the use of paths with backslashes in them, and the way that interacts with fixture scripts. Note, however, that this is not specific to the situation where
GIX_TEST_IGNORE_ARCHIVES
is set. That is to say that this is a separate issue from #1358 (which also involves far fewer failures) and occurs even whenGIX_TEST_IGNORE_ARCHIVES
is unset.Even the test summary showing one failure per line is too long for GitHub to allow me to include it in this issue description. This gist has it. The full output can be seen in this other gist.
As of now, I have not figured out the cause, which I think is a prerequisite for making a decision about whether to try to support such environments or to instead document them as unsupported.
Expected behavior 🤔
Either all tests should pass even when run from PowerShell, or the need to run them in a Git Bash environment (or whatever other more precise requirement is known) should be documented.
I think which is better depends on how cumbersome it would be to enable the tests to pass when run from PowerShell and, more importantly, how cumbersome it would be to maintain this state. I suspect it may be feasible, though.
Git behavior
A direct comparison is difficult in that this is about running the tests rather than the functionality of Git and gitoxide. However, it is worth noting that Git for Windows has its own SDK environment that is used for building it and running its tests, and this environment is separate even from the more minimal Git Bash environment.
Thus, if it is infeasible to support running tests from PowerShell, documenting that would still not be very restrictive, even compared to what is needed to develop and test Git for Windows, since the non-SDK "Git Bash" environment works fine for running gitoxide's tests.
Steps to reproduce 🕹
Using Windows, make sure the tests can pass when run from Git Bash using the first of two
cargo
test-runner commands shown below.Optionally clean the build with
cargo clean
or, if preferred, clean everything withgix clean -xde
.There are two approaches. The first approach is to display the output in the console: To do that, in PowerShell, run:
However, that is hard to read because of the very large volume of output combined with the staggering of output lines suggesting a problem identifying the terminal width or computing the width of text written.
Therefore, you may instead wish to write both stdout and stderr to a file, and inspect the file during and/or after the run.
This writes all output to
output.txt
in the directory above the current directory. This is to avoid creating a new file in the repository while running the tests, in case that were to interfere with something, though it shouldn't since that should only be able to affect a small number of journey test runs, which are not included when runningcargo nextest
.The
*>
operator in PowerShell is similar to the&>
or>&
operators in some shells and to the effect of2>&1 >
in most shells.Although Windows comes with Windows PowerShell, I suggest against using it for this, since is somewhat less intuitive, and has fewer convenience features, compared to the newer PowerShell (which is sometimes called PowerShell Core). Windows PowerShell is also less likely to be used by developers on Windows, and thus probably less important to support than recent versions of PowerShell. I used PowerShell 7.4.2 on Windows 10.
The text was updated successfully, but these errors were encountered: