Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shutil.which() can return non-executable path starting with 3.12 on Windows #127001

Closed
lazka opened this issue Nov 19, 2024 · 9 comments
Closed
Assignees
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@lazka
Copy link
Contributor

lazka commented Nov 19, 2024

Bug report

Bug description:

Starting with Python 3.12 if a file is in PATH that does not end with PATHEXT, but happens to be named the same as the searched command it is returned instead of the real command later in PATH which includes PATHEXT.

# Assume:
# PATH=C:\foo;C:\WINDOWS\system32
# "C:\foo\cmd" exists
# "C:\WINDOWS\system32\cmd.exe" exists

import shutil
print(shutil.which('cmd'))
# Actual: C:\foo\cmd
# Expected: C:\WINDOWS\system32\cmd.exe (3.11 result)

I have verified that this change was introduced with #103179 and reverting it fixes the issue.

@csm10495 ^

Downstream context: mesonbuild/meson#13886

CPython versions tested on:

3.12

Operating systems tested on:

Windows

Linked PRs

@zooba
Copy link
Member

zooba commented Nov 19, 2024

Reverting the entire change is clearly not going to happen - the logic was incorrect before, and apparently is still incorrect, so let's figure out what correct logic for handling PATHEXT would be.

@serhiy-storchaka
Copy link
Member

Should not it be fixed by #109590?

@serhiy-storchaka
Copy link
Member

Ah, it still checks for file without extension.

@lazka
Copy link
Contributor Author

lazka commented Nov 19, 2024

Should not it be fixed by #109590?

That only handles the "in the same directory" case, and not in different directories, from what I see.

But, that issue mentioned Get-Command cmd in powershell, which I now tried, and that also returns the wrong path as "CommandType=Application" here, even though the docs say that it only returns files ending with PATHEXT (https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/get-command?view=powershell-7.4#-commandtype) so I don't know what's "right" anymore :)

PS C:\Users\user> echo $env:PATH
C:\msys64\usr\bin;C:\Windows\System32;C:\Windows
PS C:\Users\user> echo $env:PATHEXT
.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.CPL
PS C:\Users\user> Get-Command cmd

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     cmd                                                0.0.0.0    C:\msys64\usr\bin\cmd

PS C:\Users\user> Get-Command -All cmd

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     cmd                                                0.0.0.0    C:\msys64\usr\bin\cmd
Application     cmd.exe                                            10.0.2610… C:\Windows\System32\cmd.exe

@serhiy-storchaka serhiy-storchaka self-assigned this Nov 19, 2024
@serhiy-storchaka serhiy-storchaka added 3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes OS-windows labels Nov 19, 2024
@zooba
Copy link
Member

zooba commented Nov 19, 2024

The right behaviour ought to be "whatever CreateProcess("cmd") would run". Terminals all layer their own logic on top, but Python tries to generally stick to the closest public API that implements the functionality.

(One explicit exception we make to CreateProcess behaviour is that we don't search the application directory first with which, because if we did, it wouldn't be useful. By skipping that somewhat strange edge case, it is possible to use which to override the application directory. This mainly only matters for python.exe - CreateProcess("python") will always launch sys._base_executable due to how the API works, but CreateProcess(shutil.which("python")) follows PATH, which is likely what was intended if a user provided the command name.)

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Nov 19, 2024
…h() on Windows

Name without a PATHEXT extension is only searched if the mode does not
include X_OK.
@lazka
Copy link
Contributor Author

lazka commented Nov 19, 2024

The right behaviour ought to be "whatever CreateProcess("cmd") would run". Terminals all layer their own logic on top, but Python tries to generally stick to the closest public API that implements the functionality.

Also CreateProcess prefers "C:\Windows\System32" over PATH, like curl/tar/bash that are in there, which is another can of worms :(

Since my initial presented scenario occurs in MSYS2, and it's not quite clear why we have that "cmd" bash script in PATH in the first place, I'm trying to get rid of it instead now :/

@csm10495
Copy link
Contributor

Ouch. Sorry for the regression folks. I wish this was less complicated than it is. I see @serhiy-storchaka has already started on a fix. Thanks all.

@picnixz picnixz added the stdlib Python modules in the Lib dir label Nov 21, 2024
@serhiy-storchaka
Copy link
Member

There was other bug introduced in #109995: full match did not work for files with multcomponent extension in PATHEXT, (e.g. .foo.bar).

Other corner case was only discussed in comments: a dot (.) in PATHEXT.

serhiy-storchaka added a commit that referenced this issue Nov 22, 2024
* Name without a PATHEXT extension is only searched if the mode does not
  include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
  (".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 22, 2024
…honGH-127035)

* Name without a PATHEXT extension is only searched if the mode does not
  include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
  (".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
(cherry picked from commit 8899e85)

Co-authored-by: Serhiy Storchaka <[email protected]>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Nov 22, 2024
…ws (pythonGH-127035)

* Name without a PATHEXT extension is only searched if the mode does not
  include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
  (".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
(cherry picked from commit 8899e85)

Co-authored-by: Serhiy Storchaka <[email protected]>
serhiy-storchaka added a commit that referenced this issue Nov 22, 2024
…-127035) (GH-127156)

* Name without a PATHEXT extension is only searched if the mode does not
  include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
  (".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
(cherry picked from commit 8899e85)

Co-authored-by: Serhiy Storchaka <[email protected]>
serhiy-storchaka added a commit that referenced this issue Nov 22, 2024
…-127035) (GH-127158)

* Name without a PATHEXT extension is only searched if the mode does not
  include X_OK.
* Support multi-component PATHEXT extensions (e.g. ".foo.bar").
* Support files without extensions in PATHEXT contains dot-only extension
  (".", "..", etc).
* Support PATHEXT extensions that end with a dot (e.g. ".foo.").
(cherry picked from commit 8899e85)
@lazka
Copy link
Contributor Author

lazka commented Nov 22, 2024

I can confirm that this fixes things for my use case. Thanks everyone!

facsimiles-push bot pushed a commit to facsimiles/graphviz that referenced this issue Nov 25, 2024
Python 3.12 made some changes to how `shutil.which` functions that caused shell
scripts like gvmap.sh to now be detectable on Windows.
97b0c0b explains some of the backstory to this.
18ebd12 fixed various CI failures that began
occurring when this Python change propagated to MinGW in CI.

CI has now started once again failing on MinGW. Inspecting the diff between a
passing and failing run reveals Python has moved from 3.12.7-2 to 3.12.7-3. The
diff between these two appears to be the backporting of a claimed fix to
`shutil.which`.¹ Discussion of the underlying issue² seems to point to
significant disagreement between Python contributors on what the desirable
behavior of `shutil.which` is in these scenarios. It is also not clear to me how
gvmap.sh ends up non-executable (and thus affected by this) given its CMake
installation rule seems to add executability.

Rather than trying to continue accommodating Python changing its `shutil.which`
semantics, this change skips testing of this scenario.

¹ msys2/MINGW-packages@32bd97e
² python/cpython#127001
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes 3.14 new features, bugs and security fixes OS-windows stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants