Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

launcher.c: add support for Unicode sys.argv in cli/gui.exe Windows launchers #595

Closed
wants to merge 2 commits into from

Conversation

anthrotype
Copy link

In Python 3 on Windows, sys.argv is created through the Windows Unicode API.
That means command line arguments are retrieved as wchar_t* through the Windows-specific wmain (for console programs) and wWinMain (for gui programs) entry points.

See https://bugs.python.org/issue2128

However, setuptools' "cli.exe" and "gui.exe" launchers, as defined in launcher.c, are still using the Windows ANSI API: i.e. the entry point is main, and argv type is char * instead of wchar_t*, which means it's dependant on the current console's codepage.

This prevents Windows Python 3 from accessing the Unicode command line arguments which are sent by Windows when running python scripts via the usual python.exe command. When run through launchers, the sys.argv contains instead replacement strings like "?????", which is not nice.

The fix proposed here is similar to the one which introduced support for Unicode sys.argv in Python 3: https://hg.python.org/cpython/rev/22a74eaf6b22/

Basically:

  • we use 'wmain' and 'wWinMain' as entry points, so we can use wchar_t *argv
  • we replace all char* with wchar_t*
  • we use the 'wide' Unicode version of all relevant functions (wcslen instead
    of strlen, CreateProcessW instead of CreateProcessA, etc.)

Please let me know what you think.

Thank you.

Cosimo Lupo added 2 commits May 26, 2016 12:22
…aunchers

use the Windows Unicode API to retrieve the commandline arguments as wchar_t*,
so we can pass them on unchanged to the wrapped python script.

Also see:
- https://bugs.python.org/issue2128
- https://hg.python.org/cpython/rev/22a74eaf6b22/
@anthrotype
Copy link
Author

I just added unit tests for the sys.argv as unicode strings. I expect them to fail, as the precompiled cli.exe and gui.exe are still the unpatched ones. The tests do pass with my patched version of launcher.c.

Of course, I need to skip these for python 2, becuse of this and that.

@jaraco
Copy link
Member

jaraco commented May 26, 2016

This all sounds good to me. I no longer actively develop on Windows, so I might struggle a bit with compiling the executables, but I'll try to do that.

Out of curiosity, have you encountered this issue when running executables produced by distlib or pip (which uses distlib)? I have plans to strip this functionality out and rely entirely on distlib for handling executable launchers. If distlib has already solved this issue, it may be preferable to focus on switching to distlib for this functionality.

@anthrotype
Copy link
Author

No, I haven't tried distlib yet. I'll give it a try and let you know.

Btw, I managed to compile launcher.c with the following commands in Visual Studio (I tried all 2008, 2010, 2013 and 2015):

cl /DGUI=0 /DUNICODE /D_UNICODE /O1 /Fecli.exe launcher.c /link /SUBSYSTEM:CONSOLE
cl /DGUI=1 /DUNICODE /D_UNICODE /O1 /Fegui.exe launcher.c /link /SUBSYSTEM:WINDOWS

I also compiled it successfully using GCC 5.3.0 from mingw-w64, using the following commands:

gcc -DGUI=0 -municode -O -s -o cli.exe launcher.c
gcc -DGUI=1 -mwindows -municode -O -s -o cli.exe launcher.c

@anthrotype
Copy link
Author

So.. I just tried to plug the distlib-generated .exe launchers in setuptools' test suite, and tried running the unicode sys.argv unit tests above, and -- I'm happy and a little bit sad at the same time.

Happy because they seem to be working; a bit sad because I spent a few days trying to fix this in setuptools, without knowing @vsajip had already fixed the issue in distlib... ;)

Although the unicode argv works, two other tests in test_windows_launchers.py fail for some reason.
I guess I shall leave it to you to decide how to proceed.

I'd really like this issue to be solved, whether in setuptools or through distlib.
Thanks a lot.

@jaraco
Copy link
Member

jaraco commented May 26, 2016

@anthrotype Thanks so much for the extended work on this. My guess is there's more work to be done than just copying the executables from distlib. I was actually thinking that distlib could be vendored into setuptools (the way that six and packaging are) and then it could replace all of the script handling logic (not just the windows executables).

I'm happy to accept for a short-term solution to simply bundle the distlib executables or to accept this pull request above and implement the originally proposed solution, but of course it would be best to remove as much of this duplication as possible... I am worried about the failing test cases, as you mentioned. If we're to include the distlib executables, I'd like to know why those tests are failing.

So short answer is - whatever you have time for, I appreciate the help.

@anthrotype
Copy link
Author

distlib could be vendored into setuptools (the way that six and packaging are) and then it could replace all of the script handling logic

That sounds good to me, better to avoid duplication.

I'm happy to accept for a short-term solution to simply bundle the distlib executables

the problem is that the pre-compiled executables in distlib are meant to work with distlib.scripts.ScriptMaker, and can't simply be dropped in setuptools as they currently are.

At least they need to be recompiled without #define APPENDED_ARCHIVE, which unfortunately is written in the source file, so can't be undefined from the comand line: (launcher.c:36).

This definition specifies the mode in which the launcher works:

  • the default is append the python script source in an archive inside the same exe (but this requires the distlib ScriptMaker);
  • the other option (with APPEND_ARCHIVE undefined) works the same as setuptools' launcher, i.e. the exe tries to find a "-script.py[w]" file to run in the same folder.

As I mentioned yesterday, even when I recompile the distlib launcher without APPEND_ARCHIVE to make it work like setuptools', then there is one test case that fails: test_windows_launcher::TestCLI.test_with_options.

After tinkering with pointers, I eventually found a solution to the issue (see patch below).
The problem was that, when the python executable in the #!shebang was not wrapped in quotes (as in above test case), the distlib launcher was also appending the argument in the executable line. So the latter was being included twice: once in the quoted executable string, and again on its own:

"C:\Python35\python.exe -Oi" -Oi

As I'm not familiar enough with mercurial to send a pull request myself, it'd be nice if someone could have a look and send the patch to the bitbucket repository? 😁

diff -r a258e3bdd6f8 PC/launcher.c
--- a/PC/launcher.c Mon May 16 23:31:18 2016 +0100
+++ b/PC/launcher.c Fri May 27 12:46:01 2016 +0100
@@ -33,7 +33,7 @@

 #pragma comment (lib, "Shlwapi.lib")

-#define APPENDED_ARCHIVE
+// #define APPENDED_ARCHIVE

 #define MSGSIZE 1024

@@ -406,6 +406,8 @@
     }
     /* p points just past the executable. It must either be a NUL or whitespace. */
     assert(*p != L'"', "Terminating quote without starting quote for executable in shebang line.");
+    /* if p is whitespace, make it NULL to truncate 'line', and advance */
+    if (*p && iswspace(*p)) *p++ = L'\0';
     /* Now we can skip the whitespace, having checked that it's there. */
     while(*p && iswspace(*p))
         ++p;

so after this patch I can replace the cli.exe and gui.exe in setuptools package with the patched distlib launchers and the test pass, including those for the unicode argv (PY3 only).

@anthrotype
Copy link
Author

unlike setuptools, distlib does not have precompiled launchers for the ARM platform; only 32 and 64 bit, cli and gui. But I guess those should be easy to make.

@jaraco
Copy link
Member

jaraco commented May 27, 2016

not familiar enough with mercurial

It's quite possible that Vinay accepts pull requests in the github mirror.

they need to be recompiled without #define APPENDED_ARCHIVE

I don't think it will be acceptable to comment out that line - after all, pip uses the version with that variable defined. I'd like to setuptools to use the same. Can we adapt setuptools to append the script instead?

The problem was that, when the python executable in the #!shebang was not wrapped in quotes

Again, maybe it makes sense for setuptools to generate scripts that are compatible with the launchers as written. Is that possible?

But I guess those should be easy to make.

Yes, this issue should be filed with distlib. Would you like me to do that?

@anthrotype
Copy link
Author

Can we adapt setuptools to append the script instead?

Sure we can do that. So to clarify, you want setuptools to make the exe launchers the same way that distlib ScriptMaker does (by appending the shebang and source code), but without requiring to import from distlib (at least, temporarily)?

it makes sense for setuptools to generate scripts that are compatible with the launchers as written

yeah, we could put quotes to work around that bug, but the bug still needs to be fixed upstream.

this issue should be filed with distlib

Yes please! Thanks.
I'll try to figure out how to send the patch above to mercurial, otherwise I'll just send it to the git mirror.

@jaraco
Copy link
Member

jaraco commented May 27, 2016

you want setuptools to make the exe launchers the same way that distlib ScriptMaker does

Yes.

yeah, we could put quotes to work around that bug, but the bug still needs to be fixed upstream.

Sounds good.

@jaraco
Copy link
Member

jaraco commented May 27, 2016

I requested ARM launchers here.

anthrotype pushed a commit to anthrotype/distlib that referenced this pull request May 29, 2016
@anthrotype
Copy link
Author

anthrotype commented Jun 9, 2016

ok, now that ARM launchers are gone, it should be easier to integrate distlib launchers in setuptools.
I should be able to work on this over the weekend.

Do you want me to close this PR and open a separate one, or shall I keep working on this branch?

The simplest way to proceed is to modify the WindowsExecutableLauncherWriter._get_script_args method in easy_install module so that it zips and appends the wrapped script, along with the #!shebang, to the launcher executables.

Another option would be to "vendor" distlib, like pip currently does, and use distlib's ScriptMaker for that.

What do you prefer me to do?

Besides, I noticed another difference between the way setuptools and distlib make their launchers: setuptools also adds an (external) manifest file for each .exe. The manifest is added only for 32 bit launchers, and according to the inline comment, it is meant

to prevent Windows from detecting it as an installer (which it will for launchers like easy_install.exe). Consider only adding a manifest for launchers detected as installers. See Distribute #143 for details.

distlib, on the other hand, (or at least the way pip currently uses it) doesn't seem to add a manifest by default to the executables.

So I'm not sure what I should do here. /cc @vsajip @jaraco

@vsajip
Copy link
Contributor

vsajip commented Jun 9, 2016

Well, if pip is using distlib and that doesn't add a manifest and nobody's complained, does that mean manifests aren't really needed? I would imagine manifests are only useful for things like elevated privilege levels - which shouldn't in general apply to scripts installed with Python packages, ISTM.

@jaraco
Copy link
Member

jaraco commented Jun 10, 2016

I seem to recall that only one of the 32-bit or 64-bit executables needed the manifest.

@anthrotype
Copy link
Author

for the records, this is the distribute issue 143 that the comment I quoted above was referring to:
https://bitbucket.org/tarek/distribute/issues/143/easy_install-opens-new-console-cant-read

@anthrotype
Copy link
Author

so it appears the issue of undesired UAC elevation only affects executables detected as installers:

Installer Detection only applies to:

  1. 32 bit executables
  2. Applications without a requestedExecutionLevel
  3. Interactive processes running as a Standard User with LUA enabled
    Before a 32 bit process is created, the following attributes are checked to determine whether it is an installer:
    Filename includes keywords like "install," "setup," "update," etc.

https://technet.microsoft.com/en-us/library/cc709628(WS.10).aspx

I haven't read the full MS article, but I believe we should include the manifest at least for scripts that happen to contain one of those keywords.

It's going to be an external xml file, because for embedding it inside the executable itself we need the mt.exe command, which we can't assume will be available on the users' machines.

@anthrotype
Copy link
Author

OK, I run some tests and it appears that the distlib launchers do not need any external manifest file, even when the script name contains keywords like "install", or "setup".

I made a minimalist setup.py for a script named "launcher_install", containing just a print("hello world") in its main(), and I set that as a console script entry point.

I tested it on Python 2.7 and 3.5 32-bit with setuptools 23.0. When I install my test module from source distribution (pip install -v .), I get a launcher_install.exe which is the current setuptools launcher, accompanied by a separate .exe.manifest file and launcher_install-script.py.

If I then delete or rename the manifest file, and try to run this executable (containing the "install" word in the name) under Windows 7, I get "permission denied". So for setuptools launchers, the manifest is required, at least when thos keywords are present in the script name.

However, if I make a wheel first, and the I pip install from the wheel, I get a new-style, self-contained distlib launcher, with no external manifest file.

This time, even the executable has "install" in its name, It Just Works™ 😄

I presume it's the way the setuptools and the distlib launchers are compiled. For the first, the cl.exe command is used, as explained in the "msvc-build-launcher.c" batch script). For distlib launchers, Visual Studio is used via the included .sln solution file. So Visual Studio must be injecting a manifest directly in the distlib launchers.

So, all in all, I think we can safely get rid of the manifest thing if we switch to distlib launchers.
WDYT?

@jaraco
Copy link
Member

jaraco commented Jun 10, 2016

@anthrotype Thanks for the detailed analysis. That sounds good to me.

@vsajip
Copy link
Contributor

vsajip commented Jun 10, 2016

The distlib launchers do contain an embedded manifest:

<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
    <security>
      <requestedPrivileges>
        <requestedExecutionLevel level="asInvoker" uiAccess="false"></requestedExecutionLevel>
      </requestedPrivileges>
    </security>
  </trustInfo>
</assembly>

It's present in all four launchers.

@anthrotype
Copy link
Author

@vsajip thanks for the confirmation! It wasn't clear to me. :)

@anthrotype
Copy link
Author

@jaraco I modified easy_install.py to use the distlib launchers, however now I'm not sure what to do with the current setuptools launcher executables (cli-32.exe, gui-32.exe, etc.) included as package_data in setuptools source.

Shall I simply replace them with the pre-compiled distlib binaries, found here?

And what names shall I use for these: the same distlib uses (t32.exe, t64.exe, w32.exe and w64.exe), or I should keep the current setuptools names (cli-32.exe, cli-64.exe, gui-32.exe, gui-64.exe)? I would prefer to use the distlib names to make it clearer they work differently from the old ones.

And what about those copies of the 32 bit launchers which don't not have a "-32" suffix?
The msvc-build-lancher.cmd says:

REM buildout (and possibly other implementations) currently depend on
REM the 32-bit launcher scripts without the -32 in the filename, so copy them
REM there for now.
copy setuptools/cli-32.exe setuptools/cli.exe
copy setuptools/gui-32.exe setuptools/gui.exe

Shall I get rid of the two launchers above?
I can't simply rename distlib's "t32.exe" to "cli.exe", etc., as the new launchers work differently from the old ones (i.e. don't expect to find the wrapped script in the same directory, but append it to themselves as zipped data).

The "msvc-build-launcher.cmd" itself should go, I think. It should probably be replaced by a copy of @vsajip simple_launcher source files.

@anthrotype
Copy link
Author

ah, and what about the launcher manifest.xml file, shall we remove that as well as no longer needed by distlib launchers?

@anthrotype
Copy link
Author

Please have a look at this branch on my setuptools fork:
https://github.com/anthrotype/setuptools/commits/distlib-launchers

I can force-push to this same PR branch, or open a new pull request if you like.
Thanks

@vsajip
Copy link
Contributor

vsajip commented Jun 13, 2016

by a copy of @vsajip simple_launcher source files

Note that these updated sources will be copied to the distlib source tree before the next release of distlib.

@jaraco
Copy link
Member

jaraco commented Jun 14, 2016

Overall, my suggestion is that we re-use as much of distlib as possible. If it's possible, for example, to call into distlib (perhaps passing the script text) and let distlib produce the executables, that would be best.

Shall I simply replace [the existing binaries] with the pre-compiled distlib binaries, found here?

Perhaps we should keep the old ones in place just in case other libraries are referencing them. We'll definitely mark them as deprecated and not use them internally.

And what names shall I use for these: the same distlib uses (t32.exe, t64.exe, w32.exe and w64.exe), or I should keep the current setuptools names (cli-32.exe, cli-64.exe, gui-32.exe, gui-64.exe)? I would prefer to use the distlib names to make it clearer they work differently from the old ones.

I'd like not to copy anything into setuptools, but instead reference it from distlib. If making calls into distlib isn't suitable, and we need to reference these resources directly, then we should reference them from the distlib package. There might be some challenges with doing this using a vendored package, but if you want to simply assume that distlib is a requirement of setuptools for now, that would be fine.

And what about those copies of the 32 bit launchers which don't not have a "-32" suffix? Shall I get rid of [them]?

Same as the other scripts - we'll leave them there for now for compatibility, and we'll remove all of them with one announced release.

The "msvc-build-launcher.cmd" itself should go, I think. It should probably be replaced by a copy of @vsajip simple_launcher source files.

Agreed.

@jaraco
Copy link
Member

jaraco commented Jun 14, 2016

ah, and what about the launcher manifest.xml file, shall we remove that as well as no longer needed by distlib launchers?

Let's consider that deprecated also, and remove it when removing the launcher executables.

I can force-push to this same PR branch, or open a new pull request if you like.

Go ahead and force-push that here. Thanks.

@anthrotype
Copy link
Author

anthrotype commented Jun 14, 2016

Thanks for the reply.
If we assume distlib is a dependency of setuptools, then it's easy for me to just call distlib's ScriptMaker. We don't need to reference the executable files directly as that is taken care of by distlib itself.
Ok, so I shall keep the old executables as well as the classes and functions that references them in easy_install.py. I will simply no longer use them, but import and use ScriptMaker from distlib (just for the windows launcher, for now, I guess).

@vsajip
Copy link
Contributor

vsajip commented Jun 14, 2016

If you run into any problems invoking ScriptMaker to do what you need, do let me know!

@jaraco
Copy link
Member

jaraco commented Sep 18, 2016

I'm going to close this for now, but feel free to revive the effort at any time.

@jaraco jaraco closed this Sep 18, 2016
@anthrotype
Copy link
Author

I'm sorry, I could not find the time to work on this as I was busy with other stuff...
Also I wasn't sure how I should go about vendoring distlib in setuptools.
For now I resorted to making a wheel and using pip to install so it uses the distlib launchers instead of setuptools'.

@Kagami
Copy link

Kagami commented Aug 18, 2018

I'm having exactly same issue. Really pity this was abandoned because CLI executables without support for unicode arguments are almost unusable. E.g. you can't even pass non-ASCII filename.

@dhvu
Copy link

dhvu commented May 15, 2022

Googling to find out why console scripts installed by conda on Windows (through setuptools 61.2.0 I guess, not distlib) lost out-of-codepage characters in the command line, I landed here. How about reopening this issue and merge the OP's PR to solve it in setuptools itself, too?

@jaraco
Copy link
Member

jaraco commented May 19, 2022

I'm sorry, but it's simply untenable to maintain multiple copies of launchers. The launchers in Setuptools are deprecated. The path forward is one of the following:

  • Rely on pip when you need launchers.
  • Use distlib to generate launchers (in conda or similar).
  • Update setuptools to use distlib to generate launchers.

@vsajip
Copy link
Contributor

vsajip commented May 19, 2022

Also I wasn't sure how I should go about vendoring distlib in setuptools.

Given #2825, would adding distlib as a dependency of setuptools be a more acceptable approach than vendoring? Though currently setuptools has no run-time dependencies ...

@jaraco
Copy link
Member

jaraco commented Jun 12, 2022

Agreed that #2825 would imply depending. Unfortunately, until #2825 is resolved, vendoring is the sole means of utilizing a dependency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants