Pyproject.toml expected encoding seems inconsistent, bug with hatch test --cover and special characters #1677

Open
feddg opened this issue Aug 12, 2024 · 2 comments · May be fixed by #1682

feddg commented Aug 12, 2024

Environment:

  • OS: Windows
  • Python: 3.12.5
  • Version: both 1.12.0 (installed with pipx) and 1.12.1.dev13 (commit 7c8dbbc, installed with pip in a clean virtual environment)

How to reproduce

  • Generate a new project with hatch.
  • Verify that all commands work as expected on the sample project (e.g. hatch fmt, hatch test, hatch test --cover).
  • Append # ” as a comment at the end of pyproject.toml. The last character is a 'RIGHT DOUBLE QUOTATION MARK' (U+201D), encoded in UTF-8 as the bytes 0xE2 0x80 0x9D.
  • Run hatch test --cover.

Issue description

  • Expected behaviour: hatch ignores the comment. Alternatively, if the character were invalid, all hatch commands that read pyproject.toml should fail with the same error.
  • Current behaviour: hatch test --cover fails with UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1645: character maps to <undefined>, while other commands work fine (e.g. hatch env show, hatch fmt, hatch test).

Hatch 1.12.0 (on python 3.12.5, installed with pipx) traceback:

$ hatch test --cover
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\hatch\cli\__init__.py:221 in main       │
│                                                                                                  │
│   218                                                                                            │
│   219 def main():  # no cov                                                                      │
│   220 │   try:                                                                                   │
│ ❱ 221 │   │   hatch(prog_name='hatch', windows_expand_args=False)                                │
│   222 │   except Exception:  # noqa: BLE001                                                      │
│   223 │   │   import sys                                                                         │
│   224                                                                                            │
│                                                                                                  │
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\click\core.py:1157 in __call__          │
│                                                                                                  │
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\click\core.py:1078 in main              │
│                                                                                                  │
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\click\core.py:1688 in invoke            │
│                                                                                                  │
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\click\core.py:1434 in invoke            │
│                                                                                                  │
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\click\core.py:783 in invoke             │
│                                                                                                  │
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\click\decorators.py:33 in new_func      │
│                                                                                                  │
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\hatch\cli\test\__init__.py:161 in test  │
│                                                                                                  │
│   158 │   patched_coverage = PatchedCoverageConfig(app.project.location, app.data_dir / '.conf   │
│   159 │   coverage_config_file = str(patched_coverage.internal_config_path)                      │
│   160 │   if cover:                                                                              │
│ ❱ 161 │   │   patched_coverage.write_config_file()                                               │
│   162 │                                                                                          │
│   163 │   for context in app.runner_context(selected_envs, ignore_compat=multiple_possible, di   │
│   164 │   │   internal_arguments: list[str] = list(context.env.config.get('extra-args', []))     │
│                                                                                                  │
│ C:\Users\feder\.local\pipx\venvs\hatch\Lib\site-packages\hatch\cli\test\core.py:50 in            │
│ write_config_file                                                                                │
│                                                                                                  │
│   47 │   │   │                                                                                   │
│   48 │   │   │   from hatch.utils.toml import load_toml_data                                     │
│   49 │   │   │                                                                                   │
│ ❱ 50 │   │   │   project_data = load_toml_data(self.user_config_path.read_text())                │
│   51 │   │   │   project_data.setdefault('tool', {}).setdefault('coverage', {}).setdefault('r    │
│   52 │   │   │   self.internal_config_path.write_text(tomli_w.dumps(project_data))               │
│   53                                                                                             │
│                                                                                                  │
│ C:\Python\Python312\Lib\pathlib.py:1028 in read_text                                             │
│                                                                                                  │
│   1025 │   │   """                                                                               │
│   1026 │   │   encoding = io.text_encoding(encoding)                                             │
│   1027 │   │   with self.open(mode='r', encoding=encoding, errors=errors) as f:                  │
│ ❱ 1028 │   │   │   return f.read()                                                               │
│   1029 │                                                                                         │
│   1030 │   def write_bytes(self, data):                                                          │
│   1031 │   │   """                                                                               │
│                                                                                                  │
│ C:\Python\Python312\Lib\encodings\cp1252.py:23 in decode                                         │
│                                                                                                  │
│    20                                                                                            │
│    21 class IncrementalDecoder(codecs.IncrementalDecoder):                                       │
│    22 │   def decode(self, input, final=False):                                                  │
│ ❱  23 │   │   return codecs.charmap_decode(input,self.errors,decoding_table)[0]                  │
│    24                                                                                            │
│    25 class StreamWriter(Codec,codecs.StreamWriter):                                             │
│    26 │   pass                                                                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1645: character maps to <undefined>

Cause

The issue seems caused by hatch/cli/test/core.py:50 self.user_config_path.read_text().
self.user_config_path seems to be a pathlib.Path, so .read_text() without an explicit encoding falls back to the locale's preferred encoding: typically "utf-8" on Linux and "windows-1252" (cp1252) on Windows.

The same error can be reproduced by creating a file containing only the triggering character 'RIGHT DOUBLE QUOTATION MARK' (U+201D, UTF-8 0xE2 0x80 0x9D) and reading it with pathlib.Path(<file>).read_text(encoding="windows-1252"). The final byte, 0x9D, has no mapping in windows-1252, hence the error.
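The reproduction described above can be sketched in a few lines of Python. The file name sample.txt and the helper try_read are made up for illustration; only pathlib and tempfile from the standard library are used.

```python
import tempfile
from pathlib import Path

# U+201D encoded as UTF-8: the three bytes quoted in the report.
TRIGGER_BYTES = "\u201d".encode("utf-8")  # b"\xe2\x80\x9d"


def try_read(path: Path, encoding: str) -> str:
    """Return the decoded text, or a short description of the decode error."""
    try:
        return path.read_text(encoding=encoding)
    except UnicodeDecodeError as exc:
        return f"{type(exc).__name__}: {exc.reason}"


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"  # hypothetical file name
    sample.write_bytes(TRIGGER_BYTES)

    as_utf8 = try_read(sample, "utf-8")          # decodes cleanly
    as_cp1252 = try_read(sample, "windows-1252")  # fails on byte 0x9D

print(repr(as_utf8))
print(as_cp1252)
```

Passing the encoding explicitly makes the result independent of the operating system, so the failure can be observed on Linux as well.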

Proposed solution

Change hatch/cli/test/core.py:50 self.user_config_path.read_text() to self.user_config_path.read_text(encoding="utf-8"). This seems to solve the issue in the local hatch installation.

This change seems appropriate because:

  • Hatch should have the same behaviour on Windows and Linux, thus the encoding should be fixed and not left to the system default.
  • It seems consistent with the behaviour of the other hatch commands.
  • It is consistent with the pyproject.toml specification, which requires the encoding to be "utf-8".

Proposed solution possible drawbacks

A Windows-only project might conceivably use windows-1252 for pyproject.toml. However, this seems implausible, because a pyproject.toml containing the same character in windows-1252 encoding makes hatch's other commands fail with UnicodeDecodeError: 'utf-8' codec can't decode byte 0x94 in position 1643: invalid start byte. This confirms that the other commands expect the "utf-8" encoding. Moreover, windows-1252 would not be consistent with the pyproject.toml specification.
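The reverse case can be checked the same way. This sketch only uses built-in codecs; the variable names are illustrative.

```python
# U+201D encoded as windows-1252 is the single byte 0x94.
cp1252_bytes = "\u201d".encode("cp1252")

try:
    cp1252_bytes.decode("utf-8")
    utf8_result = "decoded"
except UnicodeDecodeError as exc:
    # 0x94 is a UTF-8 continuation byte, so it cannot start a character.
    utf8_result = exc.reason

print(cp1252_bytes, utf8_result)
```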

Notes

Running grep -n -F ".read_text(" src/**/*.py (with shopt -s globstar) in the hatch repository reveals many instances of .read_text(), many of them without explicit encoding, some of them with explicit "utf-8". It is possible that the same issue extends to those calls.
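As a rough complement to the grep above, the calls without an explicit encoding could be narrowed down with the standard-library ast module. This is only a sketch: the function name find_read_text_without_encoding is made up, it matches any attribute call named read_text (not only pathlib.Path), and it does not look at open() or write_text().

```python
import ast


def find_read_text_without_encoding(source: str) -> list[int]:
    """Return line numbers of .read_text() calls that pass no encoding."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and func.attr == "read_text"):
            continue
        has_keyword = any(kw.arg == "encoding" for kw in node.keywords)
        # For Path.read_text, the first positional argument is the encoding.
        has_positional = bool(node.args)
        if not (has_keyword or has_positional):
            hits.append(node.lineno)
    return sorted(hits)


# Small made-up sample: only the call on line 2 lacks an encoding.
sample = (
    "from pathlib import Path\n"
    "a = Path('x').read_text()\n"
    "b = Path('x').read_text(encoding='utf-8')\n"
    "c = Path('x').read_text('utf-8')\n"
)
flagged = find_read_text_without_encoding(sample)
print(flagged)
```

To scan a repository, the function could be applied to the text of each .py file, itself read with encoding="utf-8".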

I might work on a patch, but I might need some guidance on how to design an appropriate test.


feddg commented Aug 15, 2024

I analysed the issue in more detail. My pull requests should fix it.

I noticed that the issue also appears with hatch fmt. I had missed it because my configuration uses my own ruff settings and excludes the ones from hatch (with tool.hatch.envs.hatch-static-analysis.config-path = "none"). Following the steps in "How to reproduce" above also triggers the issue for hatch fmt (and possibly other commands).

I confirm that some commands are unaffected (e.g. hatch env show, hatch test, hatch test --show, possibly others).


rhkarls commented Aug 25, 2024

Not sure if it's related, but I'm getting UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 639: invalid start byte when attempting to add dependencies to pyproject.toml, and "Syncing dependencies" does not complete. This happens when following the Hatch demo with cowsay step by step. Environment: Windows 10, Hatch 1.12.0 (installed using the .msi), and Python 3.12.2 in the env.
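One possible reading of this error, not confirmed anywhere in the thread: the byte 0xF6 is the letter ö in windows-1252, so the data being decoded as UTF-8 may have been written with the Windows default encoding. A small check of that assumption:

```python
raw = b"\xf6"

# In windows-1252 (and Latin-1), 0xF6 is LATIN SMALL LETTER O WITH DIAERESIS.
as_cp1252 = raw.decode("cp1252")

try:
    raw.decode("utf-8")
    utf8_reason = "decoded"
except UnicodeDecodeError as exc:
    # 0xF6 is never a valid first byte of a UTF-8 sequence.
    utf8_reason = exc.reason

print(as_cp1252, utf8_reason)
```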

mih added a commit to adswa/datalad-next that referenced this issue Sep 22, 2024
Hatch has some problems with UTF chars in `pyproject.toml` on windows.

Refs: pypa/hatch#1677
mih added a commit to datalad/datalad-next that referenced this issue Sep 23, 2024
Hatch has some problems with UTF chars in `pyproject.toml` on windows.

Refs: pypa/hatch#1677