[ruff] Auto-add r prefix when string has no backslashes for unraw-re-pattern (RUF039) #14536

Merged
merged 3 commits into from
Nov 22, 2024

Conversation

dylwil3
Collaborator

@dylwil3 dylwil3 commented Nov 22, 2024

This PR adds a sometimes-available, safe autofix for unraw-re-pattern (RUF039), which prepends an r prefix. The fix is offered only when the string in question has no backslashes (and also does not have a u prefix, since combining u with r causes a syntax error).

Closes #14527

Notes:

  • Test fixture unchanged, but snapshot changed to include fix messages.
  • This fix is automatically available only in preview, since the rule itself is in preview.
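
As a rough illustration of the condition described above (not the code from this PR), the check could be sketched in Rust as follows. The `Literal` struct and the `add_raw_prefix` function are hypothetical stand-ins for ruff's internal types, and the sketch assumes the literal has either no prefix or a `u` prefix.

```rust
/// Hypothetical, simplified view of a Python string literal.
/// (Not ruff's actual AST type; invented for this sketch.)
struct Literal<'a> {
    /// Whether the literal carries the (no-op) `u` prefix.
    has_u_prefix: bool,
    /// Source text from the opening quote through the closing quote.
    quoted: &'a str,
}

/// Returns the literal rewritten with an `r` prefix when doing so cannot
/// change its meaning, or `None` when no fix should be offered.
fn add_raw_prefix(literal: &Literal<'_>) -> Option<String> {
    // `ur"..."` is a syntax error in Python 3, so skip `u`-prefixed literals.
    if literal.has_u_prefix {
        return None;
    }
    // With no backslashes there are no escape sequences, so the raw and
    // non-raw spellings denote the same string.
    if literal.quoted.contains('\\') {
        return None;
    }
    Some(format!("r{}", literal.quoted))
}
```

Under these assumptions, `"abc"` would be rewritten to `r"abc"`, while `"\d+"` and `u"abc"` would still be reported but left without an automatic fix.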

@dylwil3 dylwil3 added the fixes (Related to suggested fixes for violations) and preview (Related to preview mode features) labels Nov 22, 2024
if
    // The (no-op) `u` prefix is a syntax error when combined with `r`
    !literal.flags.prefix().is_unicode()
    && memchr(
Member

I don't think we need the performance of memchr here. That's why I would use .contains, which is also easier to read.

Collaborator Author

Can you help me get a sense of the string size at which the performance starts to make a difference? I only mention it because I have seen some long regexes in my life...

But happy to change it to contains!

Member

It's less about the length of a string and more that the fix code isn't a hot path.

I don't have too good a sense for when to use memchr otherwise, beyond what the crate documentation mentions
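
To make the trade-off in this thread concrete, here is a minimal sketch of the two styles of backslash check. The function names are made up for illustration, the byte scan uses only the standard library as a stand-in for a memchr-style search, and the memchr call shown in the comment is an approximation rather than the code from this PR.

```rust
/// Readable version: character search on the `str` itself.
fn has_backslash_contains(s: &str) -> bool {
    s.contains('\\')
}

/// Byte-level version, standing in for a memchr-style scan.
/// With the `memchr` crate this would be roughly
/// `memchr::memchr(b'\\', s.as_bytes()).is_some()`.
fn has_backslash_bytes(s: &str) -> bool {
    // Safe on UTF-8 input: the byte 0x5C only ever encodes an ASCII
    // backslash, never part of a multi-byte character.
    s.as_bytes().iter().any(|&b| b == b'\\')
}
```

Both functions return the same answer for any input. Since the fix code runs once per reported violation rather than in a hot loop, the more readable `.contains` form is a reasonable default.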

Contributor

github-actions bot commented Nov 22, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+1 -20 violations, +272 -0 fixes in 13 projects; 41 projects unchanged)

RasaHQ/rasa (+0 -0 violations, +4 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- rasa/utils/io.py:222:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:222:9: RUF039 [*] First argument to `re.compile()` is not raw string
- rasa/utils/io.py:231:9: RUF039 First argument to `re.compile()` is not raw string
+ rasa/utils/io.py:231:9: RUF039 [*] First argument to `re.compile()` is not raw string

apache/airflow (+0 -0 violations, +64 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- dev/breeze/src/airflow_breeze/params/build_prod_params.py:81:21: RUF039 First argument to `re.match()` is not raw string
+ dev/breeze/src/airflow_breeze/params/build_prod_params.py:81:21: RUF039 [*] First argument to `re.match()` is not raw string
- dev/breeze/src/airflow_breeze/utils/run_tests.py:109:19: RUF039 First argument to `re.sub()` is not raw string
+ dev/breeze/src/airflow_breeze/utils/run_tests.py:109:19: RUF039 [*] First argument to `re.sub()` is not raw string
- dev/perf/dags/elastic_dag.py:73:19: RUF039 First argument to `re.sub()` is not raw string
+ dev/perf/dags/elastic_dag.py:73:19: RUF039 [*] First argument to `re.sub()` is not raw string
- docs/exts/docs_build/lint_checks.py:46:46: RUF039 First argument to `re.findall()` is not raw string
+ docs/exts/docs_build/lint_checks.py:46:46: RUF039 [*] First argument to `re.findall()` is not raw string
- helm_tests/airflow_aux/test_pod_template_file.py:358:26: RUF039 First argument to `re.search()` is not raw string
+ helm_tests/airflow_aux/test_pod_template_file.py:358:26: RUF039 [*] First argument to `re.search()` is not raw string
... 54 additional changes omitted for project

apache/superset (+0 -0 violations, +138 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- RELEASING/changelog.py:276:26: RUF039 First argument to `re.match()` is not raw string
+ RELEASING/changelog.py:276:26: RUF039 [*] First argument to `re.match()` is not raw string
- scripts/build_docker.py:66:23: RUF039 First argument to `re.sub()` is not raw string
+ scripts/build_docker.py:66:23: RUF039 [*] First argument to `re.sub()` is not raw string
- scripts/build_docker.py:68:23: RUF039 First argument to `re.sub()` is not raw string
+ scripts/build_docker.py:68:23: RUF039 [*] First argument to `re.sub()` is not raw string
- scripts/build_docker.py:70:23: RUF039 First argument to `re.sub()` is not raw string
+ scripts/build_docker.py:70:23: RUF039 [*] First argument to `re.sub()` is not raw string
- scripts/build_docker.py:70:51: RUF039 First argument to `re.sub()` is not raw string
+ scripts/build_docker.py:70:51: RUF039 [*] First argument to `re.sub()` is not raw string
- superset/db_engine_specs/athena.py:30:5: RUF039 First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/athena.py:30:5: RUF039 [*] First argument to `re.compile()` is not raw string
- superset/db_engine_specs/bigquery.py:76:5: RUF039 First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/bigquery.py:76:5: RUF039 [*] First argument to `re.compile()` is not raw string
- superset/db_engine_specs/bigquery.py:77:5: RUF039 First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/bigquery.py:77:5: RUF039 [*] First argument to `re.compile()` is not raw string
- superset/db_engine_specs/bigquery.py:90:5: RUF039 First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/bigquery.py:90:5: RUF039 [*] First argument to `re.compile()` is not raw string
- superset/db_engine_specs/denodo.py:29:46: RUF039 First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:29:46: RUF039 [*] First argument to `re.compile()` is not raw string
- superset/db_engine_specs/denodo.py:30:48: RUF039 First argument to `re.compile()` is not raw string
+ superset/db_engine_specs/denodo.py:30:48: RUF039 [*] First argument to `re.compile()` is not raw string
- superset/db_engine_specs/denodo.py:32:9: RUF039 First argument to `re.compile()` is not raw string
... 115 additional changes omitted for project

bokeh/bokeh (+0 -0 violations, +14 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- src/bokeh/util/strings.py:91:19: RUF039 First argument to `re.sub()` is not raw string
+ src/bokeh/util/strings.py:91:19: RUF039 [*] First argument to `re.sub()` is not raw string
- tests/unit/bokeh/io/test_export.py:203:9: RUF039 First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/io/test_export.py:203:9: RUF039 [*] First argument to `re.compile()` is not raw string
- tests/unit/bokeh/io/test_export.py:204:13: RUF039 First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/io/test_export.py:204:13: RUF039 [*] First argument to `re.compile()` is not raw string
- tests/unit/bokeh/io/test_export.py:205:13: RUF039 First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/io/test_export.py:205:13: RUF039 [*] First argument to `re.compile()` is not raw string
- tests/unit/bokeh/io/test_export.py:206:9: RUF039 First argument to `re.compile()` is not raw string
+ tests/unit/bokeh/io/test_export.py:206:9: RUF039 [*] First argument to `re.compile()` is not raw string
... 4 additional changes omitted for project

ibis-project/ibis (+0 -0 violations, +16 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- ibis/backends/__init__.py:1396:17: RUF039 First argument to `re.match()` is not raw string
+ ibis/backends/__init__.py:1396:17: RUF039 [*] First argument to `re.match()` is not raw string
- ibis/backends/flink/__init__.py:320:17: RUF039 First argument to `re.search()` is not raw string
+ ibis/backends/flink/__init__.py:320:17: RUF039 [*] First argument to `re.search()` is not raw string
- ibis/backends/sql/compilers/pyspark.py:362:27: RUF039 First argument to `re.sub()` is not raw string
+ ibis/backends/sql/compilers/pyspark.py:362:27: RUF039 [*] First argument to `re.sub()` is not raw string
- ibis/backends/tests/test_client.py:1047:13: RUF039 First argument to `re.search()` is not raw string
+ ibis/backends/tests/test_client.py:1047:13: RUF039 [*] First argument to `re.search()` is not raw string
- ibis/backends/tests/test_client.py:1068:13: RUF039 First argument to `re.search()` is not raw string
+ ibis/backends/tests/test_client.py:1068:13: RUF039 [*] First argument to `re.search()` is not raw string
... 6 additional changes omitted for project

latchbio/latch (+0 -0 violations, +8 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- src/latch_cli/centromere/ctx.py:464:26: RUF039 First argument to `re.match()` is not raw string
+ src/latch_cli/centromere/ctx.py:464:26: RUF039 [*] First argument to `re.match()` is not raw string
- src/latch_cli/services/init/init.py:309:18: RUF039 First argument to `re.search()` is not raw string
+ src/latch_cli/services/init/init.py:309:18: RUF039 [*] First argument to `re.search()` is not raw string
- src/latch_cli/services/init/init.py:316:18: RUF039 First argument to `re.search()` is not raw string
+ src/latch_cli/services/init/init.py:316:18: RUF039 [*] First argument to `re.search()` is not raw string
- src/latch_cli/services/register/register.py:59:36: RUF039 First argument to `re.compile()` is not raw string
+ src/latch_cli/services/register/register.py:59:36: RUF039 [*] First argument to `re.compile()` is not raw string

lnbits/lnbits (+0 -0 violations, +2 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- lnbits/db.py:151:34: RUF039 First argument to `re.compile()` is not raw string
+ lnbits/db.py:151:34: RUF039 [*] First argument to `re.compile()` is not raw string

pandas-dev/pandas (+0 -7 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- pandas/tests/indexes/datetimes/test_date_range.py:787:29: RUF039 First argument to `re.split()` is not raw string
- pandas/tests/io/formats/style/test_style.py:834:26: RUF039 First argument to `re.findall()` is not raw string
- pandas/tests/io/formats/test_format.py:1115:35: RUF039 First argument to `re.findall()` is not raw string
- pandas/tests/io/test_html.py:477:41: RUF039 First argument to `re.compile()` is not raw string
- pandas/tests/strings/test_find_replace.py:613:22: RUF039 First argument to `re.compile()` is not raw string
- pandas/tests/strings/test_find_replace.py:639:22: RUF039 First argument to `re.compile()` is not raw string
- web/pandas_web.py:99:31: RUF039 First argument to `re.compile()` is not raw string

pypa/build (+0 -0 violations, +4 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- tests/test_integration.py:31:21: RUF039 First argument to `re.compile()` is not raw string
+ tests/test_integration.py:31:21: RUF039 [*] First argument to `re.compile()` is not raw string
- tests/test_integration.py:32:21: RUF039 First argument to `re.compile()` is not raw string
+ tests/test_integration.py:32:21: RUF039 [*] First argument to `re.compile()` is not raw string

python/typeshed (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select E,F,FA,I,PYI,RUF,UP,W

+ stdlib/csv.pyi:148:31: W292 No newline at end of file

python-poetry/poetry (+0 -13 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- src/poetry/console/logging/formatters/builder_formatter.py:11:26: RUF039 First argument to `re.sub()` is not raw string
- src/poetry/console/logging/formatters/builder_formatter.py:13:26: RUF039 First argument to `re.sub()` is not raw string
- src/poetry/console/logging/formatters/builder_formatter.py:15:26: RUF039 First argument to `re.sub()` is not raw string
- src/poetry/console/logging/formatters/builder_formatter.py:18:17: RUF039 First argument to `re.sub()` is not raw string
- src/poetry/mixology/solutions/providers/python_requirement_solution_provider.py:22:13: RUF039 First argument to `re.match()` is not raw string
- src/poetry/mixology/solutions/providers/python_requirement_solution_provider.py:23:13: RUF039 First argument to `re.match()` is not raw string
- src/poetry/utils/dependency_specification.py:192:13: RUF039 First argument to `re.sub()` is not raw string
- tests/conftest.py:378:20: RUF039 First argument to `re.compile()` is not raw string
- tests/installation/conftest.py:43:20: RUF039 First argument to `re.compile()` is not raw string
- tests/installation/test_chooser.py:108:20: RUF039 First argument to `re.compile()` is not raw string
... 3 additional changes omitted for project

... Truncated remaining completed project reports due to GitHub comment length restrictions

Changes by rule (2 rules affected)

code    total  + violation  - violation  + fix  - fix
RUF039  292    0            20           272    0
W292    1      1            0            0      0

@MichaReiser
Member

Can you manually run your change on the poetry repo? I suspect that it panics and that this is the reason why we see fewer violations.

@dylwil3
Collaborator Author

dylwil3 commented Nov 22, 2024

> Can you manually run your change on the poetry repo? I suspect that it panics and that this is the reason why we see fewer violations.

Good catch, will do!

@MichaReiser
Member

Or is this the same as with TC006, where both poetry and pandas set fix = true?

@dylwil3
Collaborator Author

dylwil3 commented Nov 22, 2024

> Or is this the same as with TC006, where both poetry and pandas set fix = true?

Yep it's that, just checked!

@dylwil3 dylwil3 merged commit 3fda2d1 into astral-sh:main Nov 22, 2024
20 checks passed
@dylwil3 dylwil3 deleted the raw-fix branch November 22, 2024 21:09
Development

Successfully merging this pull request may close these issues.

Add autofix for RUF039
2 participants