Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes on Jupyter notebooks cause panics when fix replaces/deletes multiple cells #14445

Open
beskep opened this issue Nov 19, 2024 · 7 comments
Labels
bug Something isn't working notebook Related to (Jupyter) notebooks

Comments

@beskep
Copy link

beskep commented Nov 19, 2024

  • command: ruff check .\src\test.ipynb --fix --diff
    (No panic when --isolated included)

  • ruff version: 0.7.4

  • pyproject.toml

[project]
requires-python = ">=3.13"
dependencies = ["ruff>=0.7.4"]

[tool.ruff]
preview = true

[tool.ruff.lint]
select = ["W"]
@dylwil3
Copy link
Collaborator

dylwil3 commented Nov 19, 2024

Thanks so much for this!

This can be reproduced with a notebook that just consists of three empty cells. We are somehow counting empty cells as newlines and then erroring when we try to "delete" them.

@dylwil3 dylwil3 added bug Something isn't working notebook Related to (Jupyter) notebooks labels Nov 19, 2024
@dylwil3
Copy link
Collaborator

dylwil3 commented Nov 19, 2024

Actually this bug has a wider blast radius.

If you make a notebook with three cells like this:

a = [1]
a.append(2)
# new cell
a.append(3)
# new cell
a.append(4)

and apply the unsafe fix for FURB118 you also get a panic.

@dylwil3 dylwil3 changed the title [Linter panic] when fixing W391 in Jupyter notebook Fixes on Jupyter notebooks cause panics when fix replaces/deletes multiple cells Nov 19, 2024
@dhruvmanila
Copy link
Member

I'm guessing that cell deletion is creating a problem when update_cell_offsets and update_cell_contents are being called.

@dhruvmanila
Copy link
Member

The root cause is because the edit spans across the cell boundary which is an invariant we like to follow. In the case of FURB113, the diagnostics is between 3 cells and thus the edit is as well.

Refer to how the cell offsets are being updated (incorrectly):

// Original cell offsets
[crates/ruff_notebook/src/notebook.rs:229:9] &self.cell_offsets = CellOffsets(
    [
        0,
        20,
        32,
        44,
    ],
)

// The edit
[crates/ruff_linter/src/fix/mod.rs:97:13] edit = Edit {
    range: 8..43,
    content: Some(
        "x.extend((2, 3, 4))",
    ),
}

// Source map created for the edit
[crates/ruff_notebook/src/notebook.rs:230:9] source_map = SourceMap(
    [
        SourceMarker {
            source: 8,
            dest: 8,
        },
        SourceMarker {
            source: 43,
            dest: 27,
        },
    ]
)

// After the above source map is applied to update the cell offsets
[crates/ruff_notebook/src/notebook.rs:263:9] &self.cell_offsets = CellOffsets(
    [
        0,
        20,
        32,
        28,
    ],
)

I don't this rule existed when I implemented this.

@dhruvmanila
Copy link
Member

So, I think we need to consider replacement and deletion across cell boundaries. Insertion is fine because it's at an exact position which isn't going to affect this logic. There's some details on how the source markers are being used to update the cell offsets in the PR description: #4665.

@dylwil3
Copy link
Collaborator

dylwil3 commented Nov 19, 2024

Oh that's neat- and nice drawing! For deletions we could just delete the cell and adjust the cell offsets (so there will be fewer cell offsets than there used to be). What do you think?

For replacements it's a little less clear what to do, since folks separate things across cells in order to control what executes when.

@dhruvmanila
Copy link
Member

For deletions we could just delete the cell and adjust the cell offsets (so there will be fewer cell offsets than there used to be). What do you think?

The thing is we don't really know the kind of edit when trying to update the cell offsets. The main reason for adding the SourceMap abstraction is to avoid passing this information and it would also help in adding multi-language support in Ruff. So, in update_cell_offsets you only have access to the list of SourceMarker and we don't really know which two source markers makes up an edit.

I think we'll need a generic solution which just looks at source markers to determine that certain cell offsets needs to be removed. I'm not exactly sure what this looks like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working notebook Related to (Jupyter) notebooks
Projects
None yet
Development

No branches or pull requests

3 participants