Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cell seems to execute prematurely [regression] #2246

Closed
proycon opened this issue Sep 6, 2024 · 4 comments · Fixed by #2254
Closed

Cell seems to execute prematurely [regression] #2246

proycon opened this issue Sep 6, 2024 · 4 comments · Fixed by #2254
Labels
bug Something isn't working

Comments

@proycon
Copy link

proycon commented Sep 6, 2024

Describe the bug

Certain cells seems to execute prematurely since v0.8.5 (reproduces also with the latest v0.8.11). This issue came to light during #2231, but as it seems an independent issue and separate regression, I therefore add a new report for it with some extra details I found debugging this (see code to reproduce).

Environment

Using marimo v0.8.5 and above

{
  "marimo": "0.8.5",
  "OS": "Linux",
  "OS Version": "6.10.7-arch1-1",
  "Processor": "",
  "Python Version": "3.12.5",
  "Binaries": {
    "Browser": "--",
    "Node": "v22.8.0"
  },
  "Requirements": {
    "click": "8.1.7",
    "importlib-resources": "missing",
    "jedi": "0.19.1",
    "markdown": "3.6",
    "pymdown-extensions": "10.8.1",
    "pygments": "2.18.0",
    "tomlkit": "0.13.0",
    "uvicorn": "0.30.1",
    "starlette": "0.37.2",
    "websockets": "12.0",
    "typing-extensions": "missing",
    "ruff": "0.5.2"
  }
}

Code to reproduce

My notebook at https://github.com/knaw-huc/brieven-van-hooft-notebook starts by downloading some (fairly large) data, up until v0.8.5 this worked fine. But now it often fails with this error (at commit knaw-huc/brieven-van-hooft-notebook@7df70ea ):

Traceback (most recent call last):                                                                                     
  File "/home/proycon/work/brieven-van-hooft-notebook/env/lib/python3.12/site-packages/marimo/_runtime/executor.py", line 170, in execute_cell                                                                                                 
    exec(cell.body, glbls)                      
  File "/tmp/marimo_5466/__marimo__cell_Xref_.py", line 3, in <module>                                                 
    available_datasets = [ x.id() for x in store.datasets() ]                                                          
                                           ^^^^^                                                                                                                                                                                               
NameError: name 'store' is not defined
Traceback (most recent call last):                                                                                     
  File "/home/proycon/work/brieven-van-hooft-notebook/env/lib/python3.12/site-packages/marimo/_runtime/executor.py", line 170, in execute_cell                                                                                                 
    exec(cell.body, glbls)           
  File "/tmp/marimo_5466/__marimo__cell_BYtC_.py", line 2, in <module>                                                 
    dataset_metadata = store.dataset("brieven-van-hooft-metadata")                                                     
                       ^^^^^                               
NameError: name 'store' is not defined

Two cells seem to get executed before the variable is properly set, this worked fine in at least v0.8.4 and v0.7.7.

On restarting marimo everything works fine because the data will have been downloaded already. Remove the file
hoof001hwva.output.store.stam.json in the current working directory to reproduce the issue again.

I added some extra checks to my notebook to pinpoint and try to resolve this issue (knaw-huc/brieven-van-hooft-notebook@7df70ea). In an earlier version (at commit knaw-huc/brieven-van-hooft-notebook@2fab4f2) the error was:

Traceback (most recent call last):                                                                                                                                                                                                             
  File "/home/proycon/work/brieven-van-hooft-notebook/env/lib/python3.12/site-packages/marimo/_runtime/executor.py", line 170, in execute_cell                                                                                                 
    exec(cell.body, glbls)                                                                                                                                                                                                                     
  File "/tmp/marimo_4017146/__marimo__cell_lEQa_.py", line 3, in <module>                                                                                                                                                                      
    store = stam.AnnotationStore(file="hoof001hwva.output.store.stam.json")                                                                                                                                                                    
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                                                                                    
stam.PyStamError: [StamError] DeserializationError: Deserialization failed: EOF while parsing a string at line 2548242 column 16   

This error indicates that this JSON file was being loaded whilst it wasn't fully downloaded yet (because the cell ran prematurely). In the aforementioned commit I tried to work around this but as you see the problem still persists and didn't occur in earlier versions, leading me to conclude it is a regression in marimo itself.

@akshayka
Copy link
Contributor

akshayka commented Sep 6, 2024

Thanks for reporting, and sorry your notebook isn't working any longer.

I would be surprised if this were a recent regression in marimo, because the runtime logic hasn't changed.

I took a look at your notebook and it looks like there are various branches which determine whether or not a variable is defined. In particular store may not be defined if certain conditions are not met.

I would suggest refactoring your notebooks to ensure that variables are unconditionally defined. That will help determine whether cells are being run prematurely as you suggest, or whether there is some other bugg.

@dmadisetti
Copy link
Collaborator

Possibly related to this: #1509

I addressed it at the time with strict execution which prevented the cell from running if something was not defined (but it didn't solve the underlying problem). I assumed it was an issue with stop breaking the tree at the time, but might just be a topology sorting issue.

For what it's worth, that's the only notebook that had had that behavior for me

@dirkroorda
Copy link

The execution order as derived from which cells define and use global variables is not, in general, a total order, and where cells are not comparable, the execution order is not defined, and Marimo can choose its own order. I have seen different orders in different runs.

@proycon Have you tried to add "dummy" variables to some of the cells to enforce a total ordering, and does that make the problem disappear?

proycon added a commit to knaw-huc/brieven-van-hooft-notebook that referenced this issue Sep 6, 2024
…tore`, made other cells mo.stop if store is not loaded

Ref: marimo-team/marimo#2246
@proycon
Copy link
Author

proycon commented Sep 6, 2024

Thanks for all the suggestions, I refactored things a bit now and also moved all download and loading logic into one cell.

Still, I sometimes get the error that the JSON file download is not successful (the checksum fails), but since all the logic is now in one cell we can rule out marimo being the cause here and I may even have to look into possible networking problems. I'll close this issue. It remains a bit odd that I never encountered this issue this with older versions of Marimo, but the behaviour doesn't seem to be entirely deterministic (the more reason to consider networking issues).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants