Kedro on PyCafe is sort of possible #4271

antonymilne · 2024-10-30T20:43:00Z

Greetings friends 😀 and FYI @maxschulz-COL @maartenbreddels

Just wanted to report that it is sort of possible to run Kedro in the browser using PyCafe! ☕ 🚀 There's a couple of workarounds needed but for the spaceflights project at least it's not too hard. Take a look here for my example project: https://py.cafe/antonymilne/kedro-vizro-dashboards

I did kedro new with kedro==0.19.9 and used the spaceflights example code (the project is in the folder descriptively named blah). The project is run in app.py. At the moment PyCafe requires some kind of app so there's also a simple Vizro app just so that the kedro code can run. Of course in reality this could be an actual interesting dashboard reporting results of the pipeline.
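As a rough illustration of what such an app.py could look like, here is a minimal sketch. It is not the actual file from the linked project. It assumes the Kedro project lives in the folder blah next to app.py and uses Kedro's public session API (bootstrap_project and KedroSession) plus a one-card Vizro dashboard. The function names run_kedro_project and build_placeholder_dashboard and the card text are made up for this example.

```python
from pathlib import Path

# Assumption: the Kedro project folder sits next to app.py and is called "blah".
PROJECT_PATH = Path("blah")


def run_kedro_project(project_path: Path = PROJECT_PATH):
    """Run the default pipeline of the Kedro project at project_path."""
    # Imported lazily so that this module can be loaded before Kedro is usable.
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    # Read the project metadata and configure settings and pipelines.
    bootstrap_project(project_path)
    with KedroSession.create(project_path=project_path) as session:
        # Hand back whatever session.run() returns.
        return session.run()


def build_placeholder_dashboard():
    """Build a trivial Vizro dashboard so that PyCafe has an app to serve."""
    import vizro.models as vm

    page = vm.Page(
        title="Kedro on PyCafe",
        components=[vm.Card(text="The Kedro pipeline has finished running.")],
    )
    return vm.Dashboard(pages=[page])


# Hypothetical usage at the bottom of app.py:
#
#     from vizro import Vizro
#
#     run_kedro_project()
#     Vizro().build(build_placeholder_dashboard()).run()
```

The pipeline runs first and the dashboard is only built afterwards, which matches the order described above. A real dashboard would read the pipeline outputs instead of showing a static card.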

These are the catches/workarounds that I noticed with this simple example. No doubt there will be further difficulties for more realistic projects, and I didn't even try to install all the requirements in blah/requirements.txt; I just took what looked like the bare minimum to get the pipeline to run.

No wheel file is available for the version of antlr4-python3-runtime that omegaconf currently requires. This is fixed in omegaconf 2.4.0 but that's only in dev release. I see that you're aware of this difficulty already in "Relax dependency on antlr-python3-runtime" (facebookresearch/hydra#2699) and "Release schedule v2.4" (omry/omegaconf#1158). So this is why I set omegaconf==2.4.0.dev3 in the requirements.txt, which appears to work well.
pre-commit-hooks has ruamel-yaml-clib as a transitive dependency, which doesn't have a wheel file available. I was actually quite surprised to see pre-commit-hooks as a kedro requirement, and I see this was a bit of a controversial addition at the time ("Sort requirements.txt based on package name only", #3436). It's easy to fix if you don't actually want to run pre-commit though: just put ruamel-yaml-clib # mock in requirements.txt.
Can't remember exactly what it was but I had a problem with some of the datasets in catalog.yml, so I only used pandas.CSVDataset or pandas.ExcelDataset.
To avoid No module named '_multiprocessing' I mocked out the parallel runner.
By default the kedro pipeline runs fine and then just starts again and again once it's finished, so it will never get to the Vizro app part. I guess this might be because running the pipeline produces files, which are then detected as a filesystem change by PyCafe, which causes the app to refresh or something like that? This happens even with "Save on Type" set to Off. To solve this I've just commented out all the output datasets in catalog.yml so the pipeline only runs twice (not sure why it's twice rather than once but it's not infinite now anyway 😅 ...). If you uncomment the output datasets in catalog.yml it will just execute on loop. @maartenbreddels do you know what's happening here?

antonymilne · 2024-10-30T20:44:49Z

P.S. this isn't really a feature request but it didn't fit in any other category so there you are... I guess the feature request might be please make it easier to not use parallel runner, omegaconf or pre-commit-hooks but this is kind of an edge case I guess and it's possible already with some workarounds, so I don't think you need to do anything about it for my purposes anyway. Maybe it's another small data point for @astrojuanlu's considerations on how modular kedro should be though. The biggest problem here I think is the omegaconf dependency but if and when they release 2.4.0 that will be resolved for this particular setup anyway.

Mainly I just put this here as a report of what's currently possible since there was nowhere better to put it.

Edit: oh wait, I see that discussions are open now. Maybe I should have put it there. I leave it up to you to decide whether you want to move it anyway!

antonymilne added the "Issue: Feature Request" label (New feature or improvement to existing feature) Oct 30, 2024

kedro-org locked and limited conversation to collaborators Oct 31, 2024

astrojuanlu converted this issue into discussion #4278 Oct 31, 2024
