Metadata files generated with RayTaskRunner #16009
Comments
Hey @dqueruel-fy - those files are a consequence of persisting task and flow results.
This setting takes effect at workflow runtime, so setting it on the server will have no effect (all server configuration is prefixed with …). For more information, check out the documentation on results and settings.
hi @dqueruel-fy - yes this sounds like expected behavior, that metadata is your serialized result
» PREFECT_LOCAL_STORAGE_PATH=/tmp/result ipython
In [1]: from prefect import task
In [2]: @task(persist_result=True)
   ...: def f():
   ...:     return 42
   ...:
In [3]: f()
16:35:23.491 | INFO | Task run 'f' - Finished in state Completed()
Out[3]: 42
In [4]: !ls /tmp/result
109c10d275731f842f4b08dd51b397aa
when you say
... was about to type the same as @cicdw above, nevermind 🙂
@zzstoatzz @cicdw thanks for your quick answers! I do understand that the files need to be generated, but I don't understand why they are generated in my code base. My mention of … These files are generated from where I call the Python scripts. Have you tried my minimal Python script and run it from, let's say, … I guess that the expected behavior is to have these files generated in the …
Ah I think I have a suspicion for what's going on! A few details (some of which are repetitive just for completeness' sake):
If my suspicion is correct that you are only applying this setting on the "parent" process that executes the flow and not on the Ray workers, the easiest solution is probably to use a …
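The suspicion above rests on environment variables being per-process: a value set for the process that launches the flow is not automatically visible to a worker process that was started with a different environment. The following is a minimal standard-library sketch of that general behavior only. It does not use Prefect or Ray, and `read_in_child` and `demo` are hypothetical helper names introduced for illustration.

```python
import os
import subprocess
import sys

VAR = "PREFECT_LOCAL_STORAGE_PATH"
# Code run in the child process: print the variable, or a marker if it is missing.
CHILD_CODE = f"import os; print(os.environ.get({VAR!r}, '<unset>'))"


def read_in_child(env):
    """Start a fresh Python process with the given environment and report what it sees."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


def demo():
    """Compare a child that inherits the setting with one started without it."""
    # Environment of the "parent" side, with the setting applied.
    parent_env = dict(os.environ, **{VAR: "/tmp/test"})
    # Environment of a stand-in "worker" that was started without the setting.
    worker_env = {k: v for k, v in os.environ.items() if k != VAR}
    return read_in_child(parent_env), read_in_child(worker_env)


if __name__ == "__main__":
    print(demo())
```

The first process sees the configured path and the second does not, which is the situation the comment describes for workers that never received the setting.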
Thanks @cicdw for your insight! So I've tested again with using …
% prefect config view
🚀 you are connected to:
http://127.0.0.1:4200
PREFECT_PROFILE=<profile>
PREFECT_API_URL='http://127.0.0.1:4200/api' (from profile)
PREFECT_LOCAL_STORAGE_PATH='/tmp/test' (from profile)
And when running my example scripts, I still have one file generated to … I've also tried providing the env var to the RayTaskRunner like this, but that didn't help.
@flow(log_prints=True, persist_result=True, task_runner=RayTaskRunner(init_kwargs={"runtime_env": {"env_vars": {"PREFECT_LOCAL_STORAGE_PATH": "/tmp/test"}}}))
def myFlow():
    ...
Could you provide more information on how to use the …
hi @dqueruel-fy - I am looking into this now (this seems like a bug). it looks like when in ray, the task is unable to discover the parent context's result store and falls back to a default, relative path. will update hopefully soon! I don't think …
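A fallback to a relative path would explain why the files appear wherever the script is launched: a relative path is resolved against the current working directory of the process doing the write. Here is a small sketch of that mechanism using only the standard library. It is not Prefect's result store; `write_result`, the `results` directory name, and the `abc123` file name are hypothetical stand-ins.

```python
import os
import tempfile
from pathlib import Path
from typing import Tuple


def write_result(relative_dir: str, name: str, payload: str) -> Path:
    """Hypothetical stand-in for a store that was handed a relative path."""
    target_dir = Path(relative_dir)  # relative, so resolved against the cwd at write time
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_text(payload)
    return target.resolve()  # absolute location where the file actually landed


def demo() -> Tuple[Path, Path]:
    """Write to the same relative path from two different working directories."""
    original_cwd = Path.cwd()
    with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
        try:
            os.chdir(dir_a)
            first = write_result("results", "abc123", "42")
            os.chdir(dir_b)
            second = write_result("results", "abc123", "42")
        finally:
            # Leave the temporary directories before they are cleaned up.
            os.chdir(original_cwd)
    # Return the working directory each file ended up under.
    return first.parent.parent, second.parent.parent


if __name__ == "__main__":
    print(demo())
```

The same relative path yields two different absolute locations, one per working directory, which matches the observation that the files follow the directory the script is called from.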
Bug summary
Issue description
I don't know if it's a bug or desired behavior, but some metadata files are generated each time I run my flows locally. That's annoying because the files are generated in my source directory (or wherever I run the flows/tasks from). I'd like to have more info, please, on what these files are and whether we can generate them somewhere else or, ideally, not generate them at all.
It generates files with filenames like 89e55eaee58e8ce3567e87801196d9d5 in the same folder that I call the Python script from (see below), with the following content:
The minimal reproducible Python script is
Version info
Additional context
Some notes:
… persist_result to False.
… 3.0.0rc14 to 3.1.1 in my code base, and I reproduced it in this minimal example.
… PREFECT_LOCAL_STORAGE_PATH to /tmp/result but it didn't help