Reactive execution graphs: expressing event-based dependencies #3464
Replies: 3 comments
-
Devil's advocate: one could achieve the same by doing exactly this in combination with a conditional, right? Why is this solution not good enough?
I think this is a very interesting feature. A while back I prototyped a workflow trigger that starts an execution when newly labeled data is added to a bucket. For that I used GCP Pub/Sub notifications that fire when such data is added, and a Cloud Function subscribed to Pub/Sub that used FlyteRemote to trigger the workflow. That worked, but it would be very nice if such external triggers could be implemented purely in Flyte.
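For readers who want to picture that prototype, here is a minimal sketch of what such a Cloud Function could look like. It is not the original prototype: the project, domain, workflow name, and the `data_uri` input are hypothetical, and it assumes the bucket notification carries the usual `eventType`, `bucketId`, and `objectId` attributes.

```python
from typing import Callable, Mapping, Optional


def object_from_notification(event: Mapping) -> Optional[str]:
    """Return the gs:// URI of a newly finalized object, or None for any other event."""
    attrs = event.get("attributes") or {}
    if attrs.get("eventType") != "OBJECT_FINALIZE":
        return None
    return f"gs://{attrs['bucketId']}/{attrs['objectId']}"


def launch_with_flyte_remote(uri: str):
    """Start a workflow execution through FlyteRemote (all names are hypothetical)."""
    # Imported lazily so the parsing logic can be exercised without flytekit installed.
    from flytekit.configuration import Config
    from flytekit.remote import FlyteRemote

    remote = FlyteRemote(
        config=Config.auto(),
        default_project="my_project",  # hypothetical project
        default_domain="development",  # hypothetical domain
    )
    # Hypothetical workflow name; no version is pinned here.
    workflow = remote.fetch_workflow(name="workflows.ingest_labels.wf")
    return remote.execute(workflow, inputs={"data_uri": uri})


def handle_pubsub_event(
    event: Mapping,
    context=None,
    launch: Callable[[str], object] = launch_with_flyte_remote,
):
    """Pub/Sub-triggered entry point: launch the workflow only for new objects."""
    uri = object_from_notification(event)
    if uri is None:
        return None  # ignore deletes, metadata updates, etc.
    return launch(uri)
```

Passing `launch` as a parameter keeps the event filtering testable on its own; the downside of the whole setup, as noted above, is that the trigger lives outside Flyte.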
-
Hey!
Use case: We have a workflow whose first task (head node) depends on upstream files written to S3. These files can arrive anytime between midnight and 5am, or later; delays are not uncommon.
How we used to handle delayed external dependencies:
If the external dependencies are not satisfied, then, depending on our retry policy (
Something to note here is that the workflow won't fail if the external dependencies are not fulfilled. Most of the time the external dependencies are just delayed, not failed. Plus, we wouldn't want to have 20+ Flyte workflows failing many times over.
Problem:
This solution doesn't look suitable for complex pipelines with many delayed external dependencies.
Desired behaviour:
On a schedule kick-in:
The approach we're suggesting is more of a poll/pull approach, compared to the push/event-based one suggested by @cosmicBboy.
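To make the poll/pull idea concrete, here is a minimal sketch of a polling wait, under the assumption that "dependency satisfied" means "all expected paths exist". The function name, parameters, and example paths are hypothetical, and the existence check is injected, so nothing here depends on a particular S3 client or on Flyte.

```python
import time
from typing import Callable, Iterable, List


def wait_for_dependencies(
    paths: Iterable[str],
    exists: Callable[[str], bool],
    timeout_s: float,
    poll_interval_s: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until every path exists or the timeout elapses.

    Returns the paths that are still missing; an empty list means all
    dependencies were satisfied. A delay never raises, so the caller can
    decide whether a timeout counts as a failure.
    """
    deadline = clock() + timeout_s
    missing = [p for p in paths if not exists(p)]
    while missing and clock() < deadline:
        sleep(poll_interval_s)
        # Only re-check paths that were missing on the previous pass.
        missing = [p for p in missing if not exists(p)]
    return missing
```

For S3, `exists` could wrap an object-metadata lookup. The head node would then run only when the returned list is empty, which separates "delayed" from "failed" without burning retries.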
-
2023-11-09 Contributors' meetup notes: more investigation is needed to confirm if using Airflow Operators enables what’s described in this entry.
-
This is an RFC draft for supporting event-based dependencies between workflows in Flyte.
User Story
As a Flyte workflow author, I want to kick off a task/workflow based on an event so that I can establish flexible dependencies between tasks/workflows without having to compose them into yet another workflow.
Use Case: Model Re-training
Suppose I have three workflows:
model_training: takes data/hyperparams as input, outputs a trained model
assess_model_drift: takes the trained model and assesses it for data/concept drift using the latest data
deploy_model: deploys model to some external serving layer
When assess_model_drift runs, it emits an output value that indicates whether the model needs to be re-trained. If this workflow returns True, the model_training workflow should be executed.
Example Code
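As a rough illustration of the use case above, here is a plain-Python sketch of the dependency between `assess_model_drift` and `model_training`. It is not the API proposed in this RFC and uses no Flyte code. `ReactiveRegistry`, `register`, `on_output`, and `run` are hypothetical names, and the workflows are stubbed as simple callables.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple


class ReactiveRegistry:
    """Toy in-process registry: run downstream workflows based on upstream outputs."""

    def __init__(self) -> None:
        self._workflows: Dict[str, Callable[..., Any]] = {}
        self._subs: Dict[str, List[Tuple[Callable[[Any], bool], str]]] = defaultdict(list)

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._workflows[name] = fn

    def on_output(self, upstream: str, predicate: Callable[[Any], bool], downstream: str) -> None:
        """Declare: when `upstream` finishes and `predicate(output)` holds, run `downstream`."""
        self._subs[upstream].append((predicate, downstream))

    def run(self, name: str, *args: Any) -> List[str]:
        """Run a workflow, then any triggered downstreams; return names in execution order."""
        executed = [name]
        output = self._workflows[name](*args)
        for predicate, downstream in self._subs[name]:
            if predicate(output):
                # Downstreams take no inputs in this sketch; cycles are not guarded against.
                executed.extend(self.run(downstream))
        return executed


registry = ReactiveRegistry()
registry.register("model_training", lambda: "trained-model")      # stub workflow
registry.register("assess_model_drift", lambda drifted: drifted)  # stub: returns the drift flag
registry.on_output("assess_model_drift", lambda out: out is True, "model_training")
```

The point of the sketch is that the two workflows never reference each other: the dependency is declared separately, keyed on the upstream's output value, rather than by composing both into a parent workflow.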
Event Types
True or False)
bool or str
Resources