Performance limits for partitions #20802
Replies: 2 comments
Hi, I just stumbled on this post and I was wondering how to mitigate this problem. The asset is partitioned in 5-minute increments because we want to trigger a job every 5 minutes. Thanks for any advice!
Hello, I'm also having thoughts about this. I'm new to Dagster, but I think this could be resolved with a job that consolidates your old partitions into a single one. Here's an idea: a partition compaction job.

For example, say you create a new orchestration partition every minute. You proceed as usual and ensure the data is stored properly in your final storage. Then a job runs at midnight every day (or every hour) and compacts the previous partitions into one. I believe this would need to use something like a sensor. That sensor could publish AddPartitionRequest and DeletePartitionRequest events, which could be processed immediately or on demand via the UI. There should be a way to tag those requests with a "compaction" label, and downstream jobs would need to be considered as well.

Would that work?
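To illustrate the compaction idea above, here is a minimal, framework-free sketch of the grouping step. The helper `plan_compaction` and the minute-level key format are assumptions, not Dagster APIs: it only computes which fine-grained partition keys would collapse into which daily key, which a compaction job could then translate into add/delete partition requests.

```python
from collections import defaultdict
from datetime import datetime

def plan_compaction(partition_keys, key_format="%Y-%m-%d-%H:%M"):
    """Group fine-grained (e.g. per-minute) partition keys by calendar day.

    Returns a mapping of daily key -> list of original keys. A compaction
    job could use this plan to emit one "add daily partition" request and
    many "delete minute partition" requests per day. The key format is a
    hypothetical convention, not something Dagster prescribes.
    """
    groups = defaultdict(list)
    for key in partition_keys:
        day = datetime.strptime(key, key_format).strftime("%Y-%m-%d")
        groups[day].append(key)
    return dict(groups)

# Example: three minute-level partitions collapse into two daily partitions.
keys = ["2024-01-01-00:00", "2024-01-01-00:01", "2024-01-02-00:00"]
plan = plan_compaction(keys)
# plan == {"2024-01-01": ["2024-01-01-00:00", "2024-01-01-00:01"],
#          "2024-01-02": ["2024-01-02-00:00"]}
```

In real Dagster code, a sensor over a dynamic partitions definition could return this plan as dynamic-partition add/delete requests; the sketch above deliberately leaves that wiring out.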
Assets with over 25,000 partitions can cause performance issues in Dagster.
What performance is impacted?
The main impact is to the Dagster UI. Specifically, the UI will be slower when loading the asset graph and asset detail pages.
Is this a hard limit?
No. You won't see errors if your asset has over 25,000 partitions. But you will likely see slower UI performance.
Does this limit apply across all assets, or per-asset?
This applies per asset. For example, 25 assets with 1,000 partitions each will have normal UI performance, but one asset with 25,000 partitions will result in a slower UI.
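To make the per-asset threshold concrete for the 5-minute partitioning mentioned earlier in the thread, a quick back-of-the-envelope calculation (the 25,000 figure is the soft limit stated above; everything else is plain arithmetic):

```python
# A 5-minute partition scheme produces 288 partitions per day.
PARTITIONS_PER_DAY = 24 * 60 // 5

# Soft limit at which Dagster UI performance degrades (per asset).
SOFT_LIMIT = 25_000

# Days of history before a single 5-minute-partitioned asset
# crosses the threshold: roughly 87 days, i.e. under three months.
days_to_limit = SOFT_LIMIT / PARTITIONS_PER_DAY
```

This is why compaction (or a coarser partition grain) matters: a per-minute scheme would hit the same threshold about five times faster.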
Are specific kinds of partitions less performant than others?
No. Performance is generally the same across partition types.