Is your feature request related to a problem? Please describe.
Currently, the DynamoDB source acquires up to 150 active shards in a single Data Prepper container and holds each one until the shard is closed and the end of its shard iterator is reached. That happens either after 4 hours or once the shard holds a certain amount of data.
As a result, a DynamoDB table whose stream has a large number of shards needs many Data Prepper containers (a minimum of shard count / 150) to achieve low latency on the DDB stream data, regardless of how much data is actually being sent to the stream.
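To make the "shard count / 150" minimum concrete, here is a small sketch of the arithmetic. The function name `min_containers` is hypothetical and not part of Data Prepper; the limit of 150 is the figure quoted above, and rounding up is an assumption (a partially filled container still counts as one container).

```python
import math

# Per-container shard limit as stated in this issue.
SHARDS_PER_CONTAINER = 150


def min_containers(shard_count: int, shards_per_container: int = SHARDS_PER_CONTAINER) -> int:
    """Smallest number of containers that can hold every active shard at once.

    Hypothetical helper for illustration only.
    """
    if shard_count < 0:
        raise ValueError("shard_count must be non-negative")
    # Round up: 151 shards do not fit in one container of 150.
    return math.ceil(shard_count / shards_per_container)
```

For example, a stream with 1,000 active shards would need at least 7 containers under this assumption, even if the stream carries very little data.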
Describe the solution you'd like
A single Data Prepper container should take ownership of a shard, process it for some time, and checkpoint it with a sequence number before giving up that shard and moving to the next one. This would allow one Data Prepper container to process all of the shards in a DynamoDB stream in a somewhat timely manner, with the trade-off that latency may be slightly higher when using a large number of Data Prepper containers.
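The proposed behavior can be sketched roughly as follows. This is a minimal in-memory simulation, not Data Prepper code and not the DynamoDB Streams API: the `Shard` class, `process_slice`, and `rotate_through_shards` are hypothetical names, a per-visit record budget stands in for "process it for some time", and the loop stops once the simulated shards are drained, whereas real open shards keep receiving data.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Shard:
    """Hypothetical stand-in for a stream shard."""
    shard_id: str
    records: list          # ordered (sequence_number, payload) pairs
    checkpoint: int = -1   # last processed sequence number; -1 means none yet


def process_slice(shard: Shard, max_records: int, sink: list) -> bool:
    """Process up to max_records past the checkpoint, checkpointing as we go.

    Returns True if the shard still has unprocessed records.
    """
    pending = [r for r in shard.records if r[0] > shard.checkpoint]
    for seq, payload in pending[:max_records]:
        sink.append((shard.shard_id, payload))
        shard.checkpoint = seq  # record progress so a later visit can resume
    return len(pending) > max_records


def rotate_through_shards(shards: list, max_records_per_slice: int) -> list:
    """One worker visits every shard in turn instead of holding a fixed set."""
    queue = deque(shards)
    sink = []
    while queue:
        shard = queue.popleft()  # take ownership of the next shard
        if process_slice(shard, max_records_per_slice, sink):
            queue.append(shard)  # give the shard up and revisit it later
    return sink
```

With two shards and a budget of two records per visit, the worker processes part of the first shard, moves on to the second, and then returns to finish the first from its checkpoint. The cost of this design is the latency trade-off mentioned above: records in a shard wait while the worker is busy with other shards.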
Describe alternatives you've considered (Optional)
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.