
Uneven distribution of shards over the consumer application workers #1247

Open

omerbarlev opened this issue Jan 15, 2024 · 1 comment

omerbarlev commented Jan 15, 2024

In the DynamoDB table that holds the leases and checkpoint information, I can see that workers have an uneven distribution of shards.
In my case, for example - with 256 shards and 41 workers (pods in Kubernetes), the majority have ~6-8 shards assigned, but I also see ~5 workers with only 1-2 shards.

How can I make the distribution more even? Is there some configuration I can change for it?

This causes workers to not fully utilize their resources (the workers with 1-2 shards barely use any CPU, and I would rather have another worker with ~6-8 shards than 5 workers with 1-2 shards each).

It is worth mentioning that I am using the amazon-kinesis-client-python library.
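For context, this is roughly how the per-worker counts can be taken from the lease table - a minimal sketch with boto3, assuming the default KCL lease schema where each lease item carries a `leaseOwner` attribute (the table name below is a placeholder for the application's lease table):

```python
import boto3
from collections import Counter

dynamodb = boto3.resource("dynamodb")
# Placeholder: the KCL lease table is normally named after the consumer application.
table = dynamodb.Table("my-kcl-application")

lease_counts = Counter()
scan_kwargs = {}
while True:
    page = table.scan(**scan_kwargs)
    for item in page.get("Items", []):
        # leaseOwner is assumed to be the worker currently holding the lease.
        lease_counts[item.get("leaseOwner", "<unassigned>")] += 1
    last_key = page.get("LastEvaluatedKey")
    if not last_key:
        break
    scan_kwargs["ExclusiveStartKey"] = last_key

for worker, count in lease_counts.most_common():
    print(f"{worker}: {count} leases")
```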

@brendan-p-lynch
Contributor

How are you identifying the number of shards that are being processed on each worker?

A pattern we have seen in the past is that child shards get leases that are taken by workers, and those leases tend to be dispersed onto the workers with fewer shards. Because workers don't distinguish between shards that are active and shards that are not, the active shards are not always taken evenly.

As of right now there is no configuration to improve the distribution of shards.
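One way to check whether that is what is happening here is to split the per-worker count by whether the shard has ended - a rough sketch, assuming the lease item's checkpoint attribute is named `checkpoint` and completed shards carry the sentinel value `SHARD_END` (worth verifying against your own lease table):

```python
import boto3
from collections import defaultdict

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("my-kcl-application")  # placeholder lease table name

# Per worker: [active leases, ended (SHARD_END) leases].
per_worker = defaultdict(lambda: [0, 0])
scan_kwargs = {}
while True:
    page = table.scan(**scan_kwargs)
    for item in page.get("Items", []):
        owner = item.get("leaseOwner", "<unassigned>")
        ended = item.get("checkpoint") == "SHARD_END"
        per_worker[owner][1 if ended else 0] += 1
    last_key = page.get("LastEvaluatedKey")
    if not last_key:
        break
    scan_kwargs["ExclusiveStartKey"] = last_key

for worker, (active, ended) in sorted(per_worker.items(), key=lambda kv: -kv[1][0]):
    print(f"{worker}: {active} active, {ended} ended")
```

If the workers holding only 1-2 leases are mostly holding active leases while others also hold several SHARD_END leases, the active work may be more balanced than the raw lease counts suggest.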
