-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Sort state client cache keys before flush #1856
base: main
Are you sure you want to change the base?
Conversation
in some concurrent scenarios several cache keys can be updated by several routines, thus, in order to avoid the deadlocks (in postgresql for example) we should update keys in sorted order.
Hi @askurydzin 👋 Thanks for the PR. Do you mind opening an issue reproducing the scenario you're experiencing? That would help us understand the fix better. You can open the issue in our main repository. Alternatively can you add a unit test that's failing without this fix (e.g. one that uses the state client and has a deadlock in Postgres)? |
Hello, @erezrokah! It would be quite hard to reproduce it as it requires writing some demo plugin from scratch which uses state client. Would it be fine if I just briefly explain why primary keys for db needed to be sorted during insert/update? When it goes about two concurrent transaction T1 with set of keys { K1, K2, K3 } and T2 with set of keys T2 { K2, K3, K4 }, and we don't sort keys, following issue happens in postgres destination:
In the end both transactions appear to be in a dead-lock state. So, sorting state table primary keys before inserting allows to avoid such situations. |
Is the issue not reproducible on any of our existing plugins? I think it's better to have a reproduction otherwise we can't confirm the fix is working. |
@erezrokah no, I think it's not reproducible in any of existing plugins as I'm not sure if any of existing plugins (that I may have an access to) use state client. I stumbled across the issue when writing my own plugin (it's private). The issue is quite straight-forward, but if existence of the plugin vulnerable to the issue is the case for this SDK, I cannot write some demo plugin because it wouldn't be an existing plugin per se, and it may seem too artificial example. Hence, closing the PR. Sorry for wasting your time. |
Not wasting anyone's time that's for sure. We appreciate the feedback. There's a public plugin with a state client here https://github.com/cloudquery/cloudquery/blob/e94dda546631c5ef801852bbae997ad11f615f2e/plugins/source/hackernews/resources/services/items/items_fetch.go#L34 To clarify the request is not for one of our own plugins to reproduce this, but to have a test that fails without the fix, and passes with the fix, or a minimal reproduction scenario that shows the bug (maybe a very trimmed down version of your own plugin without any private code). |
@erezrokah Hm, if the run of the mentioned plugin consistently emits the set of keys that may overlap, I think running it in parallel may reproduce the thing, then give me some time and I'll check if it work. |
Hi @askurydzin, where you able to write a unit test that reproduces the issue? |
@erezrokah sorry, not yet, will provide steps to reproduce this week. |
@erezrokah I've prepared a repo with demo plugin with somewhat ridiculous logic, but it's aimed to demonstrate the deadlocks occurring with state-table overlapping keys: |
Summary
in some concurrent scenarios several cache keys can be updated by several routines, thus, in order to avoid the deadlocks (in postgresql for example) we should update keys in sorted order.