-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added proposal for finalizing distributed lock building block #43
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: ItalyPaleAle <[email protected]>
Overall LGTM with a preference for option 1 + callback API for lost locks. It would be great if we could utilize existing etags to create a generic mechanism that works across all etag supported state stores, as it would provide the following major benefits:
|
May I ask the reasoning for preferring option 1?
This is certainly possible. My proposal here was to use actors. They offer a higher-level abstraction over just state stores, with better concurrency control (not optimistic, reducing the likelihood of conflicts that need to be retried), and support for things like reminders to perform a cleanup. Also, dogfooding is always good :) |
@ItalyPaleAle can you also copy the proposal text directly into the description here? Makes it easier to read :) |
I would rather not, as then they'd need to be kept in sync. I also prefer if people leave comments in the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, some comments from me.
I'm more in favor of option 2- I think from a programing perspective this feels a lot more natural as you are acquiring a lock "object" in code which you can pass around etc. Are we concerned about clients not supporting HTTP/2? Are there any SDKs which don't?
|
||
message LockRequest { | ||
// When set to true, if the resource is already locked, the method returns right away with an error, and does not wait for the resource to become available. | ||
bool immediate = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't immediately (pun not intended) clear to me what this flag did without reading the description. Could we use some variation of wait
which might be more natural? If we want to keep the default of no wait, then perhaps noWait
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
// ID of the resource whose lock needs to be renewed. | ||
// This is an arbitrary value supplied by the user. | ||
ResourceID string `json:"resourceId"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm mistaken, isn't LockID
all we need here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the component we definitely need the resource ID. The component itself won't maintain the state - the runtime will
// ID of the resource whose lock needs to be renewed. | ||
// This is an arbitrary value supplied by the user. | ||
ResourceID string `json:"resourceId"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are a bit inconsistent in the Dapr project about Id
vs ID
(even in this doc), and even between variable names and their json field encodings. I think we should try and be consistent moving forward.. I am firmly in the ID
camp 🙂
FWIW, Kubernetes also goes with ID
.
// ID of the resource that is locked. | ||
ResourceID string `json:"resourceId"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again re: needed because we have the LockID
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, the component does need the resource ID too. The state will be kept in the runtime, not component.
- Dapr renews locks automatically in background. Apps do not need to invoke a method to renew a lock explicitly. The TTL sent in requests is only used to control how frequently locks are to be renewed (lower TTLs cause locks to be renewed more frequently, but allow for quicker detection of failed hosts). | ||
- If the app becomes unhealthy (when app healthchecks are enabled) or if the sidecar crashes, locks are not renewed and are left to expire when the TTL ends. | ||
|
||
> About force unlocking: this is implemented primarily to recover from situations where the lock was owned by an app that is known to have crashed. This is a potentially dangerous activity, as the previous owner of the lock is not informed (at least, not immediately) that the lock has been taken away, at least not until it tries to renew or relinquish the lock. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm quite torn as to whether we should include this. I can see the use case, however the ramifications of two apps thinking they both have a lock could potentially be huge and is giving developers a big foot gun. Perhaps we hold off on exposing this API and reevaluating if it comes up as a feature request? Not being implemented by all components is also a kicker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been on the fence myself the entire time about this.
Proposals often indicate the "desired state", and they don't have to be implemented in full at the beginning. We can always create the API in the component and then expose it later.
|
||
message LockServerStream { | ||
oneof lock_server_stream { | ||
LockPing ping = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need ping? Surely the presence of the stream means that the lock is still active? Or does this mean if the network has failed downstream, worst case scenario the lock is improperly locked for is max ping interval time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for a ping is because it allows the server to send a message over the wire. Without that, if the connection drops, it's not immediately clear to the server that the connection is dropped.
Signed-off-by: Alessandro (Ale) Segala <[email protected]>
The distributed lock building block (dapr/dapr#3549) is currently in alpha and only partially-implemented. This is a proposal for completing the design and guiding the implementation to eventually stabilize the building block.