First collection of Restate patterns #49

Merged
StephanEwen merged 10 commits into restatedev:main on Dec 5, 2023

Conversation

StephanEwen
Contributor

No description provided.

Contributor
@tillrohrmann left a comment

Thanks a lot for creating the patterns @StephanEwen. They are really helpful :-) I've got a question concerning the locking pattern: is there an unkeyed service missing to make it work correctly?

Comment on lines 48 to 52
acquireBlocking: async (ctx: restate.RpcContext, lockId: string): Promise<string> => {
  const awakeable = ctx.awakeable<string>();
  ctx.send(lockServiceApi).acquireAsync(lockId, awakeable.id);
  return awakeable.promise;
},
Contributor

To my understanding, this call should not work since an Awakeable is bound to a journal and we are not releasing the lock until the awakeable is fulfilled (assuming that returning a promise in the TS SDK awaits its completion). I think the problem is that acquireAsync is defined on the same service.

Contributor Author

Oh, yes. I actually had it like that before and then thought the two services could be merged after all, but you are right, I need to change it back.
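
For reference, here is a minimal editorial sketch of the two-service split being described here, with acquireBlocking moved to a separate unkeyed service. Only the acquireBlocking body is taken from the snippet above; the router/ServiceApi wiring, the service path, and the resolveAwakeable hint are assumptions about the 2023-era TypeScript SDK, not code from this PR.

import * as restate from "@restatedev/restate-sdk";

// Keyed lock service: one key per lockId, so requests for the same lock are serialized.
const lockService = restate.keyedRouter({
  acquireAsync: async (ctx: restate.RpcContext, lockId: string, awakeableId: string) => {
    // Hand out the lock or queue the awakeable id in state; once the lock is free,
    // resolve the awakeable (e.g. via ctx.resolveAwakeable(awakeableId, lockId)).
  },
});

// Hypothetical API handle for the keyed service, as used by ctx.send(...) above.
const lockServiceApi: restate.ServiceApi<typeof lockService> = { path: "lockservice" };

// Unkeyed client service: it creates the awakeable, forwards the request, and then
// suspends on the awakeable. Because it runs in its own journal, separate from the
// keyed lock service, this suspension does not block acquireAsync from executing.
const lockClient = restate.router({
  acquireBlocking: async (ctx: restate.RpcContext, lockId: string): Promise<string> => {
    const awakeable = ctx.awakeable<string>();
    ctx.send(lockServiceApi).acquireAsync(lockId, awakeable.id);
    return awakeable.promise;
  },
});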

typescript/patterns/src/cross_db_transactions.ts (outdated thread, resolved)
"type": "commonjs",
"scripts": {
"build": "tsc --noEmitOnError"
},
Contributor

The patterns don't seem to be integrated into CI, which runs verify for all TS examples. To ensure that the patterns don't break, it would be great to integrate them: https://github.com/restatedev/examples/blob/main/.github/workflows/test.yml

"devDependencies": {
"typescript": "^5.0.2"
}
}
Contributor

Should we also add a README.md listing the different patterns?

Contributor Author

Later, once this has converged.

typescript/patterns/src/xa_transactions.ts (outdated thread, resolved)

prepareTxn(txnId: string): Promise<void> // prepared txn for commit under given ID
commitPreparedTxn(txnId: string): Promise<void> // commits a txn prepared under given ID
rollbackPreparedTxn(txnId: string): Promise<void> // rolls bacl a txn prepared under given ID
Contributor

Suggested change
rollbackPreparedTxn(txnId: string): Promise<void> // rolls bacl a txn prepared under given ID
rollbackPreparedTxn(txnId: string): Promise<void> // rolls back a txn prepared under given ID

typescript/patterns/src/xa_transactions.ts (outdated thread, resolved)

if (commitDecision.commit) {
  // now we need to ensure this one gets committed
  await ctx.sideEffect(() => database.commitPreparedTxn(txnId));
Contributor

Can the commitPreparedTxn call ever fail permanently (so no amount of retries would let it succeed)?

Contributor Author

It probably can, yes. This whole code is very PoC-ish: the basic pattern is in place, but detailed handling of the various possible errors is critical here.

} else {
  // we clean up this query. it might be that this query was never prepared, but
  // that does not matter here
  await ctx.sideEffect(() => database.rollbackPreparedTxn(txnId));
Contributor

What happens in the following scenario?

Assume a first call manages to persist its txnId and then pauses before running the weRunTheQuery if block. Now Restate retries the invocation. The retry attempt reaches this point and rolls back a prepared txn that hasn't been created yet. Then the original invocation continues running and creates a prepared txn, which acquires all the locks. If the second invocation attempt then enters the weRunTheQuery block and runs runSql, what would happen? Would the runSql execution fail because it cannot acquire the locks, or would it wait on acquiring them?

Contributor Author

You are right, this still seems possible.

Contributor
@pcholakov left a comment

The XA example is definitely a bit difficult to follow! Maybe that's inherent though.

Comment on lines 67 to 70
const txnId = await ctx.sideEffect(() => {
  weRunTheQuery = true;
  return Promise.resolve(randomUUID());
});
Contributor

I'm wondering if the state of the transaction isn't better off persisted as a state key instead? I worry that we're training people to "leak" state past the Restate API with this approach.

I'd like to offer simple guidance to customers on how to write side effect bodies: the functions must communicate with their containing scope only via read-only parameters and their return values.

In the future we might want to expose a version of sideEffect that returns some extra metadata. I don't know whether it's even a great idea to expose whether the side effect committed for the first time or whether this is a replay execution, but that could avoid having to use a "side channel" variable. I imagine the extra metadata would be very useful for other purposes though, e.g. we could return the journal entry's sequence number (SN), as a source of guaranteed monotonic SNs is super useful in almost any transactional system!

In the meantime, maybe something like this would be clearer?

let transactionId = await ctx.get<string>("transactionId");
if (!transactionId) {
  transactionId = ctx.rand.uuidv4();
  ctx.set("transactionId", transactionId);
}
...

Not 100% sure this is semantically equivalent – working through an updated version myself – but I find it much easier to follow. 😊
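
A minimal editorial illustration of the guidance above about side effect bodies, assuming a keyed service (so ctx.set is available); the payment client and handler names are hypothetical and only exist to show the shape: the side effect body reads nothing but its inputs and communicates back solely through its journaled return value.

import * as restate from "@restatedev/restate-sdk";

// Hypothetical external client, declared only for illustration.
declare const paymentClient: {
  charge(cardId: string, amountCents: number): Promise<{ id: string }>;
};

const chargeCard = async (
  ctx: restate.RpcContext,
  key: string,
  req: { cardId: string; amountCents: number }
) => {
  // The closure does not write to any variable in the enclosing scope;
  // Restate journals only its return value.
  const receipt = await ctx.sideEffect(() => paymentClient.charge(req.cardId, req.amountCents));
  // Anything the handler needs later is derived from the journaled result or kept in Restate state.
  ctx.set("lastReceiptId", receipt.id);
  return receipt;
};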

Comment on lines 66 to 70
let weRunTheQuery = false;
const txnId = await ctx.sideEffect(() => {
  weRunTheQuery = true;
  return Promise.resolve(randomUUID());
});
Contributor

Does this pattern also work with a Lambda deployment? From skimming over the TypeScript SDK, it seems that before returning the side effect result we await the storage ack from the runtime. If this is the case, then reaching beyond this point on Lambda would require a replay, and therefore weRunTheQuery = true would never be set, I believe.

Contributor Author

Yes, Jack and I found that issue as well. I'm currently thinking about whether there is a way to express this that is compatible with "always suspend" executions.
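
To spell out the failure mode discussed above, here is the same snippet with editorial comments; no new API, just the sequence described in the comment, assuming the invocation suspends while awaiting the storage ack (the wrapper function name is made up).

import { randomUUID } from "crypto";
import * as restate from "@restatedev/restate-sdk";

async function startTxn(ctx: restate.RpcContext) {
  let weRunTheQuery = false;

  const txnId = await ctx.sideEffect(() => {
    weRunTheQuery = true;                  // lives only in the memory of this physical execution
    return Promise.resolve(randomUUID());  // only this return value gets journaled
  });

  // On Lambda, awaiting the storage ack ends the current execution: the invocation
  // suspends and the journal is replayed in a fresh process. During replay the closure
  // above is not re-run (the journaled txnId is returned instead), so weRunTheQuery is
  // still false here even though the side effect ran exactly once.
  if (weRunTheQuery) {
    // only the original, now-terminated execution would ever enter this branch
  }
  return txnId;
}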

}


export async function runXaDatabaseTransaction<QueryT, ResultT>(
Contributor

All the side effects here are making this hard to follow and think through. I don't think we need any of them – just a single journal operation to persist the id - to get at-most-once execution. This is the simplest code I could come up with.

export async function runXaDatabaseTransaction<QueryT, ResultT>(
    ctx: restate.RpcContext,
    database: Database<QueryT, ResultT>,
    query: QueryT): Promise<ResultT> {

    let txnId = await ctx.get<string>("transactionId");
    if (txnId === null) {
        // First attempt - generate a new ID and save it. (We do it this way to avoid writing a "leaky side effect".)
        txnId = randomUUID();
        ctx.set("transactionId", txnId);
    } else {
        // A previous run failed. Roll back the prepared transaction (if any).
        await database.rollbackPreparedTxn(txnId);
        ctx.clear("transactionId");
        return { status: "ABORTED" } as ResultT;
    }

    try {
        await database.beginTxn();
        const result = await database.runSql(query);
        await database.prepareTxn(txnId);

        await database.commitPreparedTxn(txnId);
        ctx.clear("transactionId");

        return result;
    } catch (e) {
        await database.abortTxn();
        // Note! We deliberately don't do this in a "finally" block - if aborting fails, we want to remember the id!
        ctx.clear("transactionId");
        return { status: "ABORTED" } as ResultT;
    }
}

Do we need anything more complicated than this?

Contributor Author

I think this example breaks in multiple places:

  • Through durable execution, a retry replays the journaled result of ctx.get, so we always follow the path where the txn ID is initially unset.
  • A failure between await database.prepareTxn(txnId); and await database.commitPreparedTxn(txnId); means the prepared transaction stays. Because it won't ever be cleaned up (point 1), it will block the database forever.
  • Imagine you have a process that makes it to const result = await database.runSql(query); and then stalls. A second (failover) execution makes it all the way through, then the stalled process continues. You now have committed everything twice.

Those are exactly the points I was trying to address in my implementation. Maybe there is a slightly simpler way than I did it, but I think this is inherently complex, because

  • (a) Postgres lacks some primitives that would make this easier, like fencing off previous transaction IDs (see the sketch below for the primitives it does offer)
  • (b) we are implementing the txn coordinator ourselves, with all of its recovery logic
  • (c) for that txn coordinator, durable execution is surprisingly not a perfect match, because we want to inspect the log while replaying, not simply skip over already-completed actions.
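
For concreteness, here is an editorial sketch (not code from the PR) of how the Database methods discussed in this thread could map onto Postgres prepared transactions, using the node-postgres client. It assumes txnId is a trusted, pre-validated string (these identifiers cannot be bound as query parameters) and that the server has max_prepared_transactions > 0.

import { Client, QueryResult } from "pg";

// One concrete instance of the Database shape, with plain SQL strings as queries.
class PostgresXaDatabase {
  constructor(private readonly client: Client) {}

  async beginTxn(): Promise<void> {
    await this.client.query("BEGIN");
  }

  async runSql(query: string): Promise<QueryResult> {
    return this.client.query(query);
  }

  async prepareTxn(txnId: string): Promise<void> {
    // After PREPARE TRANSACTION, the transaction survives connection loss and even
    // server restarts, holding its locks until it is explicitly committed or rolled back.
    await this.client.query(`PREPARE TRANSACTION '${txnId}'`);
  }

  async commitPreparedTxn(txnId: string): Promise<void> {
    await this.client.query(`COMMIT PREPARED '${txnId}'`);
  }

  async rollbackPreparedTxn(txnId: string): Promise<void> {
    await this.client.query(`ROLLBACK PREPARED '${txnId}'`);
  }

  async abortTxn(): Promise<void> {
    // Rolls back the current (not yet prepared) transaction on this connection.
    await this.client.query("ROLLBACK");
  }
}

The durable locks held by a prepared-but-never-finished transaction are exactly what makes the recovery paths discussed above so delicate.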

Contributor

Doh, of course ctx.get() won't work on replay – and in Lambda in particular where everything is replayed even on the first run.

A failure between await database.prepareTxn(txnId); and await database.commitPreparedTxn(txnId); means the prepared transaction stays. Because it won't ever be cleaned up (point 1), it will block the database forever.

Won't such a failure trigger automatic rollback, because presumably the session/connection gets terminated?

Imagine you have a process that makes it to const result = await database.runSql(query); and then stalls. A second (failover) execution makes it all the way through, then the stalled process continues. You now have committed everything twice.

This was absolutely not part of my mental failure model, thank you for pointing that out!

Thanks for indulging me – this was very educational!

Contributor Author

I think the hardest problems (and the main class of errors Jepsen tends to find) are related to

  • partial network partitions
  • long stalls in a process that make another process think it has timed out and disappeared (but where the stalled process then comes back). These are rare corner cases, but I have seen them actually happen during container migrations.

@StephanEwen merged commit ca40202 into restatedev:main on Dec 5, 2023
2 checks passed
@StephanEwen
Contributor Author

Thanks for all the feedback. I updated this by:

  • addressing inline comments
  • fixing the distributed lock pattern
  • removing XA transactions and cross-DB actions for now
  • adding Pavel's DynamoDB pattern
