-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retryable transactions + async exception handling #1482
base: master
Are you sure you want to change the base?
Retryable transactions + async exception handling #1482
Conversation
-- @since 2.14.6.0 | ||
runSqlPoolWithExtensibleHooksRetry | ||
:: forall backend m a. (MonadUnliftIO m, BackendCompatible SqlBackend backend) | ||
=> (UE.SomeException -> Bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth pointing out that the API has been intentionally kept simple here: so long as synchronous exceptions match the predicate, the transaction will always be retried. In the case of serialization failures at repeatable-read and serializable isolation specifically, the recommendation from the PostgreSQL docs and elsewhere is to retry these failing transactions unconditionally.
That guidance is what drove this simpler API, but users may have other failures (e.g. uniqueness violations) they would like to only retry up to a fixed number of times or retry based on some more sophisticated policy. The simple exception predicate approach in this PR would not work for those more complicated cases. Considering the more complicated cases are rare, it seemed prudent to keep the retry API as simple as possible for now. However, depending upon the exception predicate the user specifies, they may get themselves into a situation where persistent
indefinitely retries. For these cases, they could wrap their runSqlPoolWithExtensibleHooksRetry
call in a timeout
.
We've had success running these changes in production for a few months now. Is there anything I can help with in regards to moving the PR along? |
…level in test infra
…t case This test case will fail until async exception handling is added to runSqlPoolWithExtensibleHooks
…ooks Note that the test added in the previous commit now passes.
…n handler's masking state
…ns test case This test case will fail until retryable transaction support is added in a new runSqlPoolWithExtensibleHooks variant.
…resql: Add isSerializationFailure and isDeadlockDetected Note that the test added in the previous commit now passes.
…syncExceptionsTest to here
91b68a4
to
0c165fd
Compare
The main change in this PR is adding support for retryable transactions. For example, when using the various
runSqlPool*
functions at a repeatable-read or serializable isolation level, application authors need a means to retry transactions that encounter serialization failures. Without official support for this inpersistent
,runSqlPool*
users could technically still manually catch a serialization failure exception around the wholerunSqlPool*
and retry the transaction, but this is a nonstarter as the connection would need to be returned then reacquired from the pool. To support retryable transactions, arunSqlPoolWithExtensibleHooksRetry
function was added that takes in an exception predicate to determine retrying the transaction. The existingrunSqlPoolWithExtensibleHooks
is now implemented in terms of the new function.Another change in this PR is around
runSqlPoolWithExtensibleHooks
's async exception handling. The previous version of this function would not run therunOnException
hook when the user-specified database action was aborted via async exception, as async exceptions were ignored entirely due to use ofunliftio
'scatchAny
. This is not as problematic as it sounds on the surface: if an async exception came in, the enclosingwithResource
on the pool would catch it and terminate the connection. For PostgreSQL, when the connection is terminated, the database discards whatever changes were made in the transaction even though there was no explicit rollback. With the change in this PR, if users have custom logic defined in theirrunOnException
hook on top of just rolling back the transaction, they should now be able to rely onpersistent
to execute this hook when the user-specified database action encounters any type of exception.There were also multiple spots where the masking state was being restored (basically everything was in a
restore
except for the installation of thecatchAny
handler). The masking has been changed in this PR such thatalterBackend
is still in arestore
, as is the user-specified action, butrunBefore
andrunAfter
are no longer in arestore
. The previous version'srestore
usage came in from #1207, so I verified that theconn-killed
binary still producesRight
with the new masking. Additionally, it's worth noting thatrunOnException
is now implicitly in anuninterruptibleMask
(viaunliftio
'swithException
).The PR might be easiest to review commit-by-commit, as intentionally failing tests were added prior to changes being made to the libraries.
Before submitting your PR, check that you've:
@since
declarations to the Haddockstylish-haskell
on any changed files..editorconfig
file for details)After submitting your PR:
(unreleased)
on the Changelog