One container starts to have "Redis command timed out" and it does not recover.
Current Behavior
We have started to use AWS Redis Serverless. Since the switch from Redis with cluster mode enabled to Redis Serverless, one client out of the many running tasks will, out of the blue, start to get command timeouts continuously. Depending on the service, the size of the objects, and whether put or get commands are timing out, we see different side effects: sometimes it results in #705, sometimes only in increased latency.
Note that the affected services usually have more than 12 tasks running, and only one container has this issue.
I could not correlate any CPU spike with the events causing the timeouts.
org.springframework.dao.QueryTimeoutException: Redis command timed out; nested exception is io.lettuce.core.RedisCommandTimeoutException: Command timed out after 5 second(s)
at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:70)
at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41)
at org.springframework.data.redis.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:44)
at org.springframework.data.redis.FallbackExceptionTranslationStrategy.translate(FallbackExceptionTranslationStrategy.java:42)
at org.springframework.data.redis.connection.lettuce.LettuceConnection.convertLettuceAccessException(LettuceConnection.java:277)
at org.springframework.data.redis.connection.lettuce.LettuceConnection.await(LettuceConnection.java:1085)
at org.springframework.data.redis.connection.lettuce.LettuceConnection.lambda$doInvoke$4(LettuceConnection.java:938)
at org.springframework.data.redis.connection.lettuce.LettuceInvoker$Synchronizer.invoke(LettuceInvoker.java:665)
at org.springframework.data.redis.connection.lettuce.LettuceInvoker.just(LettuceInvoker.java:125)
at org.springframework.data.redis.connection.lettuce.LettuceHashCommands.hSet(LettuceHashCommands.java:61)
at org.springframework.data.redis.connection.DefaultedRedisConnection.hSet(DefaultedRedisConnection.java:1332)
at org.springframework.data.redis.connection.DefaultStringRedisConnection.hSet(DefaultStringRedisConnection.java:631)
at org.springframework.data.redis.core.DefaultHashOperations.lambda$put$14(DefaultHashOperations.java:254)
at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:224)
at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:191)
at org.springframework.data.redis.core.AbstractOperations.execute(AbstractOperations.java:97)
at org.springframework.data.redis.core.DefaultHashOperations.put(DefaultHashOperations.java:253)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.lettuce.core.RedisCommandTimeoutException: Command timed out after 5 second(s)
at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:59)
at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:246)
at io.lettuce.core.LettuceFutures.awaitOrCancel(LettuceFutures.java:74)
at org.springframework.data.redis.connection.lettuce.LettuceConnection.await(LettuceConnection.java:1083)
... 22 common frames omitted
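For context, the trace ends in Futures.awaitOrCancel, which suggests the exception is raised on the client side when the wait for a reply expires, rather than being an error sent by the server. Below is a minimal, stdlib-only sketch of that wait-then-cancel pattern. It is not Lettuce's actual code: the class name, the helper, and CommandTimeoutException are hypothetical stand-ins.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class AwaitOrCancelSketch {

    /** Hypothetical stand-in for a client-side command timeout exception. */
    static class CommandTimeoutException extends RuntimeException {
        CommandTimeoutException(String message) {
            super(message);
        }
    }

    /** Waits for the reply; if none arrives in time, cancels the command and throws. */
    static <T> T awaitOrCancel(Future<T> command, long timeout, TimeUnit unit) {
        try {
            return command.get(timeout, unit);
        } catch (TimeoutException e) {
            // No reply within the limit: give up on this command only.
            command.cancel(true);
            throw new CommandTimeoutException(
                    "Command timed out after " + unit.toMillis(timeout) + " ms");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Command failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that is never completed plays the role of a reply that never arrives.
        CompletableFuture<String> neverAnswered = new CompletableFuture<>();
        try {
            awaitOrCancel(neverAnswered, 100, TimeUnit.MILLISECONDS);
        } catch (CommandTimeoutException e) {
            System.out.println(e.getMessage() + ", cancelled=" + neverAnswered.isCancelled());
        }
    }
}
```

In this sketch the timeout only cancels the pending command. Nothing in it touches the underlying connection, so if the real client behaves similarly, a stuck connection could keep producing the same timeout for every later command.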
Expected behavior/code
The expected behaviour is that, with periodic refresh enabled, the client recovers and the issue does not persist.
Environment
AWS Redis Serverless 7.1
<lettuce.version>6.3.2.RELEASE</lettuce.version>
<netty.version>4.1.109.Final</netty.version>
related configurations:
  ssl: true
  timeout: 5s
  connect-timeout: 500ms
  additional:
    dns-ttl: 5s
    periodic-refresh: 10s
    reconnect-delay-min: 100ms
    reconnect-delay-max: 5s
    read-from: replicaPreferred
Possible Solution
RedisCommandTimeoutException intermittently:
In some cases this could be related, since we sometimes see timeouts for about a minute that then recover. In this case, however, they persist.
Lettuce cannot recover from connection problems:
Setting TCP_USER_TIMEOUT might help. The default is 1 minute, so it may help in the cases that recover, but probably not in the cases where the timeouts continue and pile up.
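For anyone who wants to try the TCP_USER_TIMEOUT idea, here is a rough sketch of how it could be set through Lettuce's SocketOptions, next to the timeout and periodic refresh values from the environment above. It assumes the client is built as a RedisClusterClient and that Lettuce is on the classpath. The 30-second value and the class and method names are my own assumptions, not something tested against this issue. As far as I know, TCP_USER_TIMEOUT only takes effect with the native epoll transport on Linux.

```java
import io.lettuce.core.SocketOptions;
import io.lettuce.core.TimeoutOptions;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;

import java.time.Duration;

class TcpUserTimeoutSketch {

    /** Builds client options with TCP_USER_TIMEOUT enabled (values are illustrative). */
    static ClusterClientOptions clientOptions() {
        SocketOptions socketOptions = SocketOptions.builder()
                .connectTimeout(Duration.ofMillis(500)) // connect-timeout: 500ms
                .tcpUserTimeout(SocketOptions.TcpUserTimeoutOptions.builder()
                        .enable()
                        // Assumption: 30s picked only for illustration.
                        .tcpUserTimeout(Duration.ofSeconds(30))
                        .build())
                .build();

        ClusterTopologyRefreshOptions refreshOptions = ClusterTopologyRefreshOptions.builder()
                .enablePeriodicRefresh(Duration.ofSeconds(10)) // periodic-refresh: 10s
                .build();

        return ClusterClientOptions.builder()
                .socketOptions(socketOptions)
                .timeoutOptions(TimeoutOptions.enabled(Duration.ofSeconds(5))) // timeout: 5s
                .topologyRefreshOptions(refreshOptions)
                .build();
    }
}
```

The returned options would be passed to the cluster client with setOptions before connecting. Whether this changes anything for the container that never recovers is an open question.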