Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added check with exceptionOverride in case of connection dead #2045

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

janario
Copy link

@janario janario commented Mar 2, 2023

I'm open to suggestions and exploring more on the scenario. :-)

Scenario:

We are trying to take advantage of the aws-mysql-jdbc failover plugin.

According to their documentation, they can detect the failover make the switch writer<>reader but an exception will still be thrown to be handled by the pool provider, in our case HikariCP. https://github.com/awslabs/aws-mysql-jdbc#connection-pooling

For the normal scenario, it is ok:

  • in the middle of a transaction failover happens
  • exception is thrown e.g. Connection.createStatement
  • hikaricp will use the SQLExceptionOverride and decide to discard or not based on its results
    • with our custom the failover codes 08S02 and 08007 connection is kept and the next interaction everything will be fine

The problem is when acquiring a connection from the pool, the driver has to throw the exception since it is not aware it is in the middle of a transaction of not. But HikariCP doesn't use the SQLExceptionOverride at this point, making a good connection to be discarded.

On top of that, in an idle application, the entire pool will be discarded due to the failover, which can occur even hours after the failover has happened.

With the current PR we would use the SQLExceptionOverride to decide if the connection is really dead or not by doing one more retry if the error code is a DO_NOT_EVICT.

@janario
Copy link
Author

janario commented Mar 2, 2023

This is the current stack on version 4.x where the connection failover is detected, switched and hikari ends up discarding the connections.

https://github.com/awslabs/aws-mysql-jdbc/blob/main/src/main/user-impl/java/com/mysql/cj/jdbc/ha/plugins/failover/FailoverConnectionPlugin.java#L518

java.sql.SQLException: The active SQL connection has changed due to a connection failure. Please re-configure session state if required.
	at software.aws.rds.jdbc.mysql.shading.com.mysql.cj.jdbc.ha.plugins.failover.FailoverConnectionPlugin.failover(FailoverConnectionPlugin.java:507)
	at software.aws.rds.jdbc.mysql.shading.com.mysql.cj.jdbc.ha.plugins.failover.FailoverConnectionPlugin.pickNewConnection(FailoverConnectionPlugin.java:622)
	at software.aws.rds.jdbc.mysql.shading.com.mysql.cj.jdbc.ha.plugins.failover.FailoverConnectionPlugin.dealWithOriginalException(FailoverConnectionPlugin.java:866)
	at software.aws.rds.jdbc.mysql.shading.com.mysql.cj.jdbc.ha.plugins.failover.FailoverConnectionPlugin.execute(FailoverConnectionPlugin.java:253)
	at software.aws.rds.jdbc.mysql.shading.com.mysql.cj.jdbc.ha.plugins.ConnectionPluginManager.execute(ConnectionPluginManager.java:138)
	at software.aws.rds.jdbc.mysql.shading.com.mysql.cj.jdbc.ha.ConnectionProxy.invoke(ConnectionProxy.java:193)
	at jdk.proxy2/jdk.proxy2.$Proxy96.setNetworkTimeout(Unknown Source)
	at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:566)
	at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:173)
	
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
	at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
	at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
	at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:122)

poolName, connection, e.getMessage());
}

boolean isConnectionDead(final Connection connection)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The major change is here.

In case of exception let the exceptionOverride decide if the connection is good or not, and give it another try since it can be an exception of recovery (e.g. failover switch)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants