Eventuate CDC took a long time to connect to binlog #140

feanor777 · 2023-02-22T08:57:36Z

We have the following situation:
One Eventuate CDC instance is deployed and healthy. A second instance is then deployed to the same environment, and once it is ready the old one is stopped (a blue-green deployment).
However, it takes almost 1 minute for the second instance to become healthy. This is what the health check returns during that time:

17:34:06
{"status":"UP","details":{"binlogEntryReaderHealthCheck":{"status":"UP","details":{"detail-1":"MySqlReader40 is not the leader"}},"cdcDataPublisherHealthCheck":{"status":"UP"},"zookeeperHealthCheck":{"status":"UP"},"kafkaHealthCheck":{"status":"UP"},"diskSpace":{"status":"UP","details":{"total":772767236096,"free":581354954752,"threshold":10485760}}}}
17:34:08
{"status":"DOWN","details":{"binlogEntryReaderHealthCheck":{"status":"DOWN","details":{"error-1":"Reader with id MySqlReader40 has not received message for 1676968448708 milliseconds","error-2":"Reader with id MySqlReader40 disconnected"}},"cdcDataPublisherHealthCheck":{"status":"UP"},"zookeeperHealthCheck":{"status":"UP"},"kafkaHealthCheck":{"status":"UP"},"diskSpace":{"status":"UP","details":{"total":772767236096,"free":581354532864,"threshold":10485760}}}}
... (the same binlog DOWN status repeats on every check in between)
17:34:57
{"status":"DOWN","details":{"binlogEntryReaderHealthCheck":{"status":"DOWN","details":{"error-1":"Reader with id MySqlReader40 has not received message for 1676968498207 milliseconds","error-2":"Reader with id MySqlReader40 disconnected"}},"cdcDataPublisherHealthCheck":{"status":"UP"},"zookeeperHealthCheck":{"status":"UP"},"kafkaHealthCheck":{"status":"UP"},"diskSpace":{"status":"UP","details":{"total":772767236096,"free":581288603648,"threshold":10485760}}}}
17:34:59
{"status":"UP","details":{"binlogEntryReaderHealthCheck":{"status":"UP","details":{"detail-1":"Reader with id MySqlReader40 received message 483 milliseconds ago","detail-2":"Reader with id MySqlReader40 is connected"}},"cdcDataPublisherHealthCheck":{"status":"UP"},"zookeeperHealthCheck":{"status":"UP"},"kafkaHealthCheck":{"status":"UP"},"diskSpace":{"status":"UP","details":{"total":772767236096,"free":581288198144,"threshold":10485760}}}}
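To put a number on the gap, here is a minimal sketch that uses only the four timestamps and overall statuses shown above. The helper name unhealthy_window_seconds is hypothetical, and the result is only as precise as the polling interval of the health check.

```python
from datetime import datetime

# (time, overall status) pairs copied from the health check output above.
samples = [
    ("17:34:06", "UP"),
    ("17:34:08", "DOWN"),
    ("17:34:57", "DOWN"),
    ("17:34:59", "UP"),
]


def unhealthy_window_seconds(samples):
    """Seconds from the first DOWN sample to the first UP sample after it.

    Returns None if no DOWN sample is followed by an UP sample.
    """
    first_down = None
    for stamp, status in samples:
        t = datetime.strptime(stamp, "%H:%M:%S")
        if status == "DOWN" and first_down is None:
            first_down = t
        elif status == "UP" and first_down is not None:
            return (t - first_down).total_seconds()
    return None


print(unhealthy_window_seconds(samples))
```

For the samples above this gives 51 seconds between the first DOWN response and the first UP response that follows it.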

The question is why, after reporting "detail-1":"MySqlReader40 is not the leader", the instance becomes unhealthy and takes so long (~1 minute) to become healthy again.
Is it possible to speed it up somehow?
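One detail in the DOWN responses may be relevant: the "has not received message for ... milliseconds" values are the size of a Unix epoch timestamp in milliseconds, not of a real idle period. The sketch below simply decodes the two reported values as epoch times. The interpretation is an assumption, not something confirmed in this thread: if the health check computes "now minus time of last message" and the last-message time is still 0 because no message has arrived yet, the reported value would equal the current time.

```python
from datetime import datetime, timezone

# Values copied from the "has not received message for ... milliseconds" errors.
reported_ms = [1676968448708, 1676968498207]


def as_epoch_utc(ms):
    """Interpret a millisecond count as a Unix epoch timestamp in UTC.

    Assumption: the value is "now - 0", i.e. no message was received yet.
    """
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc)


for ms in reported_ms:
    print(ms, "->", as_epoch_utc(ms).strftime("%Y-%m-%d %H:%M:%S"), "UTC")
```

Decoded this way, the values fall on 2023-02-21 at 08:34:08 and 08:34:58 UTC. The minutes and seconds line up with the 17:34:08 and 17:34:57 health check times above, which would fit a local clock 9 hours ahead of UTC. That suggests the error text reflects a reader that has not received any message yet rather than a measured delay.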

P.S. I think this issue could be related to #86.

cer · 2023-02-23T06:07:51Z

Let me investigate. I can't remember the timer configuration for the monitoring code.

feanor777 · 2023-02-23T23:26:54Z

@cer Thank you so much for your help! Please let me know if you need any additional information.

feanor777 · 2023-02-27T03:22:10Z

One more point.
I'm not sure this is related to the monitoring timer.
As far as I can see from the logs, the CDC is simply stuck and does not read data from the binlog for a certain period of time.
Please take a look at the attached image of the logs.
It basically freezes after the "trying to connect to mysql binlog" line.
