Eventuate CDC took a long time to connect to binlog #140

Open
feanor777 opened this issue Feb 22, 2023 · 3 comments

Comments

@feanor777

We have the following situation:
One Eventuate CDC instance is deployed and healthy. A second instance is then deployed to the same environment, and once it is ready the old one is stopped (i.e. a blue-green deployment).
However, it takes almost 1 minute for the second instance to become healthy. The health check output during that time looks like this:

17:34:06
{"status":"UP","details":{"binlogEntryReaderHealthCheck":{"status":"UP","details":{"detail-1":"MySqlReader40 is not the leader"}},"cdcDataPublisherHealthCheck":{"status":"UP"},"zookeeperHealthCheck":{"status":"UP"},"kafkaHealthCheck":{"status":"UP"},"diskSpace":{"status":"UP","details":{"total":772767236096,"free":581354954752,"threshold":10485760}}}}
17:34:08
{"status":"DOWN","details":{"binlogEntryReaderHealthCheck":{"status":"DOWN","details":{"error-1":"Reader with id MySqlReader40 has not received message for 1676968448708 milliseconds","error-2":"Reader with id MySqlReader40 disconnected"}},"cdcDataPublisherHealthCheck":{"status":"UP"},"zookeeperHealthCheck":{"status":"UP"},"kafkaHealthCheck":{"status":"UP"},"diskSpace":{"status":"UP","details":{"total":772767236096,"free":581354532864,"threshold":10485760}}}}
........ The same Binlog DOWN status as above
17:34:57
{"status":"DOWN","details":{"binlogEntryReaderHealthCheck":{"status":"DOWN","details":{"error-1":"Reader with id MySqlReader40 has not received message for 1676968498207 milliseconds","error-2":"Reader with id MySqlReader40 disconnected"}},"cdcDataPublisherHealthCheck":{"status":"UP"},"zookeeperHealthCheck":{"status":"UP"},"kafkaHealthCheck":{"status":"UP"},"diskSpace":{"status":"UP","details":{"total":772767236096,"free":581288603648,"threshold":10485760}}}}
17:34:59
{"status":"UP","details":{"binlogEntryReaderHealthCheck":{"status":"UP","details":{"detail-1":"Reader with id MySqlReader40 received message 483 milliseconds ago","detail-2":"Reader with id MySqlReader40 is connected"}},"cdcDataPublisherHealthCheck":{"status":"UP"},"zookeeperHealthCheck":{"status":"UP"},"kafkaHealthCheck":{"status":"UP"},"diskSpace":{"status":"UP","details":{"total":772767236096,"free":581288198144,"threshold":10485760}}}}

The question is: why does the instance become unhealthy right after the "detail-1":"MySqlReader40 is not the leader" status, and why does it take so long (~1 minute) to become healthy again?
Is it possible to speed this up somehow?

P.S. I think this issue could be related to #86.
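
To put numbers on the health-check samples above, here is a minimal Python sketch that measures how long the reader was reported DOWN and extracts the "has not received message for N milliseconds" value. The sample bodies are abbreviated copies of the JSON quoted above, and the helper names (`reader_check`, `seconds_down`, `idle_millis`, `as_utc`) are made up for this illustration; they are not part of Eventuate CDC.

```python
import json
import re
from datetime import datetime, timezone

# Abbreviated copies of the samples quoted above: (wall-clock time, JSON body).
SAMPLES = [
    ("17:34:06", '{"status":"UP","details":{"binlogEntryReaderHealthCheck":{"status":"UP",'
                 '"details":{"detail-1":"MySqlReader40 is not the leader"}}}}'),
    ("17:34:08", '{"status":"DOWN","details":{"binlogEntryReaderHealthCheck":{"status":"DOWN",'
                 '"details":{"error-1":"Reader with id MySqlReader40 has not received message '
                 'for 1676968448708 milliseconds",'
                 '"error-2":"Reader with id MySqlReader40 disconnected"}}}}'),
    ("17:34:57", '{"status":"DOWN","details":{"binlogEntryReaderHealthCheck":{"status":"DOWN",'
                 '"details":{"error-1":"Reader with id MySqlReader40 has not received message '
                 'for 1676968498207 milliseconds",'
                 '"error-2":"Reader with id MySqlReader40 disconnected"}}}}'),
    ("17:34:59", '{"status":"UP","details":{"binlogEntryReaderHealthCheck":{"status":"UP",'
                 '"details":{"detail-1":"Reader with id MySqlReader40 received message '
                 '483 milliseconds ago",'
                 '"detail-2":"Reader with id MySqlReader40 is connected"}}}}'),
]


def reader_check(body):
    """Return the binlogEntryReaderHealthCheck section of one health response."""
    return json.loads(body)["details"]["binlogEntryReaderHealthCheck"]


def seconds_down(samples):
    """Seconds between the first DOWN sample and the next UP sample, or None."""
    down_at = None
    for clock, body in samples:
        at = datetime.strptime(clock, "%H:%M:%S")
        status = reader_check(body)["status"]
        if status == "DOWN" and down_at is None:
            down_at = at
        elif status == "UP" and down_at is not None:
            return int((at - down_at).total_seconds())
    return None


def idle_millis(body):
    """Extract N from 'has not received message for N milliseconds', or None."""
    for text in reader_check(body).get("details", {}).values():
        match = re.search(r"has not received message for (\d+) milliseconds", text)
        if match:
            return int(match.group(1))
    return None


def as_utc(millis):
    """Interpret a millisecond count as a Unix epoch timestamp (an assumption)."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


print(seconds_down(SAMPLES))                         # 51
print(idle_millis(SAMPLES[1][1]))                    # 1676968448708
print(as_utc(1676968448708).strftime("%Y-%m-%d %H:%M:%S"))  # 2023-02-21 08:34:08
```

On these samples the reader is reported DOWN for about 51 seconds. The reported idle time is far too large to be a real duration (roughly 53 years); read as epoch milliseconds it decodes to 2023-02-21 08:34:08 UTC, and its minutes and seconds line up with the 17:34:08 sample. That may suggest the reader had not yet recorded any message when the check ran, so the value is effectively "now minus zero", but this is only an inference from the numbers and is not confirmed anywhere in this thread.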

@cer
Contributor

cer commented Feb 23, 2023

Let me investigate. I can't remember the timer configuration for the monitoring code.

@feanor777
Author

@cer Thank you so much for your help! Please let me know if you need any additional information.

@feanor777
Author

One more point.
I'm not sure this is related to the monitoring timer.
As far as I can see from the logs, the CDC is simply "stuck" and does not read data from the binlog for a certain period of time.
Please take a look at the attached screenshot of the logs.
It basically freezes after the "trying to connect to mysql binlog" line.
Screen Shot 2023-02-27 at 12 17 42
