OoM exception when updating saga_instance database rows #156

Open
rrrship opened this issue Jan 23, 2024 · 1 comment

Comments

@rrrship
Contributor

rrrship commented Jan 23, 2024

I should preface this by saying that our pod was running with a fairly low memory limit of 128 MB, but this hadn't caused any issues before.

What happened was that in our main database, which the CDC service has access to, we ran an update on more than 10,000 rows, and this caused the CDC service to shut down repeatedly until we raised the memory limit to a sufficient level. The problem, I think, is that we wouldn't expect an update to a non-CDC-related table to affect our CDC service at all. Maybe the WAL consumption logic could be improved (we're using PostgreSQL).

Why update the saga_instance table at all? We wanted to add some custom data there for ad-hoc purposes. But this led us to wonder: could the same issue happen with an update to any table? We would expect such updates to be ignored entirely.

@cer
Contributor

cer commented Feb 20, 2024

Oh dear. It looks like I only replied in my head. Sorry about the delay.

By default, the CDC has to 'process' all updates to the database including those for tables other than the MESSAGE table.
However, if the CDC is the only WAL consumer, you could use the add-tables property to ignore the other tables.

I suspect that the OOM problem is due to how the wal2json plugin is configured.

It's likely to be using format version 1, which produces one JSON object per transaction.
As a result, a transaction that updates a large number of rows generates a large JSON object, which probably causes the OOM error.

The solution would be to use format version 2 (format-version property), which produces a (smaller) JSON object per update. Can you try this and let me know?
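
To illustrate why the format version matters for memory, here is a rough Python sketch. It is not taken from the CDC codebase, and it does not talk to PostgreSQL: it only builds JSON payloads that approximate the shape of wal2json output for one transaction updating 10,000 rows, then compares the largest single message a consumer would have to hold. The column names (`id`, `payload`) and the helper functions are hypothetical, and the field layout is a simplified approximation of the two formats.

```python
import json

ROWS = 10_000  # number of rows touched by the single transaction


def v1_change(i):
    # Approximate shape of one entry in the format version 1 "change" array.
    # Column names and types are hypothetical.
    return {
        "kind": "update",
        "schema": "public",
        "table": "saga_instance",
        "columnnames": ["id", "payload"],
        "columntypes": ["integer", "text"],
        "columnvalues": [i, "ad-hoc data"],
        "oldkeys": {"keynames": ["id"], "keytypes": ["integer"], "keyvalues": [i]},
    }


def v2_change(i):
    # Approximate shape of one format version 2 message (one per changed row).
    return {
        "action": "U",
        "schema": "public",
        "table": "saga_instance",
        "columns": [
            {"name": "id", "type": "integer", "value": i},
            {"name": "payload", "type": "text", "value": "ad-hoc data"},
        ],
        "identity": [{"name": "id", "type": "integer", "value": i}],
    }


def v1_messages(rows):
    # Version 1: the whole transaction arrives as a single JSON object.
    return [json.dumps({"change": [v1_change(i) for i in range(rows)]})]


def v2_messages(rows):
    # Version 2: begin marker, one object per change, commit marker.
    begin = [json.dumps({"action": "B"})]
    changes = [json.dumps(v2_change(i)) for i in range(rows)]
    commit = [json.dumps({"action": "C"})]
    return begin + changes + commit


v1 = v1_messages(ROWS)
v2 = v2_messages(ROWS)
largest_v1 = max(len(m) for m in v1)
largest_v2 = max(len(m) for m in v2)

print(f"v1: {len(v1)} message(s), largest is {largest_v1} characters")
print(f"v2: {len(v2)} message(s), largest is {largest_v2} characters")
```

In this sketch the largest version 1 message grows with the number of rows in the transaction, while the largest version 2 message stays roughly the size of a single row change. Whether that fully explains the OOM in a 128 MB pod is still an assumption until it is tested against the real service.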
