split import transaction #32
Comments
Importing 500'000 records from MSSQL to Postgres using a single transaction used approximately 200MB RAM on my machine. If you extrapolate that to a few million records, the memory usage explodes.
What do we do about data consistency? At the moment, when the import fails, it rolls back the current transaction, leaving the import script in a somewhat usable state. If we have multiple transactions, this could lead to half-migrated tables. I think it would not be a big deal at the moment, but if we want to implement #5 this will be important. @stmichael what do you think?
That's true, data consistency is a problem. I just opened this ticket so we don't forget about it. As I said, there is no hard limit on the number of commits per transaction; that depends on the resources of your system. I propose we leave this ticket open as an idea and postpone it until it is actually needed.
I'm fine with that.
Currently each import block runs in a single transaction. This may cause issues with large datasets. My research showed that there is practically no hard limit on the number of statements per transaction; the bottleneck is the transaction log (or write-ahead log), which consumes a lot of memory.
I propose that we split the whole import into a configurable number of transactions. That way the memory usage of the transaction log stays low. The performance loss should be acceptable if we keep the number of statements per transaction above a few thousand.
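As a rough illustration of the batching idea, here is a minimal sketch assuming generic Python DB-API 2.0 connections for the source (MSSQL) and target (Postgres). All names (`migrate_table`, `batch_size`, `select_sql`, `insert_sql`) are hypothetical and not part of this project's code; the point is only to show committing once per batch instead of once per table.

```python
# Sketch of batched commits, assuming generic DB-API 2.0 connections.
# Names below are illustrative, not part of this project.

def migrate_table(source_conn, target_conn, select_sql, insert_sql, batch_size=10_000):
    """Copy rows in batches, committing once per batch so the
    transaction (write-ahead) log stays small."""
    src = source_conn.cursor()
    dst = target_conn.cursor()
    src.execute(select_sql)

    while True:
        rows = src.fetchmany(batch_size)
        if not rows:
            break
        dst.executemany(insert_sql, rows)
        # Commit after each batch: memory use is bounded by batch_size,
        # at the cost of losing all-or-nothing semantics for the table.
        target_conn.commit()
```

The trade-off discussed above shows up in the final comment of the loop: a failure mid-import now leaves earlier batches committed, which is why data consistency (and #5) matters if this is ever implemented.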