Hi @trantor -- you might check out Another idea is to maybe swap the role of the 'left' and 'right' file. Without |
-
Hello everyone.
I am processing a multi-GB CSV file, attempting to `join` it with a different CSV file of just a few lines... and quickly I've got to abort `mlr`, since the RSS of the process skyrockets to multi-GB levels as well, considerably larger than the original CSV.

Much the same thing happens when one tries, for instance, a `grep -f` with dozens of regex patterns (not fixed strings), so I am not exactly surprised. However, since the problem of `join`ing (essentially to filter and associate down to a relatively small number of records) on sizeable CSV files is relatively common, I would be curious about alternative approaches to the problem.
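
For illustration, here is a minimal sketch of one such alternative in Python: hold only the few-line file in memory as a lookup table and stream the multi-GB file past it one record at a time, so memory use should scale with the small file rather than the large one. This is not how `mlr` implements `join`. The function name `stream_join`, the column names, and the sample data are all hypothetical, and the sketch assumes both files have a header row and share a single key column.

```python
import csv
import io


def stream_join(small_file, large_file, key):
    """Yield merged rows for records of large_file whose key appears in small_file."""
    # Index the small file in memory: key value -> row dict.
    # Assumption: keys in the small file are unique (a later duplicate overwrites an earlier one).
    lookup = {}
    for row in csv.DictReader(small_file):
        lookup[row[key]] = row

    # Stream the large file one record at a time; only `lookup` stays resident.
    for row in csv.DictReader(large_file):
        match = lookup.get(row[key])
        if match is not None:
            # Fields from the small file first, then fields from the large file.
            yield {**match, **row}


# Tiny in-memory stand-ins for the two CSV files (hypothetical data).
small = io.StringIO("id,label\n2,two\n4,four\n")
large = io.StringIO("id,value\n1,a\n2,b\n3,c\n4,d\n2,e\n")

joined = list(stream_join(small, large, "id"))
for record in joined:
    print(record)
```

With real files you would pass handles opened with `open(path, newline="")` instead of the `io.StringIO` objects, and write each yielded row out immediately (for example with `csv.DictWriter`) rather than collecting them in a list.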