Low memory XML decoding (parsing scans iteratively) #23

Open
wants to merge 2 commits into
base: master

Conversation

lomereiter commented Mar 1, 2017

This PR provides an alternative solution to #13: each scan is parsed once, all necessary information is extracted from it, and then the node is freed.
On a large imzML file this reduced peak memory consumption from 2.5 GB to 90 MB, although processing time increased from 6 s to 13 s.
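
For readers unfamiliar with the pattern, here is a minimal sketch of the parse-extract-free loop described above. It is not the code in this PR: it uses the standard library's `xml.etree.ElementTree.iterparse` rather than lxml, and the function name `iter_scan_attributes`, the `scan_tag` default and the sample document are assumptions made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_scan_attributes(source, scan_tag="spectrum"):
    """Yield a dict of attributes for each scan element, one scan at a time.

    Hypothetical helper for illustration; not part of pyimzML.
    """
    # "end" events fire once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop a leading "{namespace}" prefix, if any, before comparing.
        if elem.tag.rsplit("}", 1)[-1] != scan_tag:
            continue
        # Copy out what we need before the node is emptied.
        info = dict(elem.attrib)
        # clear() drops the element's children, text and attributes.
        # The emptied element itself stays attached to its parent.
        elem.clear()
        yield info


if __name__ == "__main__":
    sample = io.BytesIO(
        b"<run><spectrum id='a' index='0'><x/></spectrum>"
        b"<spectrum id='b' index='1'/></run>"
    )
    for attrs in iter_scan_attributes(sample):
        print(attrs)
```

Because each scan is emptied right after its data is copied out, memory use stays roughly proportional to one scan rather than to the whole file, at the cost of the per-element event handling.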

Member

althonos commented Mar 1, 2017

This seems a lot cleaner than what we hacked through at first. I'll review it as soon as I can.

Member

Tomnl commented Mar 1, 2017

Thanks @lomereiter, this is a great contribution.

No unit tests yet, but it seems to pass the Travis and AppVeyor tests with no problems.

Member

Tomnl commented Apr 3, 2017

Hey @althonos, do you think we should merge this now? Or should we wait until we have the unit test functionality?

Member

althonos commented Apr 3, 2017 via email

Member

althonos commented Apr 3, 2017

Maybe (because of the increased time) we should keep both methods and let the user choose, much like lxml.etree.iterparse accepts a huge_tree parameter.

Member

Tomnl commented Apr 4, 2017

Yeah I think you are right @althonos, keeping both methods seems like the best idea as memory consumption might not be a problem for some.
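
One way the "keep both methods" idea could look is sketched below. This is only an illustration of the trade-off discussed in this thread, not a proposed API: the function `read_scan_ids` and its `low_memory` flag are hypothetical names, and the standard library's `xml.etree.ElementTree` stands in for lxml.

```python
import io
import xml.etree.ElementTree as ET


def read_scan_ids(source, low_memory=False):
    """Return the id of every <spectrum> element.

    Hypothetical function; `low_memory` is an assumed flag name.
    """
    if low_memory:
        # Iterative path: handle each scan as it completes, then empty it.
        ids = []
        for _event, elem in ET.iterparse(source, events=("end",)):
            if elem.tag == "spectrum":
                ids.append(elem.get("id"))
                elem.clear()
        return ids
    # Default path: build the whole tree in memory, then walk it.
    root = ET.parse(source).getroot()
    return [elem.get("id") for elem in root.iter("spectrum")]


if __name__ == "__main__":
    data = b"<run><spectrum id='a'/><spectrum id='b'/></run>"
    print(read_scan_ids(io.BytesIO(data)))
    print(read_scan_ids(io.BytesIO(data), low_memory=True))
```

Both paths return the same result, so a caller with plenty of RAM can keep the default, while a caller with a very large file can opt in to the iterative path.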

3 participants