
Event source cannot handle a 200+ file run. #16

Open
sizun opened this issue Jun 22, 2021 · 12 comments
@sizun (Contributor) commented Jun 22, 2021

Crash caused by excessive memory consumption.

In [1]: from ctapipe_io_nectarcam import NectarCAMEventSource
In [6]: input_url = '/data/nvme/ZFITS/2021/20210618/NectarCAM.Run2521.[0-9][0-9][0-9][0-9].fits.fz'
In [7]: source = NectarCAMEventSource(input_url=input_url, n_gains=1)
Killed
@maxnoe (Member) commented Jun 16, 2023

I recently rewrote the MultiFiles of ctapipe_io_lst to handle this nicely.

The only downside is that you can no longer know the __len__ of the MultiFiles / the source, since subruns are loaded on the fly.
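The lazy-loading idea can be sketched as follows. This is a minimal illustration, not the actual ctapipe_io_lst MultiFiles implementation; LazyMultiFiles and open_file are hypothetical names, with open_file standing in for the real zfits reader:

```python
from itertools import chain


class LazyMultiFiles:
    """Iterate over events from many subrun files, opening them
    one at a time instead of all at once.

    Minimal sketch of the lazy-loading idea: file opening is
    abstracted behind ``open_file``, which would be a zfits
    reader in the real event source.
    """

    def __init__(self, paths, open_file=None):
        self.paths = list(paths)
        # open_file(path) -> iterable of events for that file
        self._open = open_file or (lambda p: [])

    def __iter__(self):
        # chain over a generator expression: each file is opened
        # only when the previous one is exhausted, so memory use
        # stays flat regardless of the number of subruns
        return chain.from_iterable(self._open(p) for p in self.paths)
```

The trade-off is exactly the one mentioned above: since no file is opened up front, the total event count (and hence __len__) is unknown until everything has been read.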

@maxnoe (Member) commented Jun 16, 2023

@tibaldo (Member) commented Jun 16, 2023

Thank you @maxnoe. You are just suggesting we copy this to ctapipe_io_nectarcam, is that right?

@maxnoe (Member) commented Jun 16, 2023

If that would solve your issue, sure, go ahead; you probably only need to adapt the regexes / patterns for the filenames.

A more general version of this could be in the common event source, but at least in the next one or two weeks I probably don't have time to start that.

@mdpunch (Contributor) commented May 5, 2024

Question: Did this get implemented?

On the CC-IN2P3 Jupyter platform, which is limited to 2 GB of memory unless you request more, doing a full wildcard pattern makes memory usage explode. So I'm doing my own glob, looping over the files 4 at a time and passing them with input_filelist. [Subsidiary question: how do we know how many files to open at a time? The previous EVB split a run across 2 files, I think, and it's now 4, I see, with EVBv6.]

If this can be fixed instead by using Max's code, I could implement it... unless Luigi prefers to.
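The batching workaround described above can be sketched like this. batches and chunked_file_lists are hypothetical helper names, and the default of 4 files per batch assumes EVBv6, as discussed later in this thread (2 for earlier EVB versions):

```python
import glob


def batches(files, n_streams=4):
    """Split a file list into sorted batches of n_streams files,
    one batch per set of parallel EVB data streams.

    n_streams=4 assumes EVBv6; earlier EVB versions used 2.
    """
    files = sorted(files)
    return [files[i:i + n_streams] for i in range(0, len(files), n_streams)]


def chunked_file_lists(pattern, n_streams=4):
    """Glob a run pattern and yield batches suitable for passing
    as input_filelist, avoiding one huge wildcard expansion."""
    yield from batches(glob.glob(pattern), n_streams)
```

Each yielded batch can then be handed to the event source in turn, so only n_streams files are ever open at once.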

@tibaldo (Member) commented May 5, 2024

Hi @mdpunch, I will not be able to work on this in the coming weeks. Please go ahead and implement it if you want.

@vmarandon (Contributor) commented:

2 GB is quite low. You won't do much with it... :-)
(I could never reproduce the issue with my MacBook Pro M2 on a recent night-long run. The only issue was the too-large number of open files, which I circumvented with a ulimit.)
The run we took for you is only 4 files, so I'm surprised it complains with the wildcard pattern while your approach makes it work...

@mdpunch (Contributor) commented May 5, 2024

Hi Vincent,

I can ask the CC for more, I guess. I have lots more on my PC, but not enough disk space for the other set of runs I'm analyzing, which are the "Throttler" runs where there are a few tens of files.

Even with my trick (and probably also with Max's code) I still hit the 2 GB limit after 20 files or so (even with deleting objects along the way), so there is maybe some kind of memory leak (though I thought Python was better about that).

So, I'll do both things: look into implementing Max's code in NectarCAM, and ask the CC for more memory.

BTW, do you know where we can find out how many files the data stream is spread across at a time for a given EVB version?

@vmarandon (Contributor) commented:

I don't think this is possible. To me the number of data streams is "arbitrary". Nevertheless, as far as I know we had 2 for data before EVBv6 and 4 after (@sizun: is that correct?), so you can use that.
You can identify the type of EVB from the file directly (see how I did it in NectarCAMEventSource PR #51). If you're lazy, you can open one file with this version of NectarCAMEventSource and check whether the data predate v6 using the pre_v6_data property.
(What I don't like about Max's solution is that __len__ no longer works.)

Have you tried getting the information one file at a time and calling the garbage collector at the end of each file?
You could then re-order the information in memory. (I assume that you are interested only in trigger times, so you should have quite some margin with 2 GB.)
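A minimal sketch of this one-file-at-a-time approach. collect_trigger_times and read_times are hypothetical names; read_times stands in for whatever actually extracts trigger times from a single zfits file:

```python
import gc


def collect_trigger_times(paths, read_times):
    """Read one file at a time, keep only the trigger times, and
    force a garbage-collection pass between files so the full
    event payloads never accumulate in memory.

    read_times(path) -> iterable of trigger times for that file.
    """
    times = []
    for path in paths:
        times.extend(read_times(path))
        gc.collect()  # release the per-file reader's buffers now
    # events from the parallel data streams arrive interleaved
    # across files, so re-order everything at the end
    times.sort()
    return times
```

Only the (small) list of times survives from file to file, which is what gives the memory margin mentioned above.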

@sizun (Contributor, Author) commented May 6, 2024

To me the number of data streams is "arbitrary". Nevertheless, as far as I know we had 2 for data before EVBv6 and 4 after (@sizun: is that correct?)
Yes

@maxnoe (Member) commented May 6, 2024

What I don't like about Max's solution is that __len__ no longer works

You can certainly fix that; we don't really need it at the moment, so I didn't bother. It would increase the startup time by quite a bit, but you could build the list of files matching the given options in __init__ and sum the ZNAXIS2 header keyword over all files to get the total number of events.

You could also do that lazily, only if someone actually asks for __len__.
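A rough sketch of such a lazy __len__. EventCount and header_reader are hypothetical names; header_reader stands in for a real header lookup (e.g. via astropy.io.fits), and ZNAXIS2 is the zfits keyword holding the number of table rows (events) per file:

```python
class EventCount:
    """Compute len() lazily: headers are read only the first
    time __len__ is actually called, then cached.

    header_reader(path) -> mapping of FITS header keywords,
    including ZNAXIS2 (events per file).
    """

    def __init__(self, paths, header_reader):
        self.paths = list(paths)
        self._read_header = header_reader
        self._n_events = None  # cache, filled on first __len__

    def __len__(self):
        if self._n_events is None:
            # one header read per file; done once, then cached,
            # so callers that never ask pay no startup cost
            self._n_events = sum(
                self._read_header(p)["ZNAXIS2"] for p in self.paths
            )
        return self._n_events
```

This keeps the fast startup of the lazy MultiFiles approach while still letting code that needs a progress bar or pre-allocation call len() on the source.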

@mdpunch (Contributor) commented May 6, 2024

Dirk tells me that the number of streams is indeed arbitrary, and he won't bother to add an element to the header saying which number to use, because the whole thing will be deprecated when the ADH replaces it (which I guess is "soon" but on CTAO time-scales).

If you're lazy, you can open one file with this version of NectarCAMEventSource and get the information on data before v6 using the pre_v6_data property.

I'll look into that.

You could then re-order the information from memory. (I assume that you are interested only in trigger times, so you should be able to have quite some margin with 2GB)

Indeed, I could read them, get the times, and then do a big ol' sort. But I was trying to do things properly (as Voltaire said, "Il meglio è l'inimico del bene", i.e. the best is the enemy of the good).

Anyway, I have now 16GB memory on the CC Jupyter hub, so it's no longer a problem, but I will still look at making it a little bit more automatic.
