Improve Error Messages? #3939

Open
klannon opened this issue Sep 20, 2024 · 1 comment

Comments

@klannon

klannon commented Sep 20, 2024

As reported by @ywan2:
Error messages like this:

accumulating task id 25313 item accum_25312 with 944356 events on d12chas569.crc.nd.edu. return code 0 (monitor error)
allocated cores: 2.0, memory: 10800 MB, disk 14400 MB, gpus: 0.0
measured cores: -1.0, memory: -1 MB, disk -1 MB, gpus: -1.0, runtime 204.0 s
WARNING: task id 25313 item accum_25312 failed: monitor error
accum_25312 without result. accumulating: [('UL16APV_WWTo2L2Nu_NDSkim',
'file:///project01/ndcms//store/user/awightma/skims/mc/new-lepMVA-v2/central_bkgd_p1/2016APV/v1/UL16APV_WWTo2L2Nu/output_540.root', 'Events', 76218, 101624) p_25095 with result.

are difficult to interpret. As I understand it, a message like this can be generated when the monitor fails to receive data, possibly because disk space was exhausted or because some other problem kept the monitor from extracting monitoring data (see the -1 values in the message). It would be better if the message clearly stated that the error comes from the monitor's inability to extract the necessary data.
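To make the request concrete, here is a minimal sketch of the kind of message I have in mind. The helper name `explain_monitor_error` and the dictionary layout are hypothetical, not part of coffea, Work Queue, or TaskVine. It also assumes that a negative measured value means the monitor never recorded that resource, which is my reading of the -1 values above.

```python
from typing import Mapping


def explain_monitor_error(measured: Mapping[str, float]) -> str:
    """Build a clearer message for a task that failed with 'monitor error'.

    Hypothetical helper for illustration only; not an existing API.
    """
    # Assumption: a negative value means the monitor recorded nothing
    # for that resource (the log above shows -1 for every measurement).
    missing = sorted(name for name, value in measured.items() if value < 0)
    if not missing:
        return "monitor error: measurements were collected; cause unknown"
    return (
        "monitor error: the resource monitor could not collect "
        + ", ".join(missing)
        + " (reported as -1); one possible cause is exhausted disk space on the worker"
    )


# Values copied from the 'measured' line of the log above.
measured = {"cores": -1.0, "memory": -1, "disk": -1, "gpus": -1.0}
print(explain_monitor_error(measured))
```

The point is only that the message names the resources the monitor failed to measure, rather than leaving the reader to infer that from the -1 values.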

Note: I believe (@ywan2 can confirm) this error message was generated with coffea 0.7 + WQ, so if that makes it obsolete because it's fixed in the new coffea + DaskVine code, then you can close this issue without objection from me.

@dthain
Member

dthain commented Sep 23, 2024

@btovar what do you think?
