Race condition 500 Internal Server Error when submitting multiple builds to a directory that has never been used #3358
Comments
Triage: Two issues to solve: 1. Why 500? 2. Return something reasonable in case of a 500.
In my experience, 500 happens when there is an unhandled Python exception. If the webserver runs in debug mode, the exception is shown, but if it is in production mode, it is hidden. If you have a development copr server with debug mode enabled, we could try reproducing there.
I am looking at the code, searching for where this could have happened, and I found c1fa04b -- if this wasn't deployed yet, perhaps this fixed the issue.
Hello @hroncok, We decided to not prioritize this issue for the next 3 months because, although annoying, it seems there should be an easy workaround. I suppose only the reproducer is done via
No, I use parallel to submit thousands of builds. The workaround I use is to resubmit the failed ones later (a bit tricky to figure out which failed, but I can manage). Another workaround is to submit the first one manually and use parallel to submit the rest after.
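The two workarounds above can be combined in a small driver script. This is only a sketch under assumptions: `submit_all` and its `submit` argument are hypothetical names, and `submit` stands for whatever callable actually submits one build and raises an exception on failure (for example a thin wrapper around the copr CLI). The first item is submitted alone so that only one request has to initialise the new directory; the rest go out in parallel, and failures are resubmitted in later rounds.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_all(items, submit, max_workers=8, max_rounds=3):
    """Submit the first item alone, the rest in parallel, and retry failures.

    `submit` is a hypothetical callable that raises on failure.
    Returns the items that still failed after `max_rounds` parallel rounds.
    """
    items = list(items)
    if not items:
        return []

    def attempt(item):
        # Turn "raised an exception" into a plain True/False result.
        try:
            submit(item)
            return True
        except Exception:
            return False

    # Serial first submission: nothing else is racing with it.
    # If it fails, it is simply queued again with the rest.
    pending = [] if attempt(items[0]) else [items[0]]
    pending += items[1:]

    for _ in range(max_rounds):
        if not pending:
            break
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            results = list(pool.map(attempt, pending))
        # Keep only the items whose submission failed, in their original order.
        pending = [item for item, ok in zip(pending, results) if not ok]

    return pending
```

Because the function returns the items that never succeeded, it also answers the "which ones failed" question without parsing any output.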
Probably related to #3372
This happens to me fairly regularly when I run Copr impact checks to see whether an upgrade of some Fedora package breaks anything. I decided to create a smaller reproducer and report it.
Using the copr CLI:
Some of the builds will fail with:
Adding --debug does not reveal much:
Reproducer (uses moreutils-parallel):
Often some of the first builds fail with an error:
If it does not happen to you, repeat with a new directory name ($COPR:custom:2, $COPR:custom:3, ...) until it does.
Use this to cancel the running/pending builds after you run the above, in case you want to preserve resources for others:
I hypothesize that a first build in the custom directory does something special (wrt creating the directory) and when multiple builds think they are first, they all attempt to do the special thing at the same time and some of them get an unhandled exception because of a race condition.
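To illustrate the kind of race hypothesized here, and one common way to make a "create on first use" step safe, below is a minimal sketch. It is not Copr's actual code: the `dirs` table, `init_schema` and `get_or_create_dir` are hypothetical, and SQLite stands in for whatever database the server uses. The idea is to let a UNIQUE constraint pick the winner and to treat the resulting integrity error as "someone else created it first" rather than letting it surface as an unhandled exception.

```python
import sqlite3


def init_schema(conn):
    # Hypothetical table: one row per build directory, names must be unique.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dirs ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
    )


def get_or_create_dir(conn, name):
    """Return the id of the directory row, creating it if it does not exist."""
    try:
        # `with conn` commits on success and rolls back if the INSERT fails.
        with conn:
            conn.execute("INSERT INTO dirs (name) VALUES (?)", (name,))
    except sqlite3.IntegrityError:
        # The row already exists (possibly created by a concurrent request).
        # That is not an error for the caller, so fall through and read it.
        pass
    row = conn.execute("SELECT id FROM dirs WHERE name = ?", (name,)).fetchone()
    return row[0]
```

A "check, then insert" sequence without such handling would behave as described above: two requests can both see that the directory is missing, both try to create it, and the slower one fails.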