HTTP 503s from Azure Devops Test Result API in dnceng-public/public #11872
Comments
@MichalStrehovsky I'm not sure this should be using the "known build error" feature, but I will investigate and let you know what's happening either way.
Thanks! Sorry, I just followed the "Report infrastructure issue" link from build analysis and this was the template. None of the labels were intentional.
From the logs of this work item's last attempt, the problem is that although the tests passed, the server returned HTTP 503 when the work item tried to insert test results into Azure DevOps:
This is definitely an infrastructure problem with our Azure DevOps org. When this happens and all the retries are exhausted, the work item reports a -4 exit code (it has to; runtime tests return a 0 exit code even when tests fail, so we need to make sure that any failures are detected) and the work item then goes to another machine for a retry. When this retry happened, the problem kept occurring; after the third attempt the work item is considered un-runnable and goes to deadletter, as you've seen. I'll retitle this issue to reflect what's going on, and create a ticket asking for investigation. Please let us know if this becomes 100% repro (we'll keep an eye on it too); the severity of the incident can be bumped up in that case.
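The flow described above can be sketched roughly as follows. This is an illustrative model only, not the actual Helix implementation: the function names, the `upload_results` callback, and `MAX_UPLOAD_RETRIES` are assumptions made for the example. Only the -4 exit code and the three-attempt limit come from the comment above.

```python
from typing import Callable

UPLOAD_FAILED_EXIT_CODE = -4  # from the thread: reported once upload retries are exhausted
MAX_WORK_ITEM_ATTEMPTS = 3    # from the thread: dead-lettered after the third attempt
MAX_UPLOAD_RETRIES = 5        # assumption: the real retry count is not stated in the thread


def run_attempt(run_tests: Callable[[], int], upload_results: Callable[[], int]) -> int:
    """Model one attempt on one machine and return the exit code it reports."""
    # Runtime tests return 0 even when tests fail, so this exit code is not
    # trusted; failures are only visible through the uploaded results.
    run_tests()
    for _ in range(MAX_UPLOAD_RETRIES):
        status = upload_results()  # hypothetical callback returning an HTTP status code
        if 200 <= status < 300:
            return 0
    # Every upload failed (for example with HTTP 503), so report a failure
    # rather than silently dropping the test results.
    return UPLOAD_FAILED_EXIT_CODE


def run_work_item(run_tests: Callable[[], int], upload_results: Callable[[], int]) -> str:
    """Model the retry of a work item across machines, ending in deadletter."""
    for attempt in range(1, MAX_WORK_ITEM_ATTEMPTS + 1):
        if run_attempt(run_tests, upload_results) == 0:
            return f"passed on attempt {attempt}"
    return "deadletter"
```

In this model, a work item whose tests pass but whose uploads always return 503 ends in deadletter, which matches the mismatch reported in this issue: the console log says success while build analysis says deadletter.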
@MichalStrehovsky I created https://portal.microsofticm.com/imp/v3/incidents/details/353884008 to ask Azure DevOps to investigate. Looking at the time window where your test failed, this affected only about 0.2% of the work items in the same 2-hour period, so unless it starts happening more consistently we'll keep this at severity 3.
Put in tracking until we hear back from the IcM.
Checked in today. No update on the IcM since 12/9, when it was assigned to the DRI; pinged for status.
Checked in again, graphed values over time, and made myself available to meet.
I met with two folks from the IDC Azure Test Results team today and was able to show them several examples of 503s being hit in our logging. They will investigate and get back to me.
No updates since the 1/4 meeting, so I pinged the IcM.
IcM seems to have gained some traction with related Azure teams. @ilyas1974 we only saw 2 of these in the last week and both succeeded via Helix infra retry. Should we just close it?
Checked in today. 614 instances in the past 10 days. Pinged IcM.
How does this issue differ from #11723, other than not tracking failures automatically here?
This is about two different services. While ostensibly both "Azure Devops", this is about 503s when posting test results from Helix machines, while #11723 is about NuPkg feeds. They are handled by different teams despite being in the same overall organization, so I think it merits being tracked separately.
Assigning to @ilyas1974 as I am no longer able to care about this issue.
Duplicate of #11723
Build
https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=104274
Build leg reported
System.Net.Http.Json.Unit.Tests.WorkItemExecution
Pull Request
dotnet/runtime#79332
Action required for the engineering services team
To triage this issue (First Responder / @dotnet/dnceng):
If this issue is causing build breaks across multiple builds and would benefit from being listed on the build analysis check, follow the next steps:
Release Note Category
Release Note Description
Additional information about the issue reported
If I follow the link from https://github.com/dotnet/runtime/pull/79332/checks?check_run_id=9936286963 to the console log, the console log says success. Build analysis said deadletter.
https://helix.dot.net/api/2019-06-17/jobs/878ed4a3-d870-465c-9160-161f619b846d/workitems/System.Net.Http.Json.Unit.Tests/console
Report
Summary