[BUG]: PowerShell task idles before generating script #4758
Comments
Same here, except in our case the agents simply time out jobs before they generate the script. Reboots help, but only for a short time before things start hanging again. Ironically, the WMIC call seems to be meant to measure CPU, yet running 50 of them in parallel is what maxes out the CPU.
Hi @jwikman, thanks for reporting. It looks like this is related to the agent trying to get memory usage for each context; since there are a lot of contexts in your case, it might produce more WMIC processes than expected. We will continue to investigate.
Hi @ismayilov-ismayil, I was running a very simple "Hello World" pipeline, with System Diagnostics enabled, at the same time that we were experiencing this issue.
I ran v3.232.4 for a few hours: 50+ pipelines with 1000+ tasks, and all looked very good. Then I upgraded back to v3.236.1 and now all the WMIC processes are back; hundreds of processes are being executed almost all the time. I also see a lot of CPU spikes that weren't there on v3.232.4. After looking into the code, I thought I could set How do I disable such a feature? Meanwhile, I'll revert to v3.232.4.
Last night I updated to v3.238.0 to see if anything had changed. But after three agents were updated, the CPU started to hit 100% about 50% of the time, seemingly because of all the WMIC processes... @ismayilov-ismayil, any progress on this issue?
Right now, I cannot test the first issue because of the second.
Hi @jwikman
OK, thanks @ismayilov-ismayil.
@jwikman, we rolled back recent changes that may be the root cause (RC) of the issue.
@KonstantinTyukalov, in which version of the agent can I expect this to be included?
@jwikman, it should work with your current agent, as we disabled the server-side feature flag.
@KonstantinTyukalov aah, OK. I'll try upgrading our agents later tonight.
This is the current status: all agents are now running v3.238.0, so we can focus on the original problem: the delays when the PowerShell tasks are creating scripts. After I updated all agents, I could still see a lot of these delays, from 40s to several minutes. Hence, I had to restart the build server, and after the reboot the delays are negligible again. Let's see how this progresses over the next few days.
Unfortunately, you can't disable this feature yourself at the moment. Regarding the original issue: this looks like a delay in the PowerShell handler's work to prepare a task for execution.
I see, thanks for letting me know.
OK, I've uploaded new logs to the DC ticket. I created 6 agents in a new pool for this test, on the same server as the other agent pools. Besides these 6 agents, there were 8 other agents running other pipelines on the server. All are running v3.238.0.
I added the variable I few observations:
Could the Just let me know if you need anything else.
Hmmm. I ran my pipeline once again, but without setting the Now the delays are down to about 40s, and the finishing task runs fine. The total execution time went from over 11 minutes down to 2:40. I uploaded those logs as well, in case the system diagnostics option added too much overhead or something.
Just as an update: almost a week since the last reboot, the delay before a pipeline task starts is between 2 and 5 minutes.
Oh, and this may be another clue on this issue: The logs from that step:
As seen above, there is first a delay of 4 minutes before the step actually starts. After rebooting the server I get these times for the imports (20 + 7 seconds):
This issue has had no activity in 180 days. Please comment if it is not actually stale
@KonstantinTyukalov any news on this? We are still rebooting the build server daily as a workaround...
What happened?
This was reported at https://developercommunity.visualstudio.com/t/PowerShell-task-idles-before-generating-/10636846, but I got the suggestion to report it here as well.
We have several self-hosted Azure DevOps agents running on one of our servers.
We’ve seen an issue that has escalated over the last month or so. I believe it correlates with updating all agents to the latest release (v3.236.1); at the same time we started running on PowerShell 7 (with pwsh: true).
After the server is started, our pipelines run fast for several days, or maybe a week or so.
Then we start to see performance problems, where many tasks in the pipeline take much longer to complete.
It seems to be the PowerShell tasks (PowerShell@2) that take longer and longer to start.
When checking the logs, we can see that the PowerShell task header is written to the log, and then the task idles before the text “Generating script” is logged.
Looking at old logs, I can see that it used to take about 3 or 4s before “Generating script” was printed.
But when the server slows down, this time keeps increasing until some colleague yells about it. Yesterday there were about 10 minutes between those two steps.
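For anyone who wants to put a number on this gap across many logs, here is a minimal Python sketch. It assumes each log line starts with a UTC timestamp followed by the message, and that the task header contains "Starting:"; both are assumptions about the log layout, and the helper names `parse_line` and `startup_delay` are made up for this example.

```python
import re
from datetime import datetime, timedelta

# Assumed log line shape: "<UTC timestamp>Z <message>".
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z\s+(.*)$"
)


def parse_line(line):
    """Return (timestamp, message), or None if the line has no timestamp prefix."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    base, fraction, message = match.groups()
    stamp = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    if fraction:
        # Keep microsecond precision; any extra digits are dropped.
        stamp += timedelta(microseconds=int(fraction[:6].ljust(6, "0")))
    return stamp, message


def startup_delay(log_lines, start_marker="Starting:", ready_marker="Generating script"):
    """Seconds between the first start_marker line and the next ready_marker line."""
    started = None
    for line in log_lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, message = parsed
        if started is None and start_marker in message:
            started = stamp
        elif started is not None and ready_marker in message:
            return (stamp - started).total_seconds()
    return None  # one of the markers was never seen


# Made-up sample lines, not taken from the real logs.
sample = [
    "2024-04-18T07:12:01.1000000Z ##[section]Starting: PowerShell",
    "2024-04-18T07:12:01.2000000Z Task         : PowerShell",
    "2024-04-18T07:22:03.1000000Z Generating script.",
]
print(startup_delay(sample))
```

Running this over the task logs from before and after a reboot would give one delay per task, which makes the drift easier to chart than eyeballing timestamps.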
If I restart the server, all is fine for some days, and then it starts again.
When monitoring the server, I can see that the pipeline agents create a LOT (50 to 100 or so) of WMIC processes, which seems to cause high CPU load on WmiPrvSE.exe… Is this expected behavior? They all just seem to check the available memory on the host.
Yesterday, just before restarting the server, I saw the above list of WMIC processes, and they just seemed to live on.
Today, I still see a lot of WMIC processes being started and we get a CPU spike in WmiPrvSE, but after a few seconds (3 to 4) they are terminated, and the CPU load goes down.
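To track the WMIC process count over time instead of watching Task Manager, something like the following Python sketch could be scheduled on the build server. It parses the output of `tasklist /FO CSV /NH`; the helper names `count_processes` and `snapshot` are made up for this example, and the sample rows are invented, not captured from the server.

```python
import csv
import io
import subprocess


def count_processes(tasklist_csv, image_name="WMIC.exe"):
    """Count rows in `tasklist /FO CSV /NH` output whose image name matches."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    wanted = image_name.lower()
    # The first CSV column is the image name; compare case-insensitively.
    return sum(1 for row in rows if row and row[0].lower() == wanted)


def snapshot(image_name="WMIC.exe"):
    """Windows only: run tasklist and count matching processes right now."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return count_processes(result.stdout, image_name)


# Invented sample output for illustration.
sample = (
    '"WMIC.exe","4312","Services","0","9,812 K"\n'
    '"WmiPrvSE.exe","2208","Services","0","31,004 K"\n'
    '"wmic.exe","5120","Services","0","9,640 K"\n'
)
print(count_processes(sample))
```

Logging `snapshot()` every few seconds alongside the task delays would show whether the two really rise together.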
We have 12 agents running on the same machine, with 32 cores and 128GB of RAM, but since they are divided into two different agent pools, there are most often no more than 6 agents running at the same time.
I believe I found the code that is starting the WMIC processes:
azure-pipelines-agent/src/Agent.Worker/ResourceMetricsManager.cs
Line 255 in b34a9c3
But why it creates that many processes puzzles me… There is a loop in RunMemoryUtilizationMonitor() that should wait 5s between each check, but it seems that this does not work as intended, or am I mistaken?
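A back-of-the-envelope check of these numbers, as a Python sketch. It is not the agent's code: it only assumes that each monitoring loop starts one WMIC process per 5s interval and that each process lives about 4s (the upper end of the 3 to 4 seconds observed). The function names are made up for this example.

```python
def concurrent_processes(monitors, interval_s, lifetime_s):
    """Average number of live child processes: spawn rate times lifetime."""
    return monitors * lifetime_s / interval_s


def monitors_needed(observed, interval_s, lifetime_s):
    """Invert the estimate: how many polling loops would explain `observed`?"""
    return observed * interval_s / lifetime_s


# One loop per running agent (6 agents) should give only a handful of processes.
print(concurrent_processes(monitors=6, interval_s=5, lifetime_s=4))

# The observed 50 to 100 processes would need far more loops than agents.
print(monitors_needed(observed=50, interval_s=5, lifetime_s=4))
print(monitors_needed(observed=100, interval_s=5, lifetime_s=4))
```

Under these assumptions, six loops would keep roughly 5 processes alive on average, while 50 to 100 live processes would need about 62 to 125 loops. So either there are many more loops than agents, or the processes live much longer than a few seconds, or both.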
Versions
Agent version: 3.236.1
Windows Version: Windows Server 2022, v21H2 (OS Build 20348.2402)
Environment type (Please select at least one environment where you face this issue)
Azure DevOps Server type
dev.azure.com (formerly visualstudio.com)
Azure DevOps Server Version (if applicable)
No response
Operating system
No response
Version control system
No response
Relevant log output
See under "What happened" above.