[proposal] cpu burst should also burst some pods when the node util nears threshold #2164
Labels
area/koordlet
help wanted
kind/proposal
What is your proposal:
When node utilization is close to cpuBurst.sharePoolThresholdPercent, koordlet should not reset every bursted pod; it should instead select some pods and burst them.
Why is this needed:
The CPU burst strategy was proposed to solve CPU throttling for pods, such as Java applications, that need more CPU than their resource limit during startup.
When multiple pods start at the same time, the strategy may not handle it well. Currently, if node utilization > cpuBurst.sharePoolThresholdPercent, every bursted cfs_quota is reset, which may drop node utilization far below cpuBurst.sharePoolThresholdPercent. koordlet will try to burst again only after the cooling duration has finished.
As a result, no pod can get burst for a long time, and the pods stay stuck in the starting stage.
Here are some related issues: kubernetes/kubernetes#3312
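The cycle described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not koordlet code: the function name `simulateResetAll`, the tick-based loop, and the utilization numbers are all hypothetical. On each tick every pod is bursted unless a cooling period is active, and when the bursted utilization exceeds the threshold all bursts are reset and cooling starts.

```go
package main

import "fmt"

// simulateResetAll is a hypothetical toy model of "reset everything" behavior.
// It returns the number of ticks in which pods were allowed to burst.
// All parameters are illustrative; utilization values are percentages.
func simulateResetAll(ticks, coolingTicks, pods int, baseUtil, burstUtilPerPod, threshold float64) int {
	cooling := 0
	burstedTicks := 0
	for t := 0; t < ticks; t++ {
		if cooling > 0 {
			// Cooling period: no pod is bursted on this tick.
			cooling--
			continue
		}
		// Not cooling: every pod is bursted for this tick.
		burstedTicks++
		util := baseUtil + float64(pods)*burstUtilPerPod
		if util > threshold {
			// Over the threshold: reset all bursts and start cooling.
			cooling = coolingTicks
		}
	}
	return burstedTicks
}

func main() {
	// Three starting pods push utilization to 70%, above a 65% threshold.
	fmt.Println(simulateResetAll(20, 4, 3, 40, 10, 65))
	// Two starting pods stay at 60%, below the threshold.
	fmt.Println(simulateResetAll(20, 4, 2, 40, 10, 65))
}
```

In this model, three pods get burst in only 4 of 20 ticks, while two pods get burst in all 20, which is the gap a partial reset is meant to close.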
Is there a suggested solution, if so, please add it:
Reset only PART of the over-utilized pods when node.util > cpuBurst.sharePoolThresholdPercent, so that some pods can still get burst.
Burst only SOME of the throttled pods when node.util < cpuBurst.sharePoolThresholdPercent, so that node utilization does not climb too high.
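One possible shape for the two selection steps is sketched below. Everything here is an assumption made for illustration: `PodState`, `headroomMilli`, `selectForReset`, and `selectForBurst` are hypothetical names rather than koordlet APIs, and the ordering policies (reset the largest bursts first, burst the smallest demands first) are just one choice among several.

```go
package main

import (
	"fmt"
	"sort"
)

// PodState is a hypothetical, simplified view of one pod; it is not a koordlet type.
type PodState struct {
	Name       string
	BurstMilli int64 // CPU (milli-cores) currently granted above the pod limit
	WantMilli  int64 // extra CPU (milli-cores) a throttled pod is asking for
}

// headroomMilli returns the CPU left below the threshold (negative when over it).
func headroomMilli(capacityMilli, usedMilli, thresholdPercent int64) int64 {
	return capacityMilli*thresholdPercent/100 - usedMilli
}

// selectForReset picks bursted pods, largest burst first, until the
// reclaimed CPU covers excessMilli. Pods that are not picked keep their burst.
func selectForReset(pods []PodState, excessMilli int64) []string {
	sorted := append([]PodState(nil), pods...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].BurstMilli > sorted[j].BurstMilli
	})
	var picked []string
	for _, p := range sorted {
		if excessMilli <= 0 || p.BurstMilli <= 0 {
			break
		}
		picked = append(picked, p.Name)
		excessMilli -= p.BurstMilli
	}
	return picked
}

// selectForBurst picks throttled pods, smallest demand first, as long as
// their demand still fits into roomMilli.
func selectForBurst(pods []PodState, roomMilli int64) []string {
	sorted := append([]PodState(nil), pods...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].WantMilli < sorted[j].WantMilli
	})
	var picked []string
	for _, p := range sorted {
		if p.WantMilli <= 0 {
			continue
		}
		if p.WantMilli > roomMilli {
			break // every remaining pod wants at least as much
		}
		picked = append(picked, p.Name)
		roomMilli -= p.WantMilli
	}
	return picked
}

func main() {
	bursted := []PodState{
		{Name: "a", BurstMilli: 500},
		{Name: "b", BurstMilli: 2000},
		{Name: "c", BurstMilli: 1000},
	}
	// 16 cores with a 50% threshold and 9.2 cores used: 1200m over the threshold.
	excess := -headroomMilli(16000, 9200, 50)
	fmt.Println("reset:", selectForReset(bursted, excess))

	throttled := []PodState{
		{Name: "x", WantMilli: 1000},
		{Name: "y", WantMilli: 400},
		{Name: "z", WantMilli: 800},
	}
	// Same node with 6.5 cores used: 1500m of room below the threshold.
	room := headroomMilli(16000, 6500, 50)
	fmt.Println("burst:", selectForBurst(throttled, room))
}
```

With these example numbers only pod b is reset, and only pods y and z are bursted. Other orderings, for example by pod priority or by how long a pod has been throttled, would fit the same structure.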