volcano restarts due to panic #3710
Comments
Is this from the volcano repo?
Yes.
Did you install volcano-admission correctly?
Did you add a custom plugin?
Yes, and I added a custom plugin.
Yes. I've been using the new plugin for two years and only found this problem recently.
From the picture you pasted, it seems to be caused by your own plugin. Maybe you can check the custom plugin first.
Description
When Volcano is used for scheduling, it restarts repeatedly. Restart cause: an out-of-range index access triggers a panic.
Error details:
kubectl logs -f -n volcano-system volcano-scheduler-xx -p
2024/08/27 10:47:39 maxprocs: Updating GOMAXPROCS=20: determined from CPU quota
W0827 10:47:39.514559 1 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0827 10:48:01.142318 1 trace.go:205] Trace[1974509093]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169 (27-Aug-2024 10:47:39.584) (total time: 21557ms):
Trace[1974509093]: ---"Objects listed" error: 21521ms (10:48:01.105)
Trace[1974509093]: [21.55737372s] [21.55737372s] END
I0827 10:48:03.669284 1 trace.go:205] Trace[585251636]: "DeltaFIFO Pop Process" ID:kube-system/mindx-dl-deviceinfo-kcs-haerbin-agi-s-thbdc,Depth:16,Reason:slow event handlers blocking the queue (27-Aug-2024 10:48:03.472) (total time: 195ms):
Trace[585251636]: [195.826968ms] [195.826968ms] END
E0827 10:48:10.501173 1 cache.go:1245] error occurred in updating Queue : Operation cannot be fulfilled on queues.scheduling.volcano.sh "default": the object has been modified; please apply your changes to the latest version and try again
E0827 10:48:10.501199 1 session.go:216] failed to update queue status: Operation cannot be fulfilled on queues.scheduling.volcano.sh "default": the object has been modified; please apply your changes to the latest version and try again
I0827 10:48:13.363051 1 trace.go:205] Trace[616662840]: "DeltaFIFO Pop Process" ID:kube-system/mindx-dl-deviceinfo-kcs-haerbin-agi-s-jh95z,Depth:32,Reason:slow event handlers blocking the queue (27-Aug-2024 10:48:13.181) (total time: 181ms):
Trace[616662840]: [181.390665ms] [181.390665ms] END
E0827 10:48:24.691004 1 cache.go:1245] error occurred in updating Queue : Operation cannot be fulfilled on queues.scheduling.volcano.sh "default": the object has been modified; please apply your changes to the latest version and try again
E0827 10:48:24.691046 1 session.go:216] failed to update queue status: Operation cannot be fulfilled on queues.scheduling.volcano.sh "default": the object has been modified; please apply your changes to the latest version and try again
panic: runtime error: index out of range [4] with length 4
goroutine 81093 [running]:
k8s.io/api/core/v1.(*PodStatus).DeepCopyInto(0x402dd59be8, 0x4019da0af8)
/devcloud/slavespace/slave1-new/workspace/j_CmcZc9ms/pkg/mod/k8s.io/[email protected]/core/v1/zz_generated.deepcopy.go:3982 +0x5cc
k8s.io/api/core/v1.(*Pod).DeepCopyInto(0x402dd598f0, 0x4019da0800)
/devcloud/slavespace/slave1-new/workspace/j_CmcZc9ms/pkg/mod/k8s.io/[email protected]/core/v1/zz_generated.deepcopy.go:3309 +0x100
k8s.io/api/core/v1.(*Pod).DeepCopy(...)
/devcloud/slavespace/slave1-new/workspace/j_CmcZc9ms/pkg/mod/k8s.io/[email protected]/core/v1/zz_generated.deepcopy.go:3319
volcano.sh/volcano/pkg/scheduler/cache.(*defaultEvictor).Evict(0x400088e2e8, 0x402dd598f0, {0x4046557f68, 0x4})
/devcloud/slavespace/slave1-new/workspace/j_CmcZc9ms/src/volcano.sh/volcano/pkg/scheduler/cache/cache.go:209 +0x1a4
volcano.sh/volcano/pkg/scheduler/cache.(*SchedulerCache).Evict.func1()
/devcloud/slavespace/slave1-new/workspace/j_CmcZc9ms/src/volcano.sh/volcano/pkg/scheduler/cache/cache.go:751 +0x44
created by volcano.sh/volcano/pkg/scheduler/cache.(*SchedulerCache).Evict
/devcloud/slavespace/slave1-new/workspace/j_CmcZc9ms/src/volcano.sh/volcano/pkg/scheduler/cache/cache.go:750 +0x338
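A possible reading of this trace, which nothing in this thread confirms: the panic is raised inside a deep copy of PodStatus that runs in a goroutine started by SchedulerCache.Evict. A copy that first sizes its output from a slice's length and then ranges over the same slice can hit "index out of range [4] with length 4" if another goroutine appends to that slice in between. If the custom plugin mutates the same Pod object without synchronization, that would fit. The Go sketch below uses only the standard library and simplified stand-in types (Condition, Status, and GuardedPod are hypothetical, not Volcano or Kubernetes types) to show one way to keep writers and copiers from overlapping.

```go
package main

import (
	"fmt"
	"sync"
)

// Condition and Status are simplified, hypothetical stand-ins for the
// pod condition and pod status types; they are not the Kubernetes types.
type Condition struct {
	Type   string
	Status string
}

type Status struct {
	Conditions []Condition
}

// DeepCopy sizes the output from the input's length and then ranges over
// the input again. If another goroutine appends between those two reads,
// the loop index can run past the output's length and panic.
func (s *Status) DeepCopy() *Status {
	out := &Status{}
	if s.Conditions != nil {
		out.Conditions = make([]Condition, len(s.Conditions))
		for i := range s.Conditions {
			out.Conditions[i] = s.Conditions[i]
		}
	}
	return out
}

// GuardedPod is a hypothetical wrapper that serializes writers and
// copiers of one shared status with a read/write lock.
type GuardedPod struct {
	mu     sync.RWMutex
	status Status
}

// AddCondition appends under the write lock.
func (g *GuardedPod) AddCondition(c Condition) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.status.Conditions = append(g.status.Conditions, c)
}

// Snapshot deep-copies under the read lock, so no append can interleave.
func (g *GuardedPod) Snapshot() *Status {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return g.status.DeepCopy()
}

func main() {
	pod := &GuardedPod{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func(n int) {
			defer wg.Done()
			pod.AddCondition(Condition{Type: fmt.Sprintf("c%d", n), Status: "True"})
		}(i)
		go func() {
			defer wg.Done()
			_ = pod.Snapshot()
		}()
	}
	wg.Wait()
	fmt.Println(len(pod.Snapshot().Conditions))
}
```

Whether this applies here depends on what the custom plugin does with the Pod objects it receives. Building and testing the plugin with Go's race detector (the -race flag of go test and go build) is one way to check for unsynchronized access.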
Steps to reproduce the issue
Describe the results you received and expected
Received: the scheduler panics and restarts. Expected: the panic is resolved.
What version of Volcano are you using?
1.7
Any other relevant information
No response