Creating jobs through JobFlow and JobTemplate was unsuccessful #2983

Closed
Mufengzhe opened this issue Jul 20, 2023 · 9 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@Mufengzhe
Contributor

What happened:
When I create a JobTemplate and a JobFlow using the examples in the example/jobflow directory of the volcano project, both objects are created successfully, but the Job resources that the JobFlow should create are not. The controller log shows: Failed to create jobs of JobFlow default/test: jobs.batch.volcano.sh is forbidden: User "system:serviceaccount:volcano-system:volcano-controllers" cannot create resource "jobs" in API group "batch.volcano.sh" in the namespace "default"
What you expected to happen:
The Jobs under the JobFlow are created successfully, and their pods are listed by kubectl get po.
How to reproduce it (as minimally and precisely as possible):

# deploy jobTemplate first
cd volcano
kubectl apply -f example/jobflow/JobTemplate.yaml
# deploy jobFlow second
kubectl apply -f example/jobflow/JobFlow.yaml

# check them
kubectl get jt
kubectl get jf
kubectl get po

Anything else we need to know?:
The configuration files in the example/jobflow directory were used.
Environment:

  • Volcano Version: v1.8.0-alpha.0
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools: helm
  • Others:
@Mufengzhe Mufengzhe added the kind/bug Categorizes issue or PR as related to a bug. label Jul 20, 2023
@lowang-bh
Member

What is the output of kubectl get clusterrole volcano-controllers -o yaml?
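For readers comparing against that output: the forbidden error in the report names the verb ("create"), the resource ("jobs"), and the API group ("batch.volcano.sh"), so the rule to look for would resemble the sketch below. This is an illustration derived only from the error message, not a copy of the shipped manifest; the real ClusterRole very likely lists more verbs on this resource, and the fragment assumes it sits inside the existing rules list.

```yaml
# Sketch only: a rule fragment for the "rules:" list of the
# volcano-controllers ClusterRole. Only the "create" verb is taken from the
# error message; any other verbs the real role carries are not shown here.
- apiGroups: ["batch.volcano.sh"]
  resources: ["jobs"]
  verbs: ["create"]
```

If no rule in the output covers this verb, resource, and group together, the API server rejects the request exactly as the controller log shows.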

@lowang-bh
Member

It will work once PR #2951 is merged. You can validate it using that PR.

@Mufengzhe
Contributor Author

That problem is solved, but I hit another unexpected error at the next step. When I run kubectl apply -f example/jobflow/JobFlow.yaml, job-a starts to be created, but its pod stays in ContainerCreating. When I check the log, it shows that the image is still being pulled. The controller log prints:

I0724 02:28:57.686263 1 jobflow_controller_action.go:115] No test-b Job found!
I0724 02:28:57.686274 1 jobflow_controller_action.go:115] No test-b Job found!
I0724 02:28:57.686283 1 jobflow_controller_action.go:115] No test-c Job found!
E0724 02:28:57.688560 1 jobflow_controller_action.go:69] Failed to update status of JobFlow default/test: jobflows.flow.volcano.sh "test" is forbidden: User "system:serviceaccount:volcano-system:volcano-controllers" cannot update resource "jobflows/status" in API group "flow.volcano.sh" in the namespace "default".

I'm not sure whether the endless image pull is related to this error. Is this a bug, or a problem with my environment?

@lowang-bh
Member

@Mufengzhe The error shows that the controller has no permission to update the status of the JobFlow. Did you apply all the changes in that PR?

@Mufengzhe
Contributor Author

Mufengzhe commented Jul 26, 2023

I added the following to ClusterRole.volcano-scheduler.rules in volcano-depolyment.yaml to solve the above problem:
[image]

@lowang-bh
Member

The following is also needed:

  - apiGroups: [ "flow.volcano.sh" ]
    resources: [ "jobflows/status", "jobs/finalizers","jobtemplates/status", "jobtemplates/finalizers"  ]
    verbs: [ "update", "patch" ]

@Mufengzhe
Contributor Author

I tried it: only the following needs to be added for volcano-controllers; no other modifications are needed. Looking forward to the fix in the code.

  - apiGroups: [ "flow.volcano.sh" ]
    resources: [ "jobflows/status", "jobs/finalizers","jobtemplates/status", "jobtemplates/finalizers"  ]
    verbs: [ "update", "patch" ]
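For clusters that cannot yet pick up the fixed manifests, the same rule could in principle be granted without editing the installed ClusterRole, because Kubernetes RBAC permissions are additive across bindings. The sketch below is one possible way to do that, not the fix that was merged: the object names ending in "-jobflow-extra" are hypothetical, while the ServiceAccount name and namespace are taken from the error messages earlier in this thread.

```yaml
# Sketch only: an extra ClusterRole bound to the controller's ServiceAccount.
# "volcano-controllers-jobflow-extra" is a hypothetical name chosen for this
# example; the rule itself is copied from the comment above.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: volcano-controllers-jobflow-extra
rules:
  - apiGroups: ["flow.volcano.sh"]
    resources: ["jobflows/status", "jobs/finalizers", "jobtemplates/status", "jobtemplates/finalizers"]
    verbs: ["update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: volcano-controllers-jobflow-extra
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: volcano-controllers-jobflow-extra
subjects:
  # ServiceAccount from "system:serviceaccount:volcano-system:volcano-controllers"
  - kind: ServiceAccount
    name: volcano-controllers
    namespace: volcano-system
```

Keeping the extra permissions in separate objects means they can be deleted cleanly once the cluster runs a release that includes the fix.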

@lowang-bh
Member

This has been fixed.
/close

@volcano-sh-bot
Contributor

@lowang-bh: Closing this issue.

