feat: dynamically change klog level through socket #3146

xiao-jay · 2023-09-29T13:40:05Z

Why not change the klog level by changing the ConfigMap?

Volcano is a commercial project that is deployed in customer clusters. There the ConfigMap cannot be modified easily, but curl can be executed. I therefore thought of using curl to pass in the klog level that you want to set, through a socket.

Implementation Introduction

The vc-scheduler pod runs an internal socket service that listens on the klog.sock file. Because the socket service inside the pod cannot be accessed directly from the k8s node, the klog.sock file needs to be mounted from the pod onto the node. The curl command then sends its request through the klog.sock file.

This machine is a Mac and uses minikube to simulate the k8s cluster. Enter minikube and execute curl against the /tmp/socks/klog.sock file through the socket:

 sudo curl --unix-socket /tmp/klog-socks/klog-klog.sock "http://localhost/setlevel?level=5&duration=60s"

The klog level has been set to 5. The following log line is printed only when the level is at least 5:
klog.V(5).Info("only klog v 5 can print this log ")
Without sudo, curl fails with: curl: (7) Couldn't connect to server

example image:xiaojie99999/vc-scheduler:1.1.7
@wangyang0616 @william-wang @Monokaix @lowang-bh

lowang-bh · 2023-10-07T03:11:58Z

pkg/util/socket.go

+		var loglevel klog.Level
+		mutex.Lock()
+		defer mutex.Unlock()
+		if err := loglevel.Set(startupLogLevel); err != nil {


How about recording an error log here?

Good suggestion, done.

lowang-bh · 2023-10-07T03:14:00Z

pkg/util/socket.go

+		// Therefore, put reset function in mutex range.
+		go reset(prevCtx, duration)
+		responseOk(&w, fmt.Sprintf("Change klog log level to %s successfully and  for %v\n", currentLogLevel, duration))
+		mutex.RUnlock()


While the read lock is held, the goroutine also requires a write lock. Is that what you expect?

I didn't understand what you meant. Can you provide a more detailed scenario?

I mean that you have taken a read lock and then start a goroutine in which a write lock is required.

The write lock is in func modifyLoglevel, where it is used to protect prevCtx.

func modifyLoglevel(newLogLevel string) error {
	mutex.Lock()
	defer mutex.Unlock()
	// Change klog log level to new value
	var loglevel klog.Level
	if err := loglevel.Set(newLogLevel); err != nil {
		return err
	}
	currentLogLevel = newLogLevel
	// Cancel the previous timer.
	if prevCtxCancelFunc != nil {
		prevCtxCancelFunc()
	}
	prevCtx, prevCtxCancelFunc = context.WithCancel(context.Background())
	return nil
}

There is also a write lock in the reset goroutine function. Consider the scenario where the previous timer has expired and the select case is about to acquire the write lock, but at the same time you are running modifyLoglevel and have already acquired the lock. modifyLoglevel now runs the cancel func of the previous context. What will happen?

You considered the scenario so carefully 👍
If modifyLoglevel gets the write lock first, modifyLoglevel sets klog and reset then sets klog afterwards, which is abnormal.
If reset gets the write lock first, it resets klog and modifyLoglevel then sets klog, which is normal.

This usage scenario is to temporarily increase the klog level during debugging, without considering concurrency, so there should be no need to handle the low-probability event you mentioned.
What do you think?
@lowang-bh

Signed-off-by: 晓杰 <[email protected]>

william-wang · 2023-10-11T12:40:19Z

pkg/util/socket.go

@@ -0,0 +1,210 @@
+package util


Please add the copyright header.

Signed-off-by: 晓杰 <[email protected]>

william-wang

/lgtm

volcano-sh-bot · 2023-10-13T06:40:20Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: william-wang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [william-wang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

hwdef · 2023-10-13T07:30:40Z

The docs are not merged, but the code is merged.
This is ridiculous.

Totally disagree with this PR code.

@william-wang @lowang-bh

Monokaix · 2023-10-13T07:47:42Z

> The docs are not merged, but the code is merged. This is ridiculous.
>
> Totally disagree with this PR code.
>
> @william-wang @lowang-bh

As described in the "Possible actual use scenarios" part of the doc, this PR changes the klog level in a hot-update way and does not need to restart Volcano.

hwdef · 2023-10-13T07:52:44Z

@Monokaix
Did you read what was in my screenshot?
Again, I'm against this approach.

xiao-jay · 2023-10-13T07:59:00Z

> @Monokaix Did you read what was in my screenshot? Again, I'm against this approach.

Sometimes we encounter errors when scheduling and need to temporarily increase the log level to see more error messages. A restart would make it more difficult to locate the problem, so we need a way to change the klog level dynamically.

Monokaix · 2023-10-13T08:01:22Z

> @Monokaix Did you read what was in my screenshot? Again, I'm against this approach.

There is some misunderstanding.
When we try to locate a problem in Volcano, the existing log level is usually not detailed enough to find it. However, after restarting with a higher log level, the problem context is lost, making it impossible to continue the investigation. It is therefore convenient to update the log level without restarting Volcano in order to identify the problem.

hwdef · 2023-10-13T08:14:28Z

There are many ways to do this, for example you can persist the log to a file. I don't think this pr is implemented in a good way. There is also something wrong with the code merge process.

But the code has been merged and I won't comment anymore.

volcano-sh-bot requested review from jiangkaihua and lowang-bh September 29, 2023 13:40

volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 29, 2023

xiao-jay force-pushed the dynamic-conf-socket branch 2 times, most recently from c654e7f to 58df5f7 Compare September 29, 2023 14:00

xiao-jay mentioned this pull request Sep 29, 2023

add dynamic conf design docs #3111

Merged

github-advanced-security bot found potential problems Sep 29, 2023

xiao-jay force-pushed the dynamic-conf-socket branch 3 times, most recently from 2fd5105 to 7a60952 Compare September 29, 2023 14:34

github-advanced-security bot found potential problems Sep 29, 2023

xiao-jay force-pushed the dynamic-conf-socket branch 3 times, most recently from fff5bd3 to 01c421c Compare September 29, 2023 16:10

lowang-bh reviewed Oct 7, 2023

xiao-jay force-pushed the dynamic-conf-socket branch 2 times, most recently from 4f18105 to 67c11c0 Compare October 7, 2023 15:20

socket dynamic conf

0436e14

Signed-off-by: 晓杰 <[email protected]>

xiao-jay force-pushed the dynamic-conf-socket branch from 67c11c0 to 0436e14 Compare October 8, 2023 13:34

william-wang reviewed Oct 11, 2023

add copy right

b4b499f

Signed-off-by: 晓杰 <[email protected]>

xiao-jay force-pushed the dynamic-conf-socket branch from 5896455 to b4b499f Compare October 12, 2023 15:04

william-wang approved these changes Oct 13, 2023

volcano-sh-bot assigned william-wang Oct 13, 2023

volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Oct 13, 2023

volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 13, 2023

volcano-sh-bot merged commit 1be5aa7 into volcano-sh:master Oct 13, 2023
13 checks passed

xiao-jay mentioned this pull request Oct 18, 2023

[cherry-pick for release-1.8]feat:socket dynamic conf #3159

Closed
