Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrap the manager process #1525

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Bootstrap the manager process #1525

wants to merge 6 commits into from

Conversation

marccampbell
Copy link
Member

No description provided.

Copy link

github-actions bot commented Nov 19, 2024

This PR has been released (on staging) and is available for download with a embedded-cluster-smoke-test-staging-app license ID.

Online Installer:

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci/appver-dev-32a67f3" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Airgap Installer (may take a few minutes before the airgap bundle is built):

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci-airgap/appver-dev-32a67f3?airgap=true" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Happy debugging!

cmd/manager/status.go Outdated Show resolved Hide resolved
}

func GetSocketPath() (string, error) {
tmpDir := os.TempDir()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt you use /var/run for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here justifying why its not in the run dir and explaining that this will be created in our data directory rather than /tmp

}

func GetSocketPath() (string, error) {
tmpDir := os.TempDir()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here justifying why its not in the run dir and explaining that this will be created in our data directory rather than /tmp

Comment on lines 18 to 19
if _, err := os.Stat(socketFile); err == nil {
if err := os.Remove(socketFile); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you combine this so it is atomic and use os.Remove and handle os.ErrNotExist?

"k8s.io/apimachinery/pkg/types"
)

func ConnectToKOTSWebSocket(ctx context.Context) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func ConnectToKOTSWebSocket(ctx context.Context) error {
func ConnectToKOTSWebSocket(ctx context.Context) {

logrus.Infof("Sending ping message '%s' to server", pingMsg)

if err := conn.WriteControl(gwebsocket.PingMessage, []byte(pingMsg), time.Now().Add(1*time.Second)); err != nil {
return errors.Wrap(err, "send ping message")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this really exit the loop? this is happening in a go routine. dont you want to just log and retry?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this error typically happens when the connection is closed. when this returns, the manager will try to reconnect to kots again.


func wsPingHandler(conn *gwebsocket.Conn) func(message string) error {
return func(message string) error {
logrus.Infof("Received ping message '%s' from server", message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to cause too much noise in the logs. My philosophy on logging is anything that is happening on a schedule should probably be more highly scrutinized when logging at higher levels. Could you move this to trace or debug level?


func wsPongHandler() func(message string) error {
return func(message string) error {
logrus.Infof("Received pong message '%s' from server", message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar about log level.

logrus.Infof("Received ping message '%s' from server", message)
logrus.Infof("Sending pong message '%s' from server", message)
if err := conn.WriteControl(gwebsocket.PongMessage, []byte(message), time.Now().Add(1*time.Second)); err != nil {
logrus.Errorf("Failed to send pong message to server: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why log and not return the error here? Im not saying it is correct. im just assuming the client will not receive the error so im wondering what is the desired behavior here.

Use: "status",
Short: fmt.Sprintf("Get the status of the %s cluster manager", name),
RunE: func(cmd *cobra.Command, args []string) error {
return getServerStatus(ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return getServerStatus(ctx)
return printServerStatus(ctx)

return cmd
}

func getServerStatus(ctx context.Context) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func getServerStatus(ctx context.Context) error {
func printServerStatus(ctx context.Context) error {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants