Show warning banner on user namespace FailedScheduling event #1211

Merged: 5 commits merged on Oct 15, 2024

@vinokurig vinokurig commented Oct 4, 2024

What does this PR do?

Show a notification banner on the workspace startup screen, informing the user that workspace startup will take longer because a new node is being provisioned.
Add a user namespace listener and catch the FailedScheduling event:

{
  "New item": {
    "eventPhase": "ADDED",
    "event": {
      "kind": "Event",
      "apiVersion": "v1",
      "metadata": {
        "name": "workspace58a0cc9c0eb94bc1-58fd9c7f85-2fbb5.17fae4587f8d7c9c",
        "namespace": "che-kube-admin-che-374upj",
        "uid": "070256c7-eb1e-48bc-a796-5fdc419c34a8",
        "resourceVersion": "85282",
        "creationTimestamp": "2024-10-03T08:34:35Z",
        "managedFields": [
          {
            "manager": "kube-scheduler",
            "operation": "Update",
            "apiVersion": "events.k8s.io/v1",
            "time": "2024-10-03T08:34:35Z",
            "fieldsType": "FieldsV1",
            "fieldsV1": {
              "f:action": {},
              "f:eventTime": {},
              "f:note": {},
              "f:reason": {},
              "f:regarding": {},
              "f:reportingController": {},
              "f:reportingInstance": {},
              "f:type": {}
            }
          }
        ]
      },
      "involvedObject": {
        "kind": "Pod",
        "namespace": "che-kube-admin-che-374upj",
        "name": "workspace58a0cc9c0eb94bc1-58fd9c7f85-2fbb5",
        "uid": "a666e7c2-33dd-48f3-a6a9-fd7beeeace25",
        "apiVersion": "v1",
        "resourceVersion": "85280"
      },
      "reason": "FailedScheduling",
      "message": "0/6 nodes are available: 1 Insufficient memory, 2 node(s) were unschedulable, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.",
      "source": {},
      "firstTimestamp": null,
      "lastTimestamp": null,
      "type": "Warning",
      "eventTime": "2024-10-03T08:34:35.798237Z",
      "action": "Scheduling",
      "reportingComponent": "default-scheduler",
      "reportingInstance": "default-scheduler-ip-10-0-104-158"
    }
  }
}
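
As a rough illustration of how a payload like the one above could be recognized, here is a minimal sketch. The `KubeEvent` interface is a hand-written subset of the event fields shown in the JSON, and the helper names `isFailedSchedulingEvent` and `hasFailedScheduling` are hypothetical; they are not taken from the dashboard code.

```typescript
// Hand-written subset of the event fields shown in the payload above.
interface KubeEvent {
  reason?: string;
  type?: string;
  involvedObject: { kind?: string; name?: string; namespace?: string };
}

// Hypothetical helper: true when the event reports that a Pod could not be scheduled.
function isFailedSchedulingEvent(event: KubeEvent): boolean {
  return event.reason === 'FailedScheduling' && event.involvedObject.kind === 'Pod';
}

// Hypothetical helper: does any event concern a pod whose name starts with the given prefix?
function hasFailedScheduling(events: KubeEvent[], podNamePrefix: string): boolean {
  return events.some(
    e => isFailedSchedulingEvent(e) && (e.involvedObject.name ?? '').startsWith(podNamePrefix),
  );
}

// Values copied from the example payload.
const sample: KubeEvent = {
  reason: 'FailedScheduling',
  type: 'Warning',
  involvedObject: {
    kind: 'Pod',
    name: 'workspace58a0cc9c0eb94bc1-58fd9c7f85-2fbb5',
    namespace: 'che-kube-admin-che-374upj',
  },
};

console.log(hasFailedScheduling([sample], 'workspace58a0cc9c0eb94bc1')); // prints true
```

Matching on `reason` alone would also work; the extra `kind === 'Pod'` check is only a guard against unrelated objects and is an assumption of this sketch.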

Screenshot/screencast of this PR

What issues does this PR fix or reference?

fixes eclipse-che/che#22598

Is it tested? How?

  1. Configure DWOC to ignore the FailedScheduling error during workspace startup:

apiVersion: controller.devfile.io/v1alpha1
config:
  workspace:
    ignoredUnrecoverableEvents:
      - FailedScheduling
    progressTimeout: 600s

  2. Decrease the number of worker nodes by setting replicas to 0 in every MachineSet YAML:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: ****
spec:
  replicas: 0

  3. Start a workspace and observe the warning banner:
    screenshot-eclipse-che_apps_ci-ln-286d3lk-76ef8_aws-2_ci_openshift_org-2024_10_07-22_08_00

  4. Increase the number of worker nodes back and wait until the workspace starts.

Release Notes

Show a notification banner on the workspace startup screen, informing the user that workspace startup will take longer because a new node is being provisioned.

Docs PR

github-actions bot commented Oct 4, 2024

Docker image build succeeded: quay.io/eclipse/che-dashboard:pr-1211

kubectl patch command
kubectl patch -n eclipse-che "checluster/eclipse-che" --type=json -p='[{"op": "replace", "path": "/spec/components/dashboard/deployment", "value": {"containers": [{"image": "quay.io/eclipse/che-dashboard:pr-1211", "name": "che-dashboard"}]}}]'
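
For reference, the patch document passed to `-p` is a JSON Patch array, so quoting mistakes are easy to make by hand. A small sketch of generating it programmatically is shown below; the `JsonPatchOp` type and the `dashboardImagePatch` function are hypothetical names used only for this illustration.

```typescript
// One JSON Patch operation (only the fields needed here).
interface JsonPatchOp {
  op: 'add' | 'remove' | 'replace';
  path: string;
  value?: unknown;
}

// Hypothetical helper: builds the patch that swaps the dashboard container image.
function dashboardImagePatch(image: string): JsonPatchOp[] {
  return [
    {
      op: 'replace',
      path: '/spec/components/dashboard/deployment',
      value: { containers: [{ image, name: 'che-dashboard' }] },
    },
  ];
}

// JSON.stringify guarantees valid JSON; single quotes keep the shell from splitting it.
const patch = JSON.stringify(dashboardImagePatch('quay.io/eclipse/che-dashboard:pr-1211'));
console.log(`kubectl patch -n eclipse-che checluster/eclipse-che --type=json -p='${patch}'`);
```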

codecov bot commented Oct 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.87%. Comparing base (449ee7c) to head (1434618).
Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1211      +/-   ##
==========================================
+ Coverage   89.80%   89.87%   +0.07%     
==========================================
  Files         443      444       +1     
  Lines       45793    45905     +112     
  Branches     3051     3069      +18     
==========================================
+ Hits        41126    41259     +133     
+ Misses       4631     4608      -23     
- Partials       36       38       +2     


@vinokurig vinokurig force-pushed the che-22598 branch 2 times, most recently from 27c042a to e75022a Compare October 7, 2024 08:36
  }

  public async componentDidMount() {
    const devWorkspaceListener: ChannelListener = message => {
Contributor

Hi @vinokurig,
Can you use selectors for events and devWorkspaces instead of adding these WS listeners?
I mean selectAllEvents and selectAllDevWorkspaces

Contributor Author

done, @akurinnoy please take a look
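
The selector-based approach discussed in this thread could look roughly like the sketch below. Only the names `selectAllEvents` and `selectAllDevWorkspaces` come from the review comment; the state shape, the simplified types, and `selectWorkspacesPendingScheduling` are assumptions made for illustration, as is the idea that a workspace pod name starts with the workspace ID.

```typescript
// Hypothetical, simplified types; the real dashboard store is richer.
interface KubeEvent {
  reason?: string;
  involvedObject: { kind?: string; name?: string };
}
interface DevWorkspace {
  metadata: { name: string };
  status?: { phase?: string; devworkspaceId?: string };
}
interface AppState {
  events: { events: KubeEvent[] };
  devWorkspaces: { workspaces: DevWorkspace[] };
}

// Names taken from the review comment; bodies are assumptions.
const selectAllEvents = (state: AppState): KubeEvent[] => state.events.events;
const selectAllDevWorkspaces = (state: AppState): DevWorkspace[] =>
  state.devWorkspaces.workspaces;

// Derived selector: names of starting workspaces whose pod has a FailedScheduling event.
function selectWorkspacesPendingScheduling(state: AppState): string[] {
  const failedPods = selectAllEvents(state)
    .filter(e => e.reason === 'FailedScheduling' && e.involvedObject.kind === 'Pod')
    .map(e => e.involvedObject.name ?? '');
  return selectAllDevWorkspaces(state)
    .filter(w => w.status?.phase === 'Starting')
    .filter(w => {
      const id = w.status?.devworkspaceId;
      // Assumption: the pod name is prefixed with the workspace ID.
      return id !== undefined && failedPods.some(pod => pod.startsWith(id));
    })
    .map(w => w.metadata.name);
}
```

Deriving the banner state from the store this way avoids registering extra WebSocket listeners in the component, which is the concern raised in the comment.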


const websocketClient = container.get(WebsocketClient);
const text =
  'Cluster autoscaler is provisioning a new node at the moment. Please be patient, workspace startup will be taking longer than usual.';
Member

Since we rely only on the "FailedScheduling" event, we should be more careful with the wording, because the event can occur for different reasons.


Member

'FailedScheduling' event occurred. If cluster autoscaler is enabled it might be provisioning a new node now and workspace startup will take longer than usual. Check the 'Events' tab to get more details.

Contributor Author

done, changed the screenshot image.

Contributor Author

Did not add the "Check the 'Events' tab to get more details." sentence, as the banner also appears on the main dashboard page:
screenshot-eclipse-che_apps_ci-ln-286d3lk-76ef8_aws-2_ci_openshift_org-2024_10_07-22_04_38
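
To make the outcome of this thread concrete, here is a minimal sketch of how the hedged wording might be selected. The constant and function names are hypothetical, and the text is the wording proposed in the review without the 'Events' tab sentence; the exact string that was merged is not shown in this conversation.

```typescript
// Only the field needed for this check.
interface KubeEvent {
  reason?: string;
}

// Wording proposed in the review, minus the trailing 'Events' tab hint (assumed final text).
const FAILED_SCHEDULING_WARNING =
  "'FailedScheduling' event occurred. If cluster autoscaler is enabled it might be " +
  'provisioning a new node now and workspace startup will take longer than usual.';

// Hypothetical helper: returns the banner text, or undefined when no banner is needed.
function getBannerText(events: KubeEvent[]): string | undefined {
  return events.some(e => e.reason === 'FailedScheduling')
    ? FAILED_SCHEDULING_WARNING
    : undefined;
}
```

Because the message does not mention the 'Events' tab, the same text can be shown both on the workspace startup screen and on the main dashboard page.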

@openshift-ci openshift-ci bot removed the lgtm label Oct 9, 2024
Contributor

@olexii4 olexii4 left a comment

LGTM

openshift-ci bot commented Oct 14, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: akurinnoy, ibuziuk, olexii4, vinokurig

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@artaleks9
Contributor

Verified on Eclipse Che with the quay.io/eclipse/che-dashboard:pr-1211 image.
The functionality works properly.

@vinokurig vinokurig merged commit 50072a8 into main Oct 15, 2024
20 checks passed
@vinokurig vinokurig deleted the che-22598 branch October 15, 2024 09:33
@devstudio-release
Build 3.18 :: dashboard_3.x/569: Console, Changes, Git Data

olexii4 pushed a commit that referenced this pull request Oct 28, 2024
Show the notification banner on the workspace startup screen, informing user that workspace startup would take longer due to a new node being provisioned.
Successfully merging this pull request may close these issues.

As a developer, I want to be notified when autoscaler kicks in during the workspaces startup