critical alerts ui docs update (#757)
Co-authored-by: Ruben Fiszel <[email protected]>
alpetric and rubenfiszel authored Nov 22, 2024
1 parent 86d789f commit 567c4b3
Showing 10 changed files with 45 additions and 20 deletions.
10 changes: 5 additions & 5 deletions changelog/2024-09-01-critical-alert-channels/index.md
@@ -1,14 +1,14 @@
---
slug: critical-alert-channels
slug: critical-alerts
version: v1.389.0
title: Critical alert channels
title: Critical Alerts
tags: ['Workers', 'Security', 'Enterprise Edition']
description: Get a notification every time a job is re-run after a crash.
description: Get a notification on critical events, such as when a job is re-run after a crash.
features:
[
'Get email or Slack notifications every time a job is re-run after a crash.',
'Get email or Slack notifications for critical events, such as when a job is re-run after a crash.',
'You can set an alert to receive notification via Email or Slack when the number of running workers in a group falls below a given number.'
]
image: ./critical_alert_slack.png
docs: /docs/core_concepts/critical_alert_channels
docs: /docs/core_concepts/critical_alerts
---
8 changes: 5 additions & 3 deletions docs/advanced/18_instance_settings/index.mdx
@@ -116,10 +116,12 @@ This setting is only available on [Enterprise Edition](/pricing).

### Critical alert channels

Channels to send [critical alerts](../../core_concepts/37_critical_alert_channels/index.mdx) to. [SMTP](#smtp) must be configured for the email channel. A Slack workspace must be connected to the instance for the Slack channel.
Channels to send [critical alerts](../../core_concepts/37_critical_alerts/index.mdx) to. [SMTP](#smtp) must be configured for the email channel. A Slack workspace must be connected to the instance for the Slack channel.

You can add multiple channels between Email and Slack.

Furthermore, you can choose to mute [critical alerts in the UI](../../core_concepts/37_critical_alerts/index.mdx) using the "Mute critical alerts in UI" toggle.

This setting is only available on [Enterprise Edition](/pricing).

### Azure OpenAI base path
@@ -211,7 +213,7 @@ Add private scoped registries for Bun, See: https://bun.sh/docs/install/registri

## SMTP

Setting SMTP unlocks [sending emails upon adding new users](../../core_concepts/15_authentification/index.mdx) to the workspace or the instance and [sending critical alerts](#critical-alert-channels).
Setting SMTP unlocks [sending emails upon adding new users](../../core_concepts/15_authentification/index.mdx) to the workspace or the instance and [sending critical alerts](#critical-alerts).

![SMTP](./smtp.png "SMTP")

@@ -310,7 +312,7 @@ When creating a workspace, you have the option to invite automatically everyone

## Slack

Connecting your instance to a Slack workspace enables [critical alerts](#critical-alert-channels) to be sent to a Slack channel.
Connecting your instance to a Slack workspace enables [critical alerts](#critical-alerts) to be sent to a Slack channel.

![Slack](./slack.png "Slack")

@@ -1,13 +1,11 @@
import DocCard from '@site/src/components/DocCard';

# Critical alert channels
# Critical Alerts

Get a notification every time a job is re-run after a crash.
Get a notification on critical events.

This feature is available in the [Enterprise Edition](/pricing).

If the node on which a job runs halts suddenly (such as from a power loss), the [job](../20_jobs/index.mdx) is restarted automatically. Windmill itself doesn't crash, and softer interruptions such as a pod termination involve a grace period (300s) to let the job finish.

Critical alerts are generated under the following conditions:

- [Job](../20_jobs/index.mdx) is re-run after a crash.
@@ -16,14 +14,18 @@ Critical alerts are generated under the following conditions:
- Number of running workers in a group falls below a specified threshold (has to be configured in the [worker group](../9_worker_groups/index.mdx) config).
- Number of [jobs waiting in queue](../9_worker_groups/index.mdx#queue-metric-alerts) is above a threshold for more than a specified amount of time.

**Re-run after crash**: If the node on which a job runs halts suddenly (such as from a power loss), the [job](../20_jobs/index.mdx) is restarted automatically. Windmill itself doesn't crash, and softer interruptions such as a pod termination involve a grace period (300s) to let the job finish.

You just need to [configure SMTP](../../advanced/1_self_host/index.mdx#set-up-smtp-from-the-ui) and set up a critical alert channel (i.e. an email address) in the [instance settings](../../advanced/18_instance_settings/index.mdx#smtp), or connect your instance to Slack and fill in a channel name.

![Critical alert channels Config](./critical_alert_channels_config.png "Critical alert channels Config")

You can also set an alert to receive notification when the number of running workers in a group falls below a given number. It's available in the [worker group config](../9_worker_groups/index.mdx#alerts).

![Workers alerts Slack](../9_worker_groups/critical_alert_slack.png 'Workers alerts Slack')

Critical alert notifications are also surfaced in the Windmill UI, where they can be viewed and acknowledged. You can mute the critical alert UI in the instance settings.

![Critical Alert UI](./critical_alert_ui.png 'Critical Alert UI')

<div className="grid grid-cols-2 gap-6 mb-4">
<DocCard
title="Workers Alerts"
23 changes: 22 additions & 1 deletion docs/core_concepts/9_worker_groups/index.mdx
@@ -215,6 +215,27 @@ Last, if the tags contain `$args[argName]` (e.g: `foo-$args[foobar])` then the t
See [Deploy to staging prod](../12_staging_prod/index.md) to see a full UI flow to deploy to staging and prod.
## Alerts
You can set an alert to receive notification via Email or Slack when the number of running workers in a group falls below a given number. It's available in the worker group config.
![Workers alerts Slack](./critical_alert_slack.png 'Workers alerts Slack')
Enable 'Send an alert when the number of alive workers falls below a given threshold', and enter the number of workers below which the notification will be sent.
You need to configure [Critical alert channels](../37_critical_alerts/index.mdx) to receive notifications.
![Workers alerts](./workers_alerts.png 'Workers alerts')
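
As a rough illustration of the rule described above (alert when the number of alive workers in a group falls below its configured minimum), here is a minimal Python sketch. It is not Windmill's implementation: the function name `groups_needing_alert` and both dictionaries are hypothetical and exist only to show the comparison.

```python
from typing import Dict, List


def groups_needing_alert(
    alive_workers: Dict[str, int],
    min_workers: Dict[str, int],
) -> List[str]:
    """Return the worker groups whose alive-worker count is below their threshold.

    Hypothetical helper: `alive_workers` maps a group name to its current number
    of alive workers, `min_workers` maps a group name to the configured minimum.
    A group with a threshold but no reported workers is treated as having 0.
    """
    return sorted(
        group
        for group, minimum in min_workers.items()
        if alive_workers.get(group, 0) < minimum
    )


# Example: "gpu" has no alive workers and "native" is not reporting at all.
alerts = groups_needing_alert(
    alive_workers={"default": 3, "gpu": 0},
    min_workers={"default": 2, "gpu": 1, "native": 1},
)
```

Groups without a configured threshold never trigger an alert in this sketch, which mirrors the fact that the alert has to be enabled per worker group.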
<div className="grid grid-cols-2 gap-6 mb-4">
<DocCard
title="Critical Alerts"
description="Get a notification for critical events, such as when a job is re-run after a crash."
href="/docs/core_concepts/critical_alerts"
/>
</div>
## Create worker group config
<img
@@ -380,7 +401,7 @@ Only tags for jobs that have been delayed by more than 3 seconds in the last 14
### Queue metric alerts
[Enterprise Edition](/pricing) users can set up [Critical Alerts](../37_critical_alert_channels/index.mdx) on the [Queue Metrics page](#queue-metrics), and be notified when the number of delayed jobs in a queue is above a certain threshold for more than a configured amount of time. The "cooldown" parameter determines the minimum duration between two consecutive alerts if the number of waiting jobs is fluctuating around the configured threshold.
[Enterprise Edition](/pricing) users can set up [Critical Alerts](../37_critical_alerts/index.mdx) on the [Queue Metrics page](#queue-metrics), and be notified when the number of delayed jobs in a queue is above a certain threshold for more than a configured amount of time. The "cooldown" parameter determines the minimum duration between two consecutive alerts if the number of waiting jobs is fluctuating around the configured threshold.
![Queue Metrics](./queue_metrics_alert.png 'Queue Metrics Alert')
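
The interaction between the threshold, the duration, and the cooldown described above can be sketched as follows. This is an illustrative model only, not Windmill's actual code: the class `QueueAlert` and its fields `threshold`, `min_duration_s`, and `cooldown_s` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueAlert:
    """Hypothetical model of a queue metric alert (assumed semantics)."""

    threshold: int         # alert when more than this many jobs are waiting...
    min_duration_s: float  # ...for longer than this many seconds
    cooldown_s: float      # minimum gap between two consecutive alerts
    _above_since: Optional[float] = None
    _last_alert: Optional[float] = None

    def observe(self, now: float, waiting_jobs: int) -> bool:
        """Record one sample and return True if an alert should be sent."""
        if waiting_jobs <= self.threshold:
            # Queue is back under the threshold: reset the duration timer.
            self._above_since = None
            return False
        if self._above_since is None:
            self._above_since = now
        if now - self._above_since <= self.min_duration_s:
            # Above the threshold, but not for long enough yet.
            return False
        if self._last_alert is not None and now - self._last_alert < self.cooldown_s:
            # Still in the cooldown window of the previous alert.
            return False
        self._last_alert = now
        return True
```

With `QueueAlert(threshold=10, min_duration_s=30, cooldown_s=60)`, a queue that hovers around 10 waiting jobs restarts the duration timer each time it dips to 10 or fewer, and the cooldown then keeps a second alert from firing until at least 60 seconds after the first.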
4 changes: 2 additions & 2 deletions docs/core_concepts/index.mdx
@@ -565,9 +565,9 @@ All details & features on [Pricing page](/pricing).
href="/docs/core_concepts/agent_workers"
/>
<DocCard
title="Critical alert channels"
title="Critical Alerts"
description="Get a notification every time a job is re-run after a crash."
href="/docs/core_concepts/critical_alert_channels"
href="/docs/core_concepts/critical_alerts"
/>
<DocCard
title="Content search"
2 changes: 1 addition & 1 deletion docs/integrations/smtp.md
@@ -2,7 +2,7 @@

[SMTP](https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol) (Simple Mail Transfer Protocol) is an internet standard for electronic mail transmission.

Note that SMTP can be configured at the instance level to auto-invite users or send [critical alerts](../core_concepts/37_critical_alert_channels/index.mdx). See [Set up SMTP](../advanced/1_self_host/index.mdx#set-up-smtp-from-the-ui).
Note that SMTP can be configured at the instance level to auto-invite users or send [critical alerts](../core_concepts/37_critical_alerts/index.mdx). See [Set up SMTP](../advanced/1_self_host/index.mdx#set-up-smtp-from-the-ui).

To add an SMTP [resource](../core_concepts/3_resources_and_types/index.mdx) to Windmill, you need to save the following elements:

2 changes: 1 addition & 1 deletion sidebars.js
@@ -861,7 +861,7 @@ const sidebars = {
'advanced/external_auth_with_jwt/index',
'core_concepts/dedicated_workers/index',
'core_concepts/agent_workers/index',
'core_concepts/critical_alert_channels/index',
'core_concepts/critical_alerts/index',
'core_concepts/content_search/index',
'core_concepts/codebases_and_bundles/index',
'apps/css_editor',
2 changes: 1 addition & 1 deletion src/components/Pricing.js
@@ -807,7 +807,7 @@ const sections = [
'tier-free': false,
'tier-team': false
},
link: '/docs/core_concepts/critical_alert_channels'
link: '/docs/core_concepts/critical_alerts'
},
{
name: 'Script-specific workers (dedicated) / High throughput',