[ML] Anomaly Explorer: Display markers for scheduled events in distribution type anomaly charts #192377

Contributor

@rbrtj rbrtj commented Sep 9, 2024

Summary

Fix for #129304

Previously, for distribution type charts, we displayed calendar event markers only for anomalous data points. The changes improve the display of event markers for such chart types, including showing calendar event markers even when there is no underlying data point.

| Scenario | Before | After |
| :---: | :---: | :---: |
| Rare chart | image | image |
| Population chart | Screenshot 2024-09-9 at 16 16 01 | image |
| Single metric chart (no difference) | image | image |
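
As a rough illustration of the behaviour described in the summary, the sketch below adds a marker-only point for each bucket that overlaps a scheduled event but has no data point of its own. It is not the code from this PR: `addEventMarkers`, `ScheduledEvent`, the simplified `ChartPoint` shape, and the bucket arguments are all assumptions made for the example.

```typescript
// Hypothetical, simplified shapes. The real ChartPoint interface has more fields.
interface ScheduledEvent {
  description: string;
  start: number; // epoch ms
  end: number; // epoch ms
}

interface ChartPoint {
  date: number; // epoch ms, start of the bucket
  value?: number;
  entity?: string;
  scheduledEvents?: string[];
}

// Returns copies of the original points plus one marker-only point for every
// bucket that overlaps a scheduled event but contains no data point.
function addEventMarkers(
  points: ChartPoint[],
  events: ScheduledEvent[],
  bucketStarts: number[],
  bucketSpanMs: number
): ChartPoint[] {
  const result: ChartPoint[] = points.map((p) => ({ ...p }));
  for (const bucketStart of bucketStarts) {
    const bucketEnd = bucketStart + bucketSpanMs;
    // An event overlaps the bucket if it starts before the bucket ends
    // and ends after the bucket starts.
    const descriptions = events
      .filter((e) => e.start < bucketEnd && e.end > bucketStart)
      .map((e) => e.description);
    if (descriptions.length === 0) continue;

    const inBucket = result.filter((p) => p.date === bucketStart);
    if (inBucket.length > 0) {
      // Existing data points simply get the event descriptions attached.
      for (const p of inBucket) p.scheduledEvents = descriptions;
    } else {
      // No data in this bucket: add a point that carries only the marker.
      result.push({ date: bucketStart, scheduledEvents: descriptions });
    }
  }
  return result;
}
```

With one data point in the first bucket and an event covering only the second bucket, the function returns the untouched data point plus a value-less marker point for the second bucket.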

@rbrtj rbrtj added the release_note:enhancement, :ml, Feature:Anomaly Detection, Team:ML, and v8.16.0 labels Sep 9, 2024
@rbrtj rbrtj self-assigned this Sep 9, 2024
@rbrtj rbrtj requested a review from a team as a code owner September 9, 2024 14:22
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@rbrtj
Contributor Author

rbrtj commented Sep 9, 2024

It is a small PR, but the changes may have some implications, so I've tagged 3 people.

@@ -105,6 +105,7 @@ export interface ChartPoint {
   byFieldName?: string;
   numberOfCauses?: number;
   scheduledEvents?: any[];
+  isFakeDataPoint?: boolean;
Contributor

Nit: Maybe we can think of another name for this flag, since this isn't really about fake or dummy data but just about the scheduled event marker. Something like isScheduledEventMarkerWithoutData but less wordy 😅. Maybe isScheduledEventOnly/isScheduleEventMarker?

Contributor Author

This property isn't really needed so I got rid of it 👍

@walterra
Contributor

For rare jobs with just a few entities we show labels, and with the update in this PR the scheduled events end up labeled undefined at the bottom of the chart. To fix this, you will probably need to extend the marker data with the field name/value of the entity. If you add that, I wonder whether the custom sorting will still be necessary?

image

@walterra
Contributor

The above is a bit tricky to reproduce. I wasn't able to come up with a job that produced these results with the unmodified Kibana sample data, so I tweaked the logs sample dataset to get them. Here's what you need to do in Kibana Dev Console:

# This returns docs that are part of rare log messages in the sample data set
GET kibana_sample_data_logs/_search?q=cgi-bin

# Then grab one of the returned docs and index it again, but with a different response code.
# Make sure to grab your own doc from the result above, because the dataset generates dynamic timestamps.
PUT kibana_sample_data_logs/_doc/asdf406?op_type=create
{
          "agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24",
          "bytes": 1831,
          "clientip": "30.156.16.164",
          "extension": "",
          "geo": {
            "srcdest": "US:IN",
            "src": "US",
            "dest": "IN",
            "coordinates": {
              "lat": 55.53741389,
              "lon": -132.3975144
            }
          },
          "host": "elastic-elastic-elastic.org",
          "index": "kibana_sample_data_logs",
          "ip": "30.156.16.163",
          "machine": {
            "ram": 9663676416,
            "os": "win xp"
          },
          "memory": 73240,
          "message": "30.156.16.163 - - [2018-09-01T12:44:34.756Z] \"GET /cgi-bin/mj_wwwusr HTTP/1.1\" 404 1831 \"-\" \"Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24\"",
          "phpmemory": 73240,
          "referer": "http://www.elastic-elastic-elastic.com/success/timothy-l-kopra",
          "request": "/cgi-bin/mj_wwwusr",
          "response": 406,
          "tags": [
            "success",
            "info"
          ],
          "@timestamp": "2024-08-10T12:44:34.756Z",
          "url": "https://elastic-elastic-elastic.org/cgi-bin/mj_wwwusr",
          "utc_time": "2024-08-10T12:44:34.756Z",
          "event": {
            "dataset": "sample_web_logs"
          },
          "bytes_gauge": 1831,
          "bytes_counter": 79545760
        }

To identify the time to use for the scheduled event, go to Log Rate Analysis and then create a scheduled event some days before the spike:

image

Again, you will have different timestamps here because of the dynamically generated dataset.

Once the calendar with the scheduled event is set up, you can create a rare job on field response.keyword to reproduce.
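
For anyone who prefers to script the last two steps instead of using the UI, here is a minimal sketch of the request bodies, assuming the Elasticsearch ML calendar events API (`POST _ml/calendars/<calendar_id>/events`) and the create anomaly detection job API (`PUT _ml/anomaly_detectors/<job_id>`). The helper names, the event description, and the parameters (days before the spike, event duration, bucket span) are hypothetical and need adjusting to your own dataset.

```typescript
// Hypothetical helpers that only build request bodies; sending them is left out.
interface CalendarEvent {
  description: string;
  start_time: number; // epoch ms
  end_time: number; // epoch ms
}

const HOUR_MS = 60 * 60 * 1000;

// Body for POST _ml/calendars/<calendar_id>/events.
// Places the event `daysBefore` days ahead of the spike seen in the data.
function buildCalendarEvents(
  spikeTime: Date,
  daysBefore: number,
  durationHours: number
): { events: CalendarEvent[] } {
  const start = spikeTime.getTime() - daysBefore * 24 * HOUR_MS;
  return {
    events: [
      {
        description: "scheduled event for repro", // arbitrary label
        start_time: start,
        end_time: start + durationHours * HOUR_MS,
      },
    ],
  };
}

// Body for PUT _ml/anomaly_detectors/<job_id>: a single rare detector
// split by the given field, e.g. "response.keyword".
function buildRareJob(byFieldName: string, bucketSpan: string) {
  return {
    analysis_config: {
      bucket_span: bucketSpan,
      detectors: [{ function: "rare", by_field_name: byFieldName }],
      influencers: [byFieldName],
    },
    data_description: { time_field: "@timestamp" },
  };
}

// Example values only; your timestamps will differ because the sample
// dataset is generated relative to the install date.
const eventsBody = buildCalendarEvents(new Date("2024-08-10T12:44:34.756Z"), 3, 6);
const jobBody = buildRareJob("response.keyword", "3h");
```

The job still has to be attached to the calendar (for example via `PUT _ml/calendars/<calendar_id>/jobs/<job_id>`) and given a datafeed before the marker shows up in the Anomaly Explorer.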

@rbrtj
Contributor Author

rbrtj commented Sep 12, 2024

> For rare jobs with just a few entities we show labels, and with the update in this PR the scheduled events end up labeled undefined at the bottom of the chart. To fix this, you will probably need to extend the marker data with the field name/value of the entity. If you add that, I wonder whether the custom sorting will still be necessary?
>
> image

I've added a constant entity value for the marker data and hidden the Y-axis label for markers, as the tooltip already displays the event details.
Custom sorting is still needed though: by default we sort by data values length, and the event markers should be displayed at the bottom, IMO.
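
To make the idea concrete, here is a small sketch of that ordering, not the actual code from this PR. It assumes "data values length" means the number of data points per entity, sorts descending, and treats the last position as the bottom of the axis. `EVENT_MARKER_ENTITY`, `orderEntities`, and `tickLabel` are hypothetical names.

```typescript
// Hypothetical constant: the real name and value used in the PR are not shown in this thread.
const EVENT_MARKER_ENTITY = "scheduled_events_marker";

interface EntityPoint {
  entity: string;
  date: number; // epoch ms
  value?: number;
}

// Orders the Y-axis entities by number of data points (descending) and
// always places the marker entity last, regardless of its point count.
function orderEntities(points: EntityPoint[]): string[] {
  const counts = new Map<string, number>();
  for (const p of points) {
    counts.set(p.entity, (counts.get(p.entity) ?? 0) + 1);
  }
  return Array.from(counts.keys()).sort((a, b) => {
    if (a === EVENT_MARKER_ENTITY) return 1;
    if (b === EVENT_MARKER_ENTITY) return -1;
    return (counts.get(b) ?? 0) - (counts.get(a) ?? 0);
  });
}

// Hides the tick label for the marker row, since the tooltip already
// carries the event details.
function tickLabel(entity: string): string {
  return entity === EVENT_MARKER_ENTITY ? "" : entity;
}
```

Without the two early returns in the comparator, a marker row with many events would sort above entities with fewer data points, which is why the default sort alone is not enough.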

@@ -590,7 +610,11 @@ export class ExplorerChartDistribution extends React.Component {
 });
 }
 }
-} else if (chartType !== CHART_TYPE.EVENT_DISTRIBUTION) {
+} else if (
+chartType !== CHART_TYPE.EVENT_DISTRIBUTION &&
Contributor

I am still seeing undefined appear in the axis labels for a rare job:

Screenshot 2024-09-13 at 11 49 45

You can reproduce with the filebeat_ecs data, using the 'rare' job wizard to create a job with this config:

    "bucket_span": "3h",
    "detectors": [
      {
        "detector_description": "rare by \"http.request.method\"",
        "function": "rare",
        "by_field_name": "http.request.method",
        "detector_index": 0
      }
    ],
    "influencers": [
      "http.request.method"
    ],

Contributor Author

@rbrtj rbrtj Sep 16, 2024

Hmm, I can't seem to reproduce the undefined label with the same dataset and job config:
image

Contributor

Retested, @rbrtj, and I can no longer reproduce the undefined label!

Contributor

@peteharverson peteharverson left a comment

Tested latest changes and LGTM

Contributor

@walterra walterra left a comment

LGTM, great work!

@kibana-ci
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #1 / Responder offline callout should be visible when agent type is sentinel_one and host is offline

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| ml | 2036 | 2037 | +1 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| ml | 4.6MB | 4.6MB | +235.0B |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @rbrtj

@rbrtj rbrtj merged commit d5b1fdf into elastic:main Sep 17, 2024
20 checks passed
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Sep 17, 2024
…bution type anomaly charts (elastic#192377)

## Summary

Fix for [elastic#129304](elastic#129304)

Previously, for distribution type charts, we displayed calendar event
markers only for anomalous data points. The changes improve the display
of event markers for such chart types, including showing calendar event
markers even when there is no underlying data point.

| Scenario | Before | After |
| :---: | :---: | :---: |
| Rare chart | ![image](https://github.com/user-attachments/assets/c3e186c0-0ec8-434f-a845-3f9e703431dd) | ![image](https://github.com/user-attachments/assets/3dd51cd1-6972-4343-bbc8-8e5f38d7c6bd) |
| Population chart | ![Screenshot 2024-09-9 at 16 16 01](https://github.com/user-attachments/assets/df22dc40-3c8b-46fe-9a5a-02a41278245c) | ![image](https://github.com/user-attachments/assets/c198e75e-14c8-4194-9d71-2358d25f21d5) |
| Single metric chart (no difference) | ![image](https://github.com/user-attachments/assets/d0546ba0-46b1-4d2e-9976-fe49bcd4d2da) | ![image](https://github.com/user-attachments/assets/c11ec696-b1f4-4ddf-9542-037b8dd2d31f) |

(cherry picked from commit d5b1fdf)
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Sep 17, 2024
…distribution type anomaly charts (#192377) (#193118)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[ML] Anomaly Explorer: Display markers for scheduled events in
distribution type anomaly charts
(#192377)](#192377)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

Co-authored-by: Robert Jaszczurek <[email protected]>