Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Detections] Add basic error handling to logging methods of Rule Execution Log #120157

Conversation

banderror
Copy link
Contributor

@banderror banderror commented Dec 1, 2021

Ticket: #119595

Summary

With this PR, calling IRuleExecutionLog.logStatusChange() is now safe for callers (rule executors or route handlers). It doesn't throw and writes any exceptions being caught to Kibana logs.

Example error in the logs:

[2021-12-01T20:05:36.620+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:05:37.591+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:91:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22)
[2021-12-01T20:05:37.591+01:00][DEBUG][plugins.securitySolution] ...

Additional log metadata

For debugging purposes, rule execution log client writes additional log metadata with any caught exception. It's not visible in Kibana console logs by default, but can be enabled in Kibana config.

I used the following config to test it:

logging:
  appenders:
    custom_console:
      type: console
      layout:
        type: pattern
        highlight: true
        pattern: "[%date][%level][%logger] %message %meta"
  root:
    appenders: [custom_console]
    level: warn
  loggers:
    - name: config
      level: info
    - name: plugins
      level: warn
    - name: plugins.securitySolution
      level: debug
    - name: plugins.ruleRegistry
      level: info
    - name: plugins.alerting
      level: info
    - name: plugins.taskManager
      level: info
    - name: savedobjects-service
      level: info
    - name: elasticsearch-service
      level: off
    - name: elasticsearch
      level: off
    - name: elasticsearch.query
      level: off
    - name: server
      level: fatal
    - name: optimize
      level: info
    - name: metrics.ops
      level: off

Error in the logs with additional metadata (find it after the stack trace):

[2021-12-01T20:03:30.807+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:03:31.812+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:93:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22) {"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
[2021-12-01T20:03:31.812+01:00][DEBUG][plugins.securitySolution] ...

Here it is extracted from the log:

{"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}

More info on the layout configuration: https://www.elastic.co/guide/en/kibana/master/logging-configuration.html#pattern-layout
I couldn't decently pretty print it with the config options that are available 😞

I was thinking what to include in the log metadata and what to include in the regular log message, and my thoughts so far were:

  • It could be useful to add important attributes like correlation ids (rule id etc) and details of the original exception directly into the log message, because that's what is going to be logged out of the box with the default logging configuration.
  • Log meta could be used for logging optional attributes that could be useful when digging deeper into an issue when debugging Security app and Detection Engine.

Checklist

@banderror banderror added technical debt Improvement of the software architecture and operational architecture v8.0.0 release_note:skip Skip the PR/issue when compiling release notes Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. auto-backport Deprecated - use backport:version if exact versions are needed v8.1.0 Feature:Rule Monitoring Security Solution Detection Rule Monitoring area Team:Detection Rule Management Security Detection Rule Management Team labels Dec 1, 2021
@banderror banderror requested a review from a team as a code owner December 1, 2021 19:36
@banderror banderror self-assigned this Dec 1, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@banderror banderror requested a review from a team December 1, 2021 19:36
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @banderror

Copy link
Member

@spong spong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked out, tested locally verifying exception in logs, and code reviewed -- LGTM! 👍

I didn't realize we could append additional metadata like this with the logger, really cool! While a bummer about the formatting, at least we have another tool in the toolbox for expanding the detail of logs and it'll work on cloud too! 🙂

@banderror banderror merged commit acbee9b into elastic:main Dec 2, 2021
@banderror banderror deleted the add-error-handling-to-rule-execution-log-client branch December 2, 2021 10:04
banderror added a commit to banderror/kibana that referenced this pull request Dec 2, 2021
…ethods of Rule Execution Log (elastic#120157)

**Ticket:** elastic#119595

## Summary

With this PR, calling `IRuleExecutionLog.logStatusChange()` is now safe for callers (rule executors or route handlers). It doesn't throw and writes any exceptions being caught to Kibana logs.

Example error in the logs:

```
[2021-12-01T20:05:36.620+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:05:37.591+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:91:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22)
[2021-12-01T20:05:37.591+01:00][DEBUG][plugins.securitySolution] ...
```

## Additional log metadata

For debugging purposes, rule execution log client writes additional log metadata with any caught exception. It's not visible in Kibana console logs by default, but can be enabled in Kibana config.

I used the following config to test it:

```yaml
logging:
  appenders:
    custom_console:
      type: console
      layout:
        type: pattern
        highlight: true
        pattern: "[%date][%level][%logger] %message %meta"
  root:
    appenders: [custom_console]
    level: warn
  loggers:
    - name: config
      level: info
    - name: plugins
      level: warn
    - name: plugins.securitySolution
      level: debug
    - name: plugins.ruleRegistry
      level: info
    - name: plugins.alerting
      level: info
    - name: plugins.taskManager
      level: info
    - name: savedobjects-service
      level: info
    - name: elasticsearch-service
      level: off
    - name: elasticsearch
      level: off
    - name: elasticsearch.query
      level: off
    - name: server
      level: fatal
    - name: optimize
      level: info
    - name: metrics.ops
      level: off
```

Error in the logs with additional metadata (find it after the stack trace):

```
[2021-12-01T20:03:30.807+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:03:31.812+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:93:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22) {"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
[2021-12-01T20:03:31.812+01:00][DEBUG][plugins.securitySolution] ...
```

Here it is extracted from the log:

```
{"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
```

More info on the layout configuration: https://www.elastic.co/guide/en/kibana/master/logging-configuration.html#pattern-layout
I couldn't decently pretty print it with the config options that are available 😞 

I was thinking what to include in the log metadata and what to include in the regular log message, and my thoughts so far were:

- It could be useful to add important attributes like correlation ids (rule id etc) and details of the original exception directly into the log message, because that's what is going to be logged out of the box with the default logging configuration.
- Log meta could be used for logging optional attributes that could be useful when digging deeper into an issue when debugging Security app and Detection Engine.

### Checklist

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
@banderror
Copy link
Contributor Author

Auto-backport hasn't been done for some reason. A manual one: #120200

banderror added a commit that referenced this pull request Dec 2, 2021
…ethods of Rule Execution Log (#120157) (#120200)

**Ticket:** #119595

## Summary

With this PR, calling `IRuleExecutionLog.logStatusChange()` is now safe for callers (rule executors or route handlers). It doesn't throw and writes any exceptions being caught to Kibana logs.

Example error in the logs:

```
[2021-12-01T20:05:36.620+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:05:37.591+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:91:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22)
[2021-12-01T20:05:37.591+01:00][DEBUG][plugins.securitySolution] ...
```

## Additional log metadata

For debugging purposes, rule execution log client writes additional log metadata with any caught exception. It's not visible in Kibana console logs by default, but can be enabled in Kibana config.

I used the following config to test it:

```yaml
logging:
  appenders:
    custom_console:
      type: console
      layout:
        type: pattern
        highlight: true
        pattern: "[%date][%level][%logger] %message %meta"
  root:
    appenders: [custom_console]
    level: warn
  loggers:
    - name: config
      level: info
    - name: plugins
      level: warn
    - name: plugins.securitySolution
      level: debug
    - name: plugins.ruleRegistry
      level: info
    - name: plugins.alerting
      level: info
    - name: plugins.taskManager
      level: info
    - name: savedobjects-service
      level: info
    - name: elasticsearch-service
      level: off
    - name: elasticsearch
      level: off
    - name: elasticsearch.query
      level: off
    - name: server
      level: fatal
    - name: optimize
      level: info
    - name: metrics.ops
      level: off
```

Error in the logs with additional metadata (find it after the stack trace):

```
[2021-12-01T20:03:30.807+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:03:31.812+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:93:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22) {"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
[2021-12-01T20:03:31.812+01:00][DEBUG][plugins.securitySolution] ...
```

Here it is extracted from the log:

```
{"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
```

More info on the layout configuration: https://www.elastic.co/guide/en/kibana/master/logging-configuration.html#pattern-layout
I couldn't decently pretty print it with the config options that are available 😞 

I was thinking what to include in the log metadata and what to include in the regular log message, and my thoughts so far were:

- It could be useful to add important attributes like correlation ids (rule id etc) and details of the original exception directly into the log message, because that's what is going to be logged out of the box with the default logging configuration.
- Log meta could be used for logging optional attributes that could be useful when digging deeper into an issue when debugging Security app and Detection Engine.

### Checklist

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
TinLe pushed a commit to TinLe/kibana that referenced this pull request Dec 22, 2021
…ethods of Rule Execution Log (elastic#120157)

**Ticket:** elastic#119595

## Summary

With this PR, calling `IRuleExecutionLog.logStatusChange()` is now safe for callers (rule executors or route handlers). It doesn't throw and writes any exceptions being caught to Kibana logs.

Example error in the logs:

```
[2021-12-01T20:05:36.620+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:05:37.591+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:91:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22)
[2021-12-01T20:05:37.591+01:00][DEBUG][plugins.securitySolution] ...
```

## Additional log metadata

For debugging purposes, rule execution log client writes additional log metadata with any caught exception. It's not visible in Kibana console logs by default, but can be enabled in Kibana config.

I used the following config to test it:

```yaml
logging:
  appenders:
    custom_console:
      type: console
      layout:
        type: pattern
        highlight: true
        pattern: "[%date][%level][%logger] %message %meta"
  root:
    appenders: [custom_console]
    level: warn
  loggers:
    - name: config
      level: info
    - name: plugins
      level: warn
    - name: plugins.securitySolution
      level: debug
    - name: plugins.ruleRegistry
      level: info
    - name: plugins.alerting
      level: info
    - name: plugins.taskManager
      level: info
    - name: savedobjects-service
      level: info
    - name: elasticsearch-service
      level: off
    - name: elasticsearch
      level: off
    - name: elasticsearch.query
      level: off
    - name: server
      level: fatal
    - name: optimize
      level: info
    - name: metrics.ops
      level: off
```

Error in the logs with additional metadata (find it after the stack trace):

```
[2021-12-01T20:03:30.807+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:03:31.812+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:93:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22) {"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
[2021-12-01T20:03:31.812+01:00][DEBUG][plugins.securitySolution] ...
```

Here it is extracted from the log:

```
{"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
```

More info on the layout configuration: https://www.elastic.co/guide/en/kibana/master/logging-configuration.html#pattern-layout
I couldn't decently pretty print it with the config options that are available 😞 

I was thinking what to include in the log metadata and what to include in the regular log message, and my thoughts so far were:

- It could be useful to add important attributes like correlation ids (rule id etc) and details of the original exception directly into the log message, because that's what is going to be logged out of the box with the default logging configuration.
- Log meta could be used for logging optional attributes that could be useful when digging deeper into an issue when debugging Security app and Detection Engine.

### Checklist

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed Feature:Rule Monitoring Security Solution Detection Rule Monitoring area release_note:skip Skip the PR/issue when compiling release notes Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. technical debt Improvement of the software architecture and operational architecture v8.0.0 v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Security Solution][Detections] Add error handling to Rule Execution Log
4 participants