Skip to content

Commit

Permalink
[Security Solution][Detections] Add basic error handling to logging m…
Browse files Browse the repository at this point in the history
…ethods of Rule Execution Log (elastic#120157)

**Ticket:** elastic#119595

## Summary

With this PR, calling `IRuleExecutionLog.logStatusChange()` is now safe for callers (rule executors or route handlers). It doesn't throw and writes any exceptions being caught to Kibana logs.

Example error in the logs:

```
[2021-12-01T20:05:36.620+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:05:37.591+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:91:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22)
[2021-12-01T20:05:37.591+01:00][DEBUG][plugins.securitySolution] ...
```

## Additional log metadata

For debugging purposes, rule execution log client writes additional log metadata with any caught exception. It's not visible in Kibana console logs by default, but can be enabled in Kibana config.

I used the following config to test it:

```yaml
logging:
  appenders:
    custom_console:
      type: console
      layout:
        type: pattern
        highlight: true
        pattern: "[%date][%level][%logger] %message %meta"
  root:
    appenders: [custom_console]
    level: warn
  loggers:
    - name: config
      level: info
    - name: plugins
      level: warn
    - name: plugins.securitySolution
      level: debug
    - name: plugins.ruleRegistry
      level: info
    - name: plugins.alerting
      level: info
    - name: plugins.taskManager
      level: info
    - name: savedobjects-service
      level: info
    - name: elasticsearch-service
      level: off
    - name: elasticsearch
      level: off
    - name: elasticsearch.query
      level: off
    - name: server
      level: fatal
    - name: optimize
      level: info
    - name: metrics.ops
      level: off
```

Error in the logs with additional metadata (find it after the stack trace):

```
[2021-12-01T20:03:30.807+01:00][DEBUG][plugins.securitySolution] ...
[2021-12-01T20:03:31.812+01:00][ERROR][plugins.securitySolution.ruleExecution] Error logging rule execution status change; status: "succeeded", rule id: "abea42f0-489b-11ec-83fa-1bde56c47ca8", rule name: "SDH-247 my loooooong loooooong name"; Error: TESTING ERROR HANDLING
    at EventLogAdapter.logStatusChangeToSavedObjects (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:91:11)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async Promise.all (index 0)
    at EventLogAdapter.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/event_log_adapter/event_log_adapter.ts:83:5)
    at RuleExecutionLogClient.logStatusChange (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/rule_execution_log_client.ts:93:7)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/create_security_rule_type_wrapper.ts:345:15)
    at Object.executor (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/rule_registry/server/utils/create_persistence_rule_type_wrapper.ts:20:23)
    at TaskRunner.executeAlertInstances (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:316:31)
    at promiseResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/lib/result_type.ts:47:17)
    at TaskRunner.loadAlertAttributesAndRun (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:569:14)
    at errorAsAlertTaskRunResult (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:1119:12)
    at TaskRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/alerting/server/task_runner/task_runner.ts:635:33)
    at TaskManagerRunner.run (/Users/georgii/Code/elastic/kibana-main/x-pack/plugins/task_manager/server/task_running/task_runner.ts:297:22) {"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
[2021-12-01T20:03:31.812+01:00][DEBUG][plugins.securitySolution] ...
```

Here it is extracted from the log:

```
{"rule":{"id":"abea42f0-489b-11ec-83fa-1bde56c47ca8","name":"SDH-247 my loooooong loooooong name","type":"siem.queryRule","execution":{"status":"succeeded"}},"kibana":{"spaceId":"default"}}
```

More info on the layout configuration: https://www.elastic.co/guide/en/kibana/master/logging-configuration.html#pattern-layout
I couldn't decently pretty print it with the config options that are available 😞 

I was thinking what to include in the log metadata and what to include in the regular log message, and my thoughts so far were:

- It could be useful to add important attributes like correlation ids (rule id etc) and details of the original exception directly into the log message, because that's what is going to be logged out of the box with the default logging configuration.
- Log meta could be used for logging optional attributes that could be useful when digging deeper into an issue when debugging Security app and Detection Engine.

### Checklist

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
  • Loading branch information
banderror committed Dec 2, 2021
1 parent 7b9763f commit 770b0e6
Show file tree
Hide file tree
Showing 7 changed files with 89 additions and 30 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,32 @@ export class EventLogAdapter implements IRuleExecutionLogClient {
// EventLog execution events are immutable, nothing to do here
}

private async logExecutionMetrics(args: LogExecutionMetricsArgs) {
public async logStatusChange(args: LogStatusChangeArgs): Promise<void> {
await Promise.all([
this.logStatusChangeToSavedObjects(args),
this.logStatusChangeToEventLog(args),
]);
}

private async logStatusChangeToSavedObjects(args: LogStatusChangeArgs): Promise<void> {
await this.savedObjectsAdapter.logStatusChange(args);
}

private async logStatusChangeToEventLog(args: LogStatusChangeArgs): Promise<void> {
if (args.metrics) {
this.logExecutionMetrics({
ruleId: args.ruleId,
ruleName: args.ruleName,
ruleType: args.ruleType,
spaceId: args.spaceId,
metrics: args.metrics,
});
}

this.eventLogClient.logStatusChange(args);
}

private logExecutionMetrics(args: LogExecutionMetricsArgs): void {
const { ruleId, spaceId, ruleType, ruleName, metrics } = args;

this.eventLogClient.logExecutionMetrics({
Expand All @@ -98,20 +123,4 @@ export class EventLogAdapter implements IRuleExecutionLogClient {
},
});
}

public async logStatusChange(args: LogStatusChangeArgs) {
await this.savedObjectsAdapter.logStatusChange(args);

if (args.metrics) {
await this.logExecutionMetrics({
ruleId: args.ruleId,
ruleName: args.ruleName,
ruleType: args.ruleType,
spaceId: args.spaceId,
metrics: args.metrics,
});
}

this.eventLogClient.logStatusChange(args);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
* 2.0.
*/

import { SavedObjectsClientContract } from '../../../../../../../src/core/server';
import { Logger, SavedObjectsClientContract } from 'src/core/server';
import { IEventLogClient, IEventLogService } from '../../../../../event_log/server';
import { IRuleStatusSOAttributes } from '../rules/types';
import { EventLogAdapter } from './event_log_adapter/event_log_adapter';
Expand All @@ -20,6 +20,7 @@ import {
GetCurrentStatusArgs,
GetCurrentStatusBulkArgs,
GetCurrentStatusBulkResult,
ExtMeta,
} from './types';
import { truncateMessage } from './utils/normalization';

Expand All @@ -28,13 +29,16 @@ interface ConstructorParams {
savedObjectsClient: SavedObjectsClientContract;
eventLogService: IEventLogService;
eventLogClient?: IEventLogClient;
logger: Logger;
}

export class RuleExecutionLogClient implements IRuleExecutionLogClient {
private client: IRuleExecutionLogClient;
private readonly client: IRuleExecutionLogClient;
private readonly logger: Logger;

constructor(params: ConstructorParams) {
const { underlyingClient, eventLogService, eventLogClient, savedObjectsClient } = params;
const { underlyingClient, eventLogService, eventLogClient, savedObjectsClient, logger } =
params;

switch (underlyingClient) {
case UnderlyingLogClient.savedObjects:
Expand All @@ -44,6 +48,10 @@ export class RuleExecutionLogClient implements IRuleExecutionLogClient {
this.client = new EventLogAdapter(eventLogService, eventLogClient, savedObjectsClient);
break;
}

// We write rule execution logs via a child console logger with the context
// "plugins.securitySolution.ruleExecution"
this.logger = logger.get('ruleExecution');
}

/** @deprecated */
Expand Down Expand Up @@ -74,11 +82,34 @@ export class RuleExecutionLogClient implements IRuleExecutionLogClient {
return this.client.deleteCurrentStatus(ruleId);
}

public async logStatusChange(args: LogStatusChangeArgs) {
const message = args.message ? truncateMessage(args.message) : args.message;
return this.client.logStatusChange({
...args,
message,
});
public async logStatusChange(args: LogStatusChangeArgs): Promise<void> {
const { newStatus, message, ruleId, ruleName, ruleType, spaceId } = args;

try {
const truncatedMessage = message ? truncateMessage(message) : message;
await this.client.logStatusChange({
...args,
message: truncatedMessage,
});
} catch (e) {
const logMessage = 'Error logging rule execution status change';
const logAttributes = `status: "${newStatus}", rule id: "${ruleId}", rule name: "${ruleName}"`;
const logReason = e instanceof Error ? `${e.stack}` : `${e}`;
const logMeta: ExtMeta = {
rule: {
id: ruleId,
name: ruleName,
type: ruleType,
execution: {
status: newStatus,
},
},
kibana: {
spaceId,
},
};

this.logger.error<ExtMeta>(`${logMessage}; ${logAttributes}; ${logReason}`, logMeta);
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
*/

import { Duration } from 'moment';
import { SavedObjectsFindResult } from '../../../../../../../src/core/server';
import { LogMeta, SavedObjectsFindResult } from 'src/core/server';
import { RuleExecutionStatus } from '../../../../common/detection_engine/schemas/common/schemas';
import { IRuleStatusSOAttributes } from '../rules/types';

Expand Down Expand Up @@ -103,3 +103,18 @@ export interface ExecutionMetrics {
lastLookBackDate?: string;
executionGap?: Duration;
}

/**
* Custom extended log metadata that rule execution logger can attach to every log record.
*/
export type ExtMeta = LogMeta & {
rule?: LogMeta['rule'] & {
type?: string;
execution?: {
status?: RuleExecutionStatus;
};
};
kibana?: {
spaceId?: string;
};
};
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,7 @@ export const createSecurityRuleTypeWrapper: CreateSecurityRuleTypeWrapper =
underlyingClient: config.ruleExecutionLog.underlyingClient,
savedObjectsClient,
eventLogService,
logger,
});

const completeRule = {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,7 @@ export const signalRulesAlertType = ({
underlyingClient: config.ruleExecutionLog.underlyingClient,
savedObjectsClient: services.savedObjectsClient,
eventLogService,
logger,
});

const completeRule: CompleteRule<RuleParams> = {
Expand Down
2 changes: 1 addition & 1 deletion x-pack/plugins/security_solution/server/plugin.ts
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ export class Plugin implements ISecuritySolutionPlugin {
const eventLogService = plugins.eventLog;
registerEventLogProvider(eventLogService);

const requestContextFactory = new RequestContextFactory({ config, core, plugins });
const requestContextFactory = new RequestContextFactory({ config, logger, core, plugins });
const router = core.http.createRouter<SecuritySolutionRequestHandlerContext>();
core.http.registerRouteHandlerContext<SecuritySolutionRequestHandlerContext, typeof APP_ID>(
APP_ID,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
* 2.0.
*/

import { KibanaRequest, RequestHandlerContext } from 'kibana/server';
import { Logger, KibanaRequest, RequestHandlerContext } from 'kibana/server';
import { ExceptionListClient } from '../../lists/server';

import { DEFAULT_SPACE_ID } from '../common/constants';
Expand All @@ -28,6 +28,7 @@ export interface IRequestContextFactory {

interface ConstructorOptions {
config: ConfigType;
logger: Logger;
core: SecuritySolutionPluginCoreSetupDependencies;
plugins: SecuritySolutionPluginSetupDependencies;
}
Expand All @@ -44,7 +45,7 @@ export class RequestContextFactory implements IRequestContextFactory {
request: KibanaRequest
): Promise<SecuritySolutionApiRequestHandlerContext> {
const { options, appClientFactory } = this;
const { config, core, plugins } = options;
const { config, logger, core, plugins } = options;
const { lists, ruleRegistry, security } = plugins;

const [, startPlugins] = await core.getStartServices();
Expand Down Expand Up @@ -73,6 +74,7 @@ export class RequestContextFactory implements IRequestContextFactory {
savedObjectsClient: context.core.savedObjects.client,
eventLogService: plugins.eventLog,
eventLogClient: startPlugins.eventLog.getClient(request),
logger,
}),

getExceptionListClient: () => {
Expand Down

0 comments on commit 770b0e6

Please sign in to comment.