[Security Solution][Detections] Proposal: building Rule Execution Log on top of Event Log and ECS #94143
Pinging @elastic/security-solution (Team: SecuritySolution)
Pinging @elastic/security-detections-response (Team:Detections and Resp)
Pinging @elastic/kibana-alerting-services (Team:Alerting Services)
left some comments regarding eventLog usage
private _result: IEcsEvent = {};

constructor() {
  // TODO: Which version does event_log use? Should it be specified here or inside the event log itself?
The event log plugin provides this value, you don't need to provide it. And probably shouldn't since we could end up with different versions in different documents. Though I don't know of anything that makes use of this field right now.
interface IEcsAdditionalFields {
  // https://www.elastic.co/guide/en/ecs/1.9/ecs-event.html
  event?: {
We already use a bunch of the `event` fields for fairly "generic" purposes, so I'm a little concerned about having application-specific data in here as well. Not a problem now since there's no overlap, but I'm guessing there could be some day?
kibana/x-pack/plugins/event_log/generated/schemas.ts
Lines 38 to 48 in 6264c56
event: schema.maybe(
  schema.object({
    action: ecsString(),
    provider: ecsString(),
    start: ecsDate(),
    duration: ecsNumber(),
    end: ecsDate(),
    outcome: ecsString(),
    reason: ecsString(),
  })
),
So, `IEcsAdditionalFields` just represents the fields we need to be supported in Event Log and which are currently missing in its schema. I mean, this PR is just a proposal, this interface is not going to be used in the working code as is and the missing fields will need to be moved to Event Log itself.
`event` fields here are all standard ECS fields (spec), and the only custom field I propose is `kibana.detection_engine` :)
Not sure if I addressed your comment... Pls let me know :)
export type IEcsEvent = IEventLogEvent & IEcsAdditionalFields;

interface IEcsAdditionalFields {
I don't think these will actually be written out today, till they're added to our schema - we use `dynamic: false` at the top of the mappings, so I believe they will not get indexed, at the very least.
kibana/x-pack/plugins/event_log/generated/mappings.json
Lines 1 to 3 in 6264c56
{
  "dynamic": "false",
  "properties": {
If this PR is just an experiment, you might want to extend the schema, following the directions here: https://github.com/elastic/kibana/blob/master/x-pack/plugins/event_log/generated/README.md - and let us know if any of it is confusing so we can correct it.
I think we'd want to update the event log separately though, if we intend to merge this, just to keep the PRs a little cleaner.
Yeah, this PR is just an experiment, a proposal. #94143 (comment)
I'll try to explain it better in a separate comment below :)
Looking through this PR it is awesome to see how this not only will replace our statuses but also provide an extensible schema to allow for future "events" to be indexed throughout the phases of the rule executor. I played around with the documents posted in the text files and used the queries posted there too and can really start to see this coming together! I have a few clarifying questions and specifically am interested in more information on the `detection_engine` field. Looks good so far!
  INFO: 'info',
  WARNING: 'warning',
  ERROR: 'error',
} as const;
I'm assuming part of this refactor is to also write our log statements to the event log? Is that correct?
That's a great question. TL;DR: I think we will be able to write both events to the execution log and normal logs. Normal logs are for Kibana sysadmins and debugging, execution logs are for Detections users. What to log where - I'd say TBD, but would be nice to have some flexibility here.
You mean whatever we log through a Kibana logger to a normal log? I didn't imagine it exactly like that, because I'm not sure if it makes sense to see all the logs in the UI from the user perspective: can be too many, too technical or include info that should not be shown to the user. On the other hand, we definitely would be able to log more data, so I added these standard log levels because it's a UI thing familiar to all users.
In Kibana we already have a few views where the user can scroll through logs, e.g. in Fleet there are logs from agents.
"detection_engine": {
  "rule_status": "warning",
  "rule_status_severity": 20
}
I'm curious if we need to use this new field and can't repurpose some other fields, maybe `message` for the rule status? Or keep message how it is but utilize `event.kind` and set it to 'warning' or something.
Yeah, that's a good point... I was hoping that I would be able to map everything to native ECS fields.
`message` is mapped as text, not keyword. And we might need to have both status and message in a single event, e.g. if we decide to show status updates in the log. I mean, it would be simple and nice to just have the ability to have an additional message field.
`event.kind` has a strict set of allowed values, as long as we decide to conform to ECS, of course. Also, you can see in the queries that it's used to filter out metric events when fetching "event" events for showing in the Execution Log tab.
`labels` could be used for that, but this seems to me the same thing if we talk about event log. We'd either need to specify labels mapping in event log (same solution as introducing a custom field set), or turn on dynamic mapping for `labels`. I'm not sure about the latter since event log is designed as a single multitenant index. If it could support separate indices per each tenant/provider, that would be a different story. @pmuellr what do you think about these tradeoffs?
As for other native ECS fields, I just didn't find any suitable.
Ah okay, I didn't realize `event.kind` had limited values... Maybe we could utilize `event.code`? That seems to be a flexible field? I'm just trying to think of ways to remove the detection engine specific fields.
Idk if it would cause any issues here, but I know we use the `event.code` field for a lot of endpoint exception stuff. It's one of our prepopulated fields.
@@ -0,0 +1,230 @@
# -----------------------------------------------------------------------------------------------------------
Thanks for providing these little query scripts to test this out. It helped me visualize how these would be integrated in the executor!
My pleasure ✋ I can imagine how confusing it would be to review it as a pile of text without examples or some code.
export const RuleExecutionEventType = {
  GENERIC: 'generic',
  STATUS_CHANGED: 'status-changed',
} as const;
I'm wondering if we could diversify this field beyond 'generic' and 'status-changed'. Maybe event type could be 'gap', 'warning / partial failure', 'failed', 'succeeded', 'missing privileges'? But maybe we need these specifically for the queries we are planning against the event log. What do you think?
Yeah yeah, exactly, I share the same kind of vision. We definitely will be able to add more `RuleExecutionEventType`s if we need so. Doesn't mean we will have to. Cases where it would make sense to do later, imho:
- Extract existing stuff like 'gap' to separate `RuleExecutionEvent`s if we want to log it earlier, i.e. exactly when they are captured/created in the code. And decouple them from status updates. Meaning that if we found a gap, we could log it as a separate event with a warning level, but delay or omit changing the status.
- New types of events with their own payload.

As for breaking `StatusChangedEvent` down to separate `StatusGoingToRunEvent`, `StatusSucceededEvent` - not sure, it would make sense only if they would have completely separate payloads and wouldn't have anything in common (in this case, it's a common status itself and its severity). Imho for now we could have a few factory methods for creating status events, but keep them as a single type.
const ecsEvents = mapStatusChangedEventToEcsEvents(event);

// TODO: move to a service implementing the rule execution log
// All generated ecsEvents have the same @timestamp, so we need an additional field for deterministic ordering.
👍
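To illustrate the ordering problem from the TODO above: a minimal sketch, assuming the additional field is a per-batch counter stored in ECS `event.sequence`. The choice of field and the helper names are assumptions, not something this PR settles.

```typescript
// Hypothetical sketch: events mapped from one StatusChangedEvent share a
// @timestamp, so a per-batch counter is stored in `event.sequence`
// (assumed field choice) to make their order deterministic.
interface EcsEventLike {
  '@timestamp': string; // ISO 8601 in UTC, so string comparison matches time order
  event: { action: string; sequence?: number };
}

function withSequence(events: EcsEventLike[], start = 0): EcsEventLike[] {
  return events.map((e, i) => ({ ...e, event: { ...e.event, sequence: start + i } }));
}

// Newest first; ties on @timestamp are broken by the sequence number.
function compareNewestFirst(a: EcsEventLike, b: EcsEventLike): number {
  const byTime = b['@timestamp'].localeCompare(a['@timestamp']);
  return byTime !== 0 ? byTime : (b.event.sequence ?? 0) - (a.event.sequence ?? 0);
}
```

The same two-key ordering would have to be expressed in the Elasticsearch `sort` clause when reading the events back.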
@dhurley14 @pmuellr thank you for your reviews, super appreciate that you switched your context and checked this out.
@pmuellr let me try to provide more context. So this PR is not a normal kind of PR with code, it's more like an RFC to demonstrate the idea of storing our rule execution events in Event Log - in theory. My goal was to understand if it's possible or not, makes sense or not, would it be applicable to our problem, would it solve it, etc.
I think at this point I can say that in theory yes, we could use event log. The only substantial risk I personally see is conforming to ECS itself. ECS seems to be very limited to allow storing arbitrary structured application logs. To avoid that, we will probably need full flexibility in terms of defining custom fields. Other than that, event log seems to me a good way to go.
However, at this point event log doesn't have all the APIs we need, and we need your comments on that as the event log creators and maintainers. I wrote up all the queries to event log index we'd need to be able to run to fetch everything we need in our app. Would you please mind going through "What's missing in the current Event Log API" and "More questions about using Event Log" in the proposal, and also
At the end of the day, we need to know if this idea looks ok from your side, if all the proposed changes in
I think some of these concerns can be addressed by instrumenting the rule execution. Not all of it, so I wouldn't suggest it as a replacement, but APM data might give you higher-fidelity performance data. Happy to pair next week if somebody wants to try it out.
Here are some examples of what you'd get, from a local branch I have running:
APM data wouldn't be available to end-users, but we do hope to make it available in cloud. Won't solve all your problems but I think the waterfall is very useful in identifying optimisation opportunities.
I'm not familiar enough with the requirements here to comment on the proposal, but as @pmuellr knows, I think a log-based approach is very powerful, not just for rule executions, but also alerts as data, so I'm excited this strategy is being explored.
Again, not sure if I fully understand the requirements here, but it might be worth considering to just enrich the current
If you need to pre-aggregate certain metrics (e.g. because there are multiple searches, or multiple index operations), you can consider a histogram field or an aggregate metric double field. If you are worried about mapping conflicts or ECS compatibility, I think the best option here would be just a scoped event log, which adheres to the common schema of the alerting framework's event log, and matches the index target wildcard that is being used. E.g., you'd end up with
@dgieselaar Thank you for your feedback and suggestions, this is very-very interesting. It's kind of a lot to comment, let me do it piece-by-piece :) Rule execution log needs to be both writeable and queryable from the Security Solution app. Besides metrics, we'd like to be able to log events and simple generic log messages. I'm not familiar with Elastic APM unfortunately (it's a shame), but seems like it requires a separate APM server to send data to, and also is designed to consume metrics, traces and that kind of stuff as opposed to structured logs. Is that correct? That said, I think the idea of dogfooding APM in Kibana in general and in our detection engine in particular is very interesting. Indeed, we have little data about detection rule execution metrics (only some metrics from
Not sure what do you mean by current execute events, is it events that
Thank you for this suggestion. In this particular case I need to be able to specify 2 sort fields, and seems like
Yes, I was thinking about that: in order to "own" and control custom mappings (as well as potentially RBAC requirements), clients of event log might want to keep their logs in a separate index (that index would have all the default ECS fields, common event log fields like
Yes - although the concept of logging has many use cases. In APM terminology, a transaction event could be considered equivalent to a log event, depending on what you need. Theoretically, one could send APM data to the same cluster, similar to (an option of) stack monitoring, but it's more valuable to us on the engineering side than end-users.
For the APM app, we use withApmSpan: kibana/x-pack/plugins/apm/server/lib/alerts/chart_preview/get_transaction_duration.ts Line 29 in bb26564
Fleet, Task Manager and Reporting also have custom instrumentation.
I'm not sure if that is conceptual limitation of the top_metrics aggregation. It also might be possible to add it. But maybe you can just add a field that is seq + timestamp and sort on that?
AFAIK there seems to be consensus that this is necessary for alerts at least. To me it seems like a small step to make that available for execute events as well.
Btw, you can already enable APM today, by running Kibana with
Ok, after several discussions we identified the following steps. I'll be working on them and adding these changes to
Other ideas are out of scope for now. Maybe we will revisit these later in the future when we get more clarity with the RAC's alerts-as-data implementation lead by @spong:
Thanks everybody for your feedback - @dgieselaar @dhurley14 @dplumlee @peluja1012 @pmuellr @spong
ECS version:

- Which version of ECS does the Event Log support? Is it strictly `1.6.0` or several versions can be supported at the same time for different clients?
- Who/when/how will upgrade the version(s) supported by Event Log?
- Should (or can) clients specify `ecs.version` when logging events?
This
…Engine (#95067)
**Related to:** #94143
## Summary
This PR adds new fields to the schema (`EventSchema`, `IEvent`):
- standard ECS fields: `error.*`, `event.*`, `log.level`, `log.logger`, `rule.*`
- custom field set `kibana.detection_engine`

We need these fields on the Detections side to implement detection rule execution log. See the related proposal (#94143) for more details.
Also, this PR bumps ECS used in Event Log from `1.6.0` to the current `1.8.0` version. They are 100% same in terms of fields used in Event Log, so no changes in the schema were caused by this version increment.
…Engine (#95067) (#95654) **Related to:** #94143 ## Summary This PR adds new fields to the schema (`EventSchema`, `IEvent`): - standard ECS fields: `error.*`, `event.*`, `log.level`, `log.logger`, `rule.*` - custom field set `kibana.detection_engine` We need these fields on the Detections side to implement detection rule execution log. See the related proposal (#94143) for more details. Also, this PR bumps ECS used in Event Log from `1.6.0` to the current `1.8.0` version. They are 100% same in terms of fields used in Event Log, so no changes in the schema were caused by this version increment. Co-authored-by: Georgii Gorbachev <[email protected]>
**Needed for:** rule execution log for Security #94143
**Related to:**
- alerts-as-data: #93728, #93729, #93730
- RFC for index naming #98912

## Summary
This PR adds a mechanism for writing to / reading from / bootstrapping indices for RAC project into the `rule_registry` plugin. Particularly, indices for alerts-as-data and rule execution events. This implementation is similar to existing implementations like `event_log` plugin (see #98353 (comment) for historical perspective), but we're going to converge all of them into 1 or 2 implementations. At least we should have a single one in `rule_registry` itself.
In this PR I tried to incorporate most of the feedback received in the RFC (#98912), but if you notice I missed/forgot something, please let me know in the comments.
Done in this PR:
- [x] Schema-agnostic APIs for working with Elasticsearch.
- [x] Schema-aware log definition and bootstrapping API (creating hierarchical logs).
- [x] Schema-aware write API (logging events).
- [x] Schema-aware read API (searching logs, filtering, sorting, pagination, aggregation).
- [x] Support for Kibana spaces, space-aware index bootstrapping (either at rule creation or rule execution time).

As for reviewing this PR, perhaps it might be easier to start with:
- checking description of #98912
- checking usage examples https://github.com/elastic/kibana/pull/98353/files#diff-c049ff2198cc69bd50a69e92d29e88da7e10b9a152bdaceaf3d41826e712c12b
- checking public api https://github.com/elastic/kibana/pull/98353/files#diff-8e9ef0dbcbc60b1861d492a03865b2ae76a56ec38ada61898c991d3a74bd6268

## Next steps
Next steps towards rule execution log in Security (#94143):
- define actual schema for rule execution events
- inject instance of rule execution log into Security rule executors and route handlers
- implement actual execution logging in rule executors
- update route handlers to start fetching execution events and metrics from the log instead of custom saved objects

Next steps in the context of RAC and unified implementation:
- converge this implementation with `RuleDataService` implementation
- implement robust index bootstrapping
- reconsider using FieldMap as a generic type parameter
- implement validation for documents being indexed
- cover the final implementation with tests
- write comprehensive docs: update plugin README, add JSDoc comments to all public interfaces
We've partially implemented rule execution logging to Event Log (v7.16), and started to read last 5 failures from it on the Rule Details page (v8.0) as part of #101013. I'm going to close this one and we will
Ticket: #91265
Summary
This is a proposal for building detection engine Rule Execution Log on top of Event Log and ECS. It is related to the rules management part of the RAC workstream. It also follows up our previous discussions here. Rebuilding Rule Execution Log on top of Event Log would allow us to achieve 3 goals: pay tech debt, optimize performance of rules-related endpoints and unblock the possibility to implement in-memory rules table while we can’t filter/sort/search by rule params on the server side.
You can read the proposal in the PR description below, but I also added a `proposal.md` file so you could comment it line-by-line (I encourage you to do that!). There are also other files in this PR worth checking, all the details below.
Proposal: building Rule Execution Log on top of Event Log and ECS
Overview
We're going to get rid of storing rule execution statuses and additional data in custom "sidecar" saved objects. Those are objects stored in the `.kibana` index and having type = `siem-detection-engine-rule-status`. The corresponding SO attributes are:
We're going to start using the Event Log (`event_log` plugin built by the Alerting team).
For more context, please read #91265 (comment)
Regarding software design:
- … `RuleStatusService`, and the only execution event we will have is a Status Changed event. That means the Detection Engine will stay mostly untouched. The idea is to reduce the amount of refactoring. The disadvantage is that time-wise events in the log might be written not exactly when they happen. We will probably address that later, when implementing enhancements in the UI ("Execution Log" tab), so that all the logged events can have precise timestamps and show when exactly any of the events happened. Also, we'll be able to split a single fat Status Changed event into separate dedicated events, and use the Status Changed event only for cases when the status actually needs to be changed.
- … `siem-detection-engine-rule-status` saved objects with requests to the new rule execution log service (read model).

What to review
Please take a look at the code submitted in this PR:
- `rule_execution_log/common_model` contains the ECS model of events for Event Log, common types and constants, a builder for creating ECS events for this particular log
- `rule_execution_log/write_model` contains the definition of rule execution events, `StatusChangedEvent` in particular, and mapping of it to a series of ECS events

Please check the ECS events we're going to store and queries we will need to execute. You can play with it in Kibana Dev Tools:
- … `dev_tools_index.txt` to create a test index. This file contains Elasticsearch index mappings for ECS version `1.9.0` (https://github.com/elastic/ecs/blob/1.9/generated/elasticsearch/7/template.json) adjusted with a custom mapping of `kibana.detection_engine` field.
- … `dev_tools_events.txt`. It contains ECS events as text that you can copy-paste into Kibana Dev Tools.
- … `dev_tools_queries.txt`. It contains queries to event log index we will need to be able to execute.
- … `generate_events.ts`. It transforms `StatusChangedEvent` to ECS events and writes them to `dev_tools_events.txt`. Run it with `node x-pack/plugins/security_solution/server/lib/detection_engine/rule_execution_log/generate_events.js`.

What's missing in the current Event Log API
In short:
- … `event.*`, `log.*`, `rule.*`
- … `kibana.detection_engine.*`
- … (`term`, `terms`) instead of string KQL filter
- … `"_source": ["@timestamp", "message", "log.level", "event.action"]`
Extended support for ECS
In this proposal, I'm designing our rule execution log events using a few additional fields. These fields are not currently supported by Event Log.
My suggestion would be to add support for all the standard `event.*`, `log.*`, `rule.*` fields (at least these field sets), as well as for our custom `kibana.detection_engine.*` fields. If it's easy to add support for the whole ECS, I'd say let's do it.
It's not super clear to me how exactly we're gonna specify `kibana.detection_engine` both in terms of TypeScript API and ES mapping. Should it be delegated to Event Log's clients with some registration API rather than hardcoded in Event Log itself?
Aggregation queries
We need to be able to execute aggregation queries with arbitrary aggs, like for example specified in this example query:
We need a freedom to pack multiple aggs in a single query or to split it into several queries.
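The example query referenced above is not preserved in this text. As a stand-in, here is a minimal sketch of the kind of request body meant: an aggregation that returns the latest Status Changed event per rule, narrowed by a top-level filter. The function name and aggregation names are made up for illustration; the field names (`event.action`, `rule.id`, `@timestamp`) come from ECS.

```typescript
// Hypothetical request body: the latest status-changed event per rule.
// Aggregation names ("rules", "latest_status") are made up for this sketch.
function buildLatestStatusPerRuleQuery(ruleIds: string[]) {
  return {
    size: 0, // only aggregation results are needed
    query: {
      bool: {
        filter: [
          { term: { 'event.action': 'status-changed' } },
          { terms: { 'rule.id': ruleIds } },
        ],
      },
    },
    aggs: {
      rules: {
        // one bucket per rule; a terms aggregation needs size >= 1
        terms: { field: 'rule.id', size: Math.max(ruleIds.length, 1) },
        aggs: {
          latest_status: {
            top_hits: {
              size: 1,
              sort: [{ '@timestamp': { order: 'desc' } }],
            },
          },
        },
      },
    },
  };
}
```

Packing several such aggregations into one request, or splitting them into separate requests, only changes what sits under `aggs`.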
We'd also like to be able to combine aggs with top-level filters, pagination, sorting and other options if we need so. See pre-filtering with `"term": { "event.action": "status-changed" }`:
Sorting by multiple fields
The current API allows to specify only a single sort field and restricts the fields that can be used to sort events.
We need to be able to sort by multiple fields both in normal queries and within an aggregation scope.
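A minimal sketch of what a multi-field sort could look like when expressed as an Elasticsearch `sort` array. The helper name is hypothetical, and `event.sequence` as the tie-breaker field is an assumption.

```typescript
type SortOrder = 'asc' | 'desc';
type SortClause = Record<string, { order: SortOrder }>;

// Turns an ordered list of [field, order] pairs into an Elasticsearch `sort` array.
function buildSort(fields: Array<[string, SortOrder]>): SortClause[] {
  return fields.map(([field, order]) => ({ [field]: { order } }));
}

// Assumed tie-breaker field: `event.sequence` (hypothetical choice for this log).
const newestFirst = buildSort([
  ['@timestamp', 'desc'],
  ['event.sequence', 'desc'],
]);
```

The same array shape is accepted both at the top level of a search request and inside a `top_hits` aggregation, which covers the "within an aggregation scope" case.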
In the future we will need to sort by arbitrary fields in order to implement, for example, sorting by the current rule execution status (`kibana.detection_engine.rule_status_severity`) in our rules monitoring table.
Custom ES DSL filters
Would be nice to have an option to specify custom filters with ES DSL, like for example here:
With the current API it's possible to specify a string KQL filter. It's definitely a great option to have, but in cases like this one, where we exactly know the resulting query to build, we'd prefer to save some server-side CPU cycles by not parsing KQL.
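As an illustration of building such filters directly, without going through KQL parsing, here is a minimal sketch. The helper and type names are hypothetical; it only emits `term` and `terms` clauses.

```typescript
type FilterValue = string | number | boolean;
type FilterClause =
  | { term: Record<string, FilterValue> }
  | { terms: Record<string, FilterValue[]> };

// Hypothetical helper: a single value becomes a `term` clause,
// an array of values becomes a `terms` clause.
function buildFilters(criteria: Record<string, FilterValue | FilterValue[]>): FilterClause[] {
  return Object.entries(criteria).map(([field, value]): FilterClause =>
    Array.isArray(value) ? { terms: { [field]: value } } : { term: { [field]: value } }
  );
}

// The resulting clauses can be placed under `query.bool.filter`.
const filters = buildFilters({
  'event.action': 'status-changed',
  'rule.id': ['rule-1', 'rule-2'],
});
```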
Limiting source fields to return
Sometimes we don't need to fetch the full event document with all its ECS fields. We need to be able to restrict the source fields as we like, both within aggs and the query scope:
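The original example is not preserved in this text. As a hedged stand-in, the sketch below restricts `_source` in both places: at the query level for one page of log rows, and inside a `top_hits` aggregation. The top-level field list is the one quoted earlier in this proposal; the function name, aggregation names, and the aggregation-level field list are assumptions.

```typescript
// Field list taken from the "_source" example in this proposal.
const LOG_ROW_FIELDS = ['@timestamp', 'message', 'log.level', 'event.action'];

// Hypothetical helper: one page of log rows for a rule, plus the latest
// status-changed event, each returning only the fields it needs.
function buildLogPageQuery(ruleId: string, from: number, size: number) {
  return {
    from,
    size,
    _source: LOG_ROW_FIELDS, // query scope
    query: { bool: { filter: [{ term: { 'rule.id': ruleId } }] } },
    sort: [{ '@timestamp': { order: 'desc' } }],
    aggs: {
      latest_status: {
        filter: { term: { 'event.action': 'status-changed' } },
        aggs: {
          hit: {
            top_hits: {
              size: 1,
              sort: [{ '@timestamp': { order: 'desc' } }],
              // aggregation scope: an assumed, smaller field list
              _source: ['@timestamp', 'kibana.detection_engine.rule_status'],
            },
          },
        },
      },
    },
  };
}
```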
More questions about using Event Log
ECS version:
- Which version of ECS does the Event Log support? Is it strictly `1.6.0` or several versions can be supported at the same time for different clients?
- Who/when/how will upgrade the version(s) supported by Event Log?
- Should (or can) clients specify `ecs.version` when logging events?

ECS `event.provider`, `event.dataset`, `log.logger`:
- … `event.provider: 'detection-engine'`, but have a way to have multiple logs within the provider.
- … `event.dataset: 'detection-engine.rule-execution-log'`
- … `log.logger: 'detection-engine.rule-execution-log'`
- … `detection-engine.rule-execution-log` where needed, and not the full log.
- … `event.action`s.
- … `event.provider`, `event.dataset`, `log.logger` fields for an event - a client of `IEventLogger` or the logger itself?

Could we theoretically use an instance of `ElasticsearchClient` to query the event log index, rather than using `IEventLogClient`? My concern is: while writing to this index and managing the index should be done via dedicated APIs, adding read APIs can become a leaky abstraction, which reminds me of the issues we currently have with saved objects APIs like lack of aggs support etc. On the one hand, for us, application developers, it's hard to predict what read API we will need in the future; would be nice to have freedom to use any that Elasticsearch provides. On the other hand, adding support for all the APIs can be difficult and lead to maintenance issues. Maybe there's a way to provide a decorator on top of `ElasticsearchClient` that would preserve its API or a thin adapter that would just slightly change it?
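To make the "thin adapter" idea more concrete, here is a minimal sketch. It does not use the real `ElasticsearchClient` type: `SearchClientLike` is a hypothetical stand-in with a single `search` method, and the adapter name and parameters are assumptions. The adapter passes the caller's request body through unchanged, except that it pins the index and always adds an `event.provider` filter.

```typescript
// Minimal stand-in for a search client; NOT the real ElasticsearchClient type.
interface SearchClientLike {
  search(params: { index: string; body: Record<string, unknown> }): Promise<unknown>;
}

// Hypothetical adapter: callers pass any request body (aggs, sort, _source, ...);
// the adapter pins the index and wraps the query with a provider filter so that
// a client only ever reads its own events.
function createScopedLogReader(client: SearchClientLike, index: string, provider: string) {
  return {
    search(body: { query?: unknown; [key: string]: unknown }): Promise<unknown> {
      const providerFilter = { term: { 'event.provider': provider } };
      const must = body.query ? [body.query] : [];
      return client.search({
        index,
        body: { ...body, query: { bool: { must, filter: [providerFilter] } } },
      });
    },
  };
}
```

With a shape like this, the read side would not need a new method for every Elasticsearch feature, while index management and writes could stay behind dedicated APIs.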