[8.x] [Entity Analytics] Add Field Retention Enrich Policy and Ingest Pipeline to Entity Engine (#193848) #195929

Merged
merged 1 commit into from
Oct 11, 2024


kibanamachine
Contributor

Backport

This will backport the following commits from main to 8.x:

Questions ?

Please refer to the Backport tool documentation

…ine to Entity Engine (elastic#193848)

## Summary

Add the "Ouroboros" part of the entity engine:

- an enrich policy is created for each engine
- the enrich policy is executed every 30s by a Kibana task; this will
become 1h once we move to a 24h lookback
- create an ingest pipeline for the latest which performs the specified
field retention operations (for more detail see below)
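As a rough illustration of the first and last bullets, the sketch below builds a request body in the standard Elasticsearch enrich policy shape (`match.indices`, `match_field`, `enrich_fields`) and a matching `enrich` ingest processor. The index name, policy name, match field, and target field are all hypothetical placeholders, not the names this PR uses.

```typescript
// Standard Elasticsearch enrich policy and enrich processor shapes.
interface EnrichPolicy {
  match: { indices: string; match_field: string; enrich_fields: string[] };
}

interface EnrichProcessor {
  enrich: { policy_name: string; field: string; target_field: string };
}

type EntityType = 'host' | 'user';

// Assumption: one policy per engine, matching on `<type>.name`.
// The index name below is a made-up placeholder.
function buildEnrichPolicy(entityType: EntityType, enrichFields: string[]): EnrichPolicy {
  return {
    match: {
      indices: `example-entities-latest-${entityType}`, // hypothetical index name
      match_field: `${entityType}.name`, // assumed match field
      enrich_fields: enrichFields,
    },
  };
}

// Processor that looks up the previous entity document via the policy and
// stores it under a (hypothetical) target field for later comparison.
function buildEnrichProcessor(policyName: string, entityType: EntityType): EnrichProcessor {
  return {
    enrich: {
      policy_name: policyName,
      field: `${entityType}.name`,
      target_field: 'historical', // hypothetical target field
    },
  };
}

const examplePolicy = buildEnrichPolicy('host', ['host.ip', 'asset.criticality']);
const exampleProcessor = buildEnrichProcessor('example-host-policy', 'host');
console.log(JSON.stringify({ examplePolicy, exampleProcessor }, null, 2));
```

An enrich policy only reflects the source index as of its last execution, which is why the description has a task re-executing it on a schedule.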

<img width="2112" alt="Screenshot 2024-10-02 at 13 42 11"
src="https://github.com/user-attachments/assets/f727607f-2e0a-4056-a51e-393fb2a97a95">

<details>
<summary> Expand for example host entity </summary>
```
{
    "@timestamp": "2024-10-01T12:10:46.000Z",
    "host": {
        "name": "host9",
        "hostname": [
            "host9"
        ],
        "domain": [
            "test.com"
        ],
        "ip": [
            "1.1.1.1",
            "1.1.1.2",
            "1.1.1.3"
        ],
        "risk": {
            "calculated_score": "70.0",
            "calculated_score_norm": "27.00200653076172",
            "calculated_level": "Low"
        },
        "id": [
            "1234567890abcdef"
        ],
        "type": [
            "server"
        ],
        "mac": [
            "AA:AA:AA:AA:AA:AB",
            "aa:aa:aa:aa:aa:aa",
            "AA:AA:AA:AA:AA:AC"
        ],
        "architecture": [
            "x86_64"
        ]
    },
    "asset": {
        "criticality": "low_impact"
    },
    "entity": {
        "name": "host9",
        "id": "kP/jiFHWSwWlO7W0+fGWrg==",
        "source": [
            "risk-score.risk-score-latest-default",
            ".asset-criticality.asset-criticality-default",
            ".ds-logs-testlogs1-default-2024.10.01-000001",
            ".ds-logs-testlogs2-default-2024.10.01-000001",
            ".ds-logs-testlogs3-default-2024.10.01-000001"
        ],
        "type": "host"
    }
}
```
</details>

### Field retention operators

First some terminology:

- **latest value** - the value produced by the transform which
represents the latest view of a given field in the transform lookback
period
- **enrich value** - the value added to the document by the enrich
policy; this represents the last value of a field outside of the
transform lookback window

We hope that this will one day be merged into the entity manager
framework so I've tried to abstract this as much as possible. A field
retention operator specifies how we should choose a value for a field
when looking at the latest value and the enrich value.

### Collect values
Collect unique values in an array, first taking from the latest values
and then filling with enrich values up to maxLength.

```
{
  operation: 'collect_values',
  field: 'host.ip',
  maxLength: 10
}
```
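A minimal sketch of these semantics in plain TypeScript, for illustration only. The function name is hypothetical, and it assumes that latest values are also capped at `maxLength` if there are more of them than fit.

```typescript
// Sketch of collect_values: unique values, latest first, then enrich
// values, stopping once maxLength values have been collected.
function collectValues<T>(latest: T[], enrich: T[], maxLength: number): T[] {
  const seen = new Set<T>(); // a Set keeps insertion order and drops duplicates
  for (const value of [...latest, ...enrich]) {
    if (seen.size >= maxLength) break;
    seen.add(value);
  }
  return Array.from(seen);
}

const ips = collectValues(['1.1.1.1', '1.1.1.2'], ['1.1.1.2', '1.1.1.3', '1.1.1.4'], 3);
console.log(ips); // [ '1.1.1.1', '1.1.1.2', '1.1.1.3' ]
```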

### Prefer newest value
Choose the latest value if present, otherwise choose the enrich value.

```
{
  operation: 'prefer_newest_value',
  field: 'asset.criticality'
}
```
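In plain TypeScript this rule could be sketched as below. The function name is hypothetical, and "present" is assumed to mean neither `null` nor `undefined`.

```typescript
// Sketch of prefer_newest_value: the latest value wins when present,
// otherwise fall back to the enrich value.
function preferNewestValue<T>(
  latest: T | null | undefined,
  enrich: T | null | undefined
): T | undefined {
  return latest ?? enrich ?? undefined;
}

console.log(preferNewestValue('high_impact', 'low_impact')); // high_impact
console.log(preferNewestValue(undefined, 'low_impact')); // low_impact
```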

### Prefer oldest value
Choose the enrich value if it is present, otherwise choose latest.
```
{
  operation: 'prefer_oldest_value',
  field: 'first_seen_timestamp'
}
```
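A comparable sketch for this operator, again with a hypothetical function name and "present" assumed to mean neither `null` nor `undefined`. The second timestamp in the example is made up for illustration.

```typescript
// Sketch of prefer_oldest_value: the enrich value (from outside the
// lookback window) wins when present, otherwise use the latest value.
function preferOldestValue<T>(
  latest: T | null | undefined,
  enrich: T | null | undefined
): T | undefined {
  return enrich ?? latest ?? undefined;
}

// An entity seen before keeps its original first-seen timestamp.
console.log(preferOldestValue('2024-10-01T12:10:46.000Z', '2024-09-01T00:00:00.000Z'));
// 2024-09-01T00:00:00.000Z

// A new entity has no enrich value yet, so the latest value is used.
console.log(preferOldestValue('2024-10-01T12:10:46.000Z', undefined));
// 2024-10-01T12:10:46.000Z
```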

## Test instructions

We currently require extra permissions for the kibana system user for
this to work, so we must run Elasticsearch from a custom branch.

### 1. Get Elasticsearch running from source
This prototype requires a custom branch of elasticsearch in order to
give the kibana system user more privileges.

#### Step 1 - Clone the prototype branch
The elasticsearch branch is at
https://github.com/elastic/elasticsearch/tree/entity-store-permissions.

Or you can use the [GitHub command line](https://cli.github.com/) to
check out my draft PR:
```
gh pr checkout 113942
```
#### Step 2 - Install Java
Install [homebrew](https://brew.sh/) if you do not have it.

```
brew install openjdk@21
sudo ln -sfn /opt/homebrew/opt/openjdk@21/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk-21.jdk
```

#### Step 3 - Run elasticsearch
This makes sure your data is preserved between runs of Elasticsearch, and
that you have platinum license features.

```
./gradlew run --data-dir /tmp/elasticsearch-repo --preserve-data -Drun.license_type=trial
```

### 2. Get Kibana running

#### Step 1 - Connect kibana to elasticsearch

Set this in your kibana config:

```
elasticsearch.username: elastic-admin
elasticsearch.password: elastic-password
```
Now start Kibana and it should connect to the Elasticsearch instance you
started above.

### 3. Initialise entity engine and send data!

- Initialise the host or user engine (or both)

```
curl -H 'Content-Type: application/json' \
      -X POST \
      -H 'kbn-xsrf: true' \
      -H 'elastic-api-version: 2023-10-31' \
      -d '{}' \
      http://elastic:changeme@localhost:5601/api/entity_store/engines/host/init
```

- use your favourite data generation tool to create data, maybe
https://github.com/elastic/security-documents-generator
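
If you would rather script the init call than use curl, the request can be sketched in TypeScript as below. The helper name is hypothetical, the `user` path is an assumption based on the host URL above, and authentication is deliberately left out, so add credentials as your setup requires.

```typescript
type EngineType = 'host' | 'user';

interface InitRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Builds the same request as the curl command above, minus credentials.
function buildInitRequest(kibanaUrl: string, engineType: EngineType): InitRequest {
  return {
    url: `${kibanaUrl}/api/entity_store/engines/${engineType}/init`,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'kbn-xsrf': 'true',
      'elastic-api-version': '2023-10-31',
    },
    body: '{}',
  };
}

const hostInit = buildInitRequest('http://localhost:5601', 'host');
console.log(hostInit.url); // http://localhost:5601/api/entity_store/engines/host/init
```

The resulting object can be passed to an HTTP client of your choice, for example `fetch(hostInit.url, hostInit)` once an `Authorization` header has been added.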

---------

Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit 5131215)
@kibanamachine kibanamachine merged commit 5229bca into elastic:8.x Oct 11, 2024
27 of 39 checks passed
@elasticmachine
Contributor

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

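| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 20.7MB | 20.7MB | -159.0B |

Unknown metric groups

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 538 | 539 | +1 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 623 | 624 | +1 |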

cc @hop-dev
