[Security Solution][DQD] Add new fields to results field map (Phase 1) #184037

Closed
Tracked by #184158
semd opened this issue May 22, 2024 · 2 comments

Labels: 8.15 candidate, Feature:Data Health Quality, release_note:skip, Team:Threat Hunting:Explore, Team:Threat Hunting

Comments

semd commented May 22, 2024

Summary

Improve how we store the incompatible and same-family field details in the results storage, so that we have the flexibility to render that information using React components instead of always relying on the stored static markdownComments text.

Proposal

Add two new fields to the mapping, incompatibleFields and sameFamilyFields. For example:

incompatibleFields: [
  {
    fieldName: 'agent.type',
    expectedValue: 'keyword',
    actualValue: 'text',
    description: 'Type of the agent. The agent type always stays the same and should be given by the agent used. In case of Filebeat the agent would always be Filebeat also if two Filebeat instances are run on the same machine.',
    reason: 'mapping',
  },
  {
    fieldName: 'event.category',
    expectedValue: 'api,authentication,configuration,database,driver,email,file,host,iam,intrusion_detection,library,malware,network,package,process,registry,session,threat,vulnerability,web',
    actualValue: 'behavior',
    description: 'This is one of four ECS Categorization Fields, and indicates the second level in the ECS category hierarchy. `event.category` represents the "big buckets" of ECS categories. For example, filtering on `event.category:process` yields all events relating to process activity. This field is closely related to `event.type`, which is used as a subcategory. This field is an array. This will allow proper categorization of some events that fall in multiple categories.',
    reason: 'value',
  },
]

sameFamilyFields: [
  {
    fieldName: 'agent.type',
    expectedValue: 'keyword',
    actualValue: 'constant_keyword',
    description: 'Type of the agent. The agent type always stays the same and should be given by the agent used. In case of Filebeat the agent would always be Filebeat also if two Filebeat instances are run on the same machine.',
  },
]
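
The shape of these entries could be captured with types along the following lines. This is only a minimal sketch: the names `SameFamilyFieldItem`, `IncompatibleFieldItem`, `IncompatibleReason`, and `isIncompatibleFieldItem` are hypothetical and not taken from the Kibana codebase; only the property names come from the example above.

```typescript
// Hypothetical types mirroring the example entries above.
type IncompatibleReason = 'mapping' | 'value';

interface SameFamilyFieldItem {
  fieldName: string;
  expectedValue: string;
  actualValue: string;
  description: string;
}

// Incompatible entries carry the same properties plus the reason.
interface IncompatibleFieldItem extends SameFamilyFieldItem {
  reason: IncompatibleReason;
}

// Runtime guard: true only when all string properties are present
// and `reason` is one of the two allowed values.
function isIncompatibleFieldItem(value: unknown): value is IncompatibleFieldItem {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const item = value as Record<string, unknown>;
  const stringKeys = ['fieldName', 'expectedValue', 'actualValue', 'description'];
  return (
    stringKeys.every((key) => typeof item[key] === 'string') &&
    (item['reason'] === 'mapping' || item['reason'] === 'value')
  );
}
```

A guard like this would let a React component narrow stored results before rendering them, rather than trusting the stored document shape.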

To achieve this, we have to add the following (untested) to the results data stream mapping (code):

  'incompatibleFields.fieldName': { type: 'keyword', required: true },
  'incompatibleFields.expectedValue': { type: 'keyword', required: true },
  'incompatibleFields.actualValue': { type: 'keyword', required: true },
  'incompatibleFields.description': { type: 'keyword', required: true },
  'incompatibleFields.reason': { type: 'keyword', required: true }, // mapping or value

And remove:

  unallowedMappingFields: { type: 'keyword', required: true, array: true },
  unallowedValueFields: { type: 'keyword', required: true, array: true },

We can keep the markdownComments information in case we want to use it for something else in the future.
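
If the two removed arrays were still needed somewhere (for example by older consumers), they could in principle be derived from the new entries. The sketch below assumes that `unallowedMappingFields` and `unallowedValueFields` hold field names, which this issue does not state explicitly (it only shows them as keyword arrays); the helper name `toLegacyFields` is hypothetical.

```typescript
// Minimal subset of an incompatible entry needed for the derivation.
interface IncompatibleFieldRef {
  fieldName: string;
  reason: 'mapping' | 'value';
}

interface LegacyFields {
  unallowedMappingFields: string[];
  unallowedValueFields: string[];
}

// Assumption: the legacy arrays contain field names, split by reason.
function toLegacyFields(items: IncompatibleFieldRef[]): LegacyFields {
  return {
    unallowedMappingFields: items
      .filter((item) => item.reason === 'mapping')
      .map((item) => item.fieldName),
    unallowedValueFields: items
      .filter((item) => item.reason === 'value')
      .map((item) => item.fieldName),
  };
}
```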

semd added the release_note:skip, Team:Threat Hunting, Team:Threat Hunting:Explore, and 8.15 candidate labels on May 22, 2024
kapral18 self-assigned this on May 22, 2024

angorayc commented May 28, 2024

Some thoughts on this task:

  1. Removing fields from the mapping might not be ideal, as it affects users' existing data.
  2. We could use the nested field type: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html

example:

PUT my-index-000001


PUT my-index-000001/_doc/1
{
  "group" : "fans",
  "user" : [ 
    {
      "first" : "John",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}

PUT my-index-000001/_doc/2
{
  "group" : "fans",
  "user" : [ 
    {
      "first" : "Alice",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}


GET my-index-000001/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "user.first": "Alice" }},
        { "match": { "user.last":  "Smith" }}
      ]
    }
  }
}



returns:


{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.43301374,
    "hits": [
      {
        "_index": "my-index-000001",
        "_id": "2",
        "_score": 0.43301374,
        "_source": {
          "group": "fans",
          "user": [
            {
              "first": "Alice",
              "last": "Smith"
            },
            {
              "first": "Alice",
              "last": "White"
            }
          ]
        }
      },
      {
        "_index": "my-index-000001",
        "_id": "1",
        "_score": 0.36464313,
        "_source": {
          "group": "fans",
          "user": [
            {
              "first": "John",
              "last": "Smith"
            },
            {
              "first": "Alice",
              "last": "White"
            }
          ]
        }
      }
    ]
  }
}


==================


PUT test-index-2
{
  "mappings": {
    "properties": {
      "user": {
        "type": "nested" 
      }
    }
  }
}


PUT test-index-2/_doc/1
{
  "group" : "fans",
  "user" : [ 
    {
      "first" : "John",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}

PUT test-index-2/_doc/2
{
  "group" : "fans",
  "user" : [ 
    {
      "first" : "Alice",
      "last" :  "Smith"
    },
    {
      "first" : "Alice",
      "last" :  "White"
    }
  ]
}

GET test-index-2/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.first": "Alice" }},
            { "match": { "user.last":  "Smith" }} 
          ]
        }
      }
    }
  }
}


returns:


{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.0498221,
    "hits": [
      {
        "_index": "test-index-2",
        "_id": "2",
        "_score": 1.0498221,
        "_source": {
          "group": "fans",
          "user": [
            {
              "first": "Alice",
              "last": "Smith"
            },
            {
              "first": "Alice",
              "last": "White"
            }
          ]
        }
      }
    ]
  }
}
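
The difference between the two results comes from how arrays of objects are indexed. Without a nested mapping, the inner objects are flattened into multi-valued fields, so the link between `first` and `last` within one object is lost and document 1 matches even though no single user is "Alice Smith". With a nested mapping, each object is matched on its own. The sketch below simulates only these matching semantics in plain TypeScript, using exact string comparison instead of analyzed text matching; it is not how Elasticsearch is implemented, and the function names are hypothetical.

```typescript
interface User {
  first: string;
  last: string;
}

// Flattened behaviour: conceptually the document is stored as
// { "user.first": [...], "user.last": [...] }, so each condition is
// checked against the whole array independently.
function matchesFlattened(users: User[], first: string, last: string): boolean {
  const anyFirst = users.some((user) => user.first === first);
  const anyLast = users.some((user) => user.last === last);
  return anyFirst && anyLast;
}

// Nested behaviour: both conditions must hold within the same object.
function matchesNested(users: User[], first: string, last: string): boolean {
  return users.some((user) => user.first === first && user.last === last);
}

// The two documents from the example above.
const doc1: User[] = [
  { first: 'John', last: 'Smith' },
  { first: 'Alice', last: 'White' },
];
const doc2: User[] = [
  { first: 'Alice', last: 'Smith' },
  { first: 'Alice', last: 'White' },
];
```

With these definitions, `matchesFlattened` accepts both documents for "Alice" and "Smith", while `matchesNested` accepts only `doc2`, mirroring the two hit counts above.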

kapral18 changed the title from "[Security Solution] Add result details to DQD storage" to "[Security Solution] Add result details to DQD storage (Phase 1)" on May 29, 2024
kapral18 commented

Per our discussion with @angorayc, I created the sibling ticket #184427.

We decided to implement this as two tickets with two PRs into a branch in my fork, which will then be merged into a single combined PR against main in elastic/kibana for ease of review.

cc: @semd

kapral18 changed the title from "[Security Solution] Add result details to DQD storage (Phase 1)" to "[Security Solution][DQD] Add result details to DQD storage (Phase 1)" on May 30, 2024
kapral18 added a commit to kapral18/kibana that referenced this issue Jun 3, 2024
…same family fields

Addresses elastic#184037

- Add `incompatibleFieldItems` and `sameFamilyFieldItems` as nested fields with required attributes.
kapral18 added a commit that referenced this issue Jun 4, 2024
#184657)

…same family fields

Addresses #184037

- Add `incompatibleFieldItems` and `sameFamilyFieldItems` as nested
fields with required attributes.

Steps to verify the change:

1. Boot up the PR branch with local Elasticsearch + Kibana
2. Open Kibana DevTools
3. Call `GET .kibana-data-quality-dashboard-results-default/_mapping`
4. Verify existence of properly nested `incompatibleFieldItems` and
`sameFamilyFieldItems` new fields
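
Step 4 could also be checked programmatically. The sketch below assumes the usual get-mapping response shape (index name, then `mappings`, then `properties`) and checks every index in the response, since a data stream returns one entry per backing index. The helper name `isNestedField` and the type names are hypothetical.

```typescript
interface MappingProperty {
  type?: string;
  properties?: Record<string, MappingProperty>;
}

type GetMappingResponse = Record<
  string,
  { mappings: { properties?: Record<string, MappingProperty> } }
>;

// True when the response has at least one index and the given top-level
// field is mapped as `nested` in every index of the response.
function isNestedField(response: GetMappingResponse, fieldName: string): boolean {
  const indexNames = Object.keys(response);
  return (
    indexNames.length > 0 &&
    indexNames.every(
      (indexName) =>
        response[indexName].mappings.properties?.[fieldName]?.type === 'nested'
    )
  );
}
```

Calling `isNestedField(response, 'incompatibleFieldItems')` and `isNestedField(response, 'sameFamilyFieldItems')` on the parsed response from step 3 should then both return `true` if the change is in place.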


![image](https://github.com/elastic/kibana/assets/1625373/c92a37d8-3b03-4e70-a881-975355a0c834)

![image](https://github.com/elastic/kibana/assets/1625373/ce23f8d2-0e4a-45bd-b005-3abd975fc47b)

Co-authored-by: Kibana Machine
kapral18 closed this as completed on Jun 4, 2024
kapral18 changed the title from "[Security Solution][DQD] Add result details to DQD storage (Phase 1)" to "[Security Solution][DQD] Add new fields to results field map (Phase 1)" on Jun 4, 2024
kapral18 added the Feature:Data Health Quality label on Oct 16, 2024