
Update old Data Needed docs to the new structure #10

Open · yugoslavskiy opened this issue Nov 23, 2020 · 0 comments · May be fixed by #11

Description

We are migrating to the new Data Needed structure, which will enable tight integration with other sub-projects (e.g. RE&CT, the future atc-detection, etc.), make the project more usable for users, and make it more helpful for 3rd party projects.

Here is the new template:

title: 'Human-readable title, ideally from the official documentation, without the EventID'  # ATC will automatically add the EventID (if present) to the beginning of the title
description: Human-readable description, ideally from the official documentation.
event_id: 4688                                # [optional] ATC will automatically add it to the beginning of the title if present
attack_data:  # list of sources (key) and components (value); options: https://github.com/mitre-attack/attack-datasources/blob/main/attack_data_sources.yaml
  - process: process creation
platform: windows                             # linux | unix | macos | network | etc
provider: Microsoft-Windows-Security-Auditing # Microsoft-Windows-Eventlog | BIND | <exact service/daemon name> | None
channel: Security                             # [optional] System | Microsoft-Windows-Sysmon/Operational | queries_log | None
atc_id: DN0001                                # sequential number (for now)
loggingpolicy:                                # [optional] ATC Logging Policy ID
  - LP0001: Success                           # [optional] ATC Logging Policy ID with audit Success or Failure as a value
  - LP0002: [ "Success", "Failure" ]          # [optional] ATC Logging Policy ID with both audit Success and Failure
  - LP0003                                    # [optional] could be just an ATC Logging Policy ID, if there are no Success/Failure options or they are unknown
contributors:
  - 'your name/nickname/twitter'
references:
  - text: 'MicrosoftDocs: 4688(S): A new process has been created'
    link: 'https://github.com/MicrosoftDocs/windows-itpro-docs/blob/master/windows/security/threat-protection/auditing/event-4688.md'
fields:

  - original_name: EventID                # original value from the data source
    description: The event identifier.    # a good source of the description is the Elastic *beats "fields.yml" file; download required: https://www.elastic.co/downloads/beats
    sample_value: '4688'
    elastic_ecs_name: winlog.event_id     # Elastic Common Schema name; source: https://github.com/elastic/ecs/blob/master/generated/csv/fields.csv ; if not present, check the Elastic *beats "fields.yml" file
    splunk_cim_name: EventCode            # Splunk CIM name; source: https://docs.splunk.com/Documentation/CIM/4.18.0/User/CIMfields
    otr_ossem_name: event_id              # OTR OSSEM name; source: https://github.com/OTRF/OSSEM

sample: |  # raw log sample here
  - <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  [...]
  </Event>

Let me explain the update in detail and highlight the reason behind every field.

title

Previously, we were putting the ATC ID, Event ID, platform (e.g. windows), and provider (e.g. sysmon) into the title manually.
Here is an example of a title, from DN_0003_1_windows_sysmon_process_creation:

title: DN_0003_1_windows_sysmon_process_creation

Now, in the YAML files, we will just put the normal human-readable name and add everything else automatically (at later steps, i.e. during markdown/confluence export).
Also, we will no longer use the title for any sort of automatic processing, so we can get rid of the requirement to have filename == title.
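For illustration, here is a minimal before/after sketch; the new-style values and the rendered title in the comment are my assumptions about the export step, not a confirmed format:

# old structure: everything encoded in the title
title: DN_0003_1_windows_sysmon_process_creation

# new structure: human-readable title only, IDs and metadata live in dedicated fields
title: 'Process creation'
event_id: 1                         # Sysmon Event ID 1
atc_id: DN0003                      # assumed new-style ID for the old DN_0003_1
platform: windows
provider: Microsoft-Windows-Sysmon
# on markdown/confluence export, ATC would render something like '1: Process creation'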

description

No changes.

event_id

New field. We were relying on the EventID in the title before, which was not the best way to do it.

category

Removed. It was used as a way to highlight the importance of endpoint logs.
It was never actually used (for 2 years), and if we ever need it, we can easily identify which logs come from endpoints by the other fields.

attack_data_source

New field. Categorization from the recently released MITRE ATT&CK Data Sources project.
It replaces our type field, which never evolved in 2 years.
We will rely on it in the directory structure.
It will be used for integrations with our sub-projects and 3rd party projects.

attack_data_component

New field. The same reason as for attack_data_source.
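For example, an event that carries information about more than one data component could list several source/component pairs; the second pair below is hypothetical, added for illustration only (names come from the MITRE ATT&CK data sources list):

attack_data:                      # source (key): component (value)
  - process: process creation
  - process: process termination  # hypothetical second component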

platform

No changes.

provider

No changes.

type

Removed. It was used as a sort of categorization.
Now ATC relies on attack_data_source instead.

channel

No changes.

atc_id

New field. Moved out of the title field of the previous structure.

loggingpolicy

An old field with one improvement: now we can add audit Success or Failure as a value:

loggingpolicy:
  - LP0001: Success

It is optional, so if there is no audit configuration (in the case of non-Windows logs or other kinds of data), just leave it as a bare ID:

loggingpolicy:
  - LP0001
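For completeness, both audit values can also be listed at once, as shown in the template above:

loggingpolicy:
  - LP0002: [ "Success", "Failure" ]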

contributors

The renamed author field. Now it is a list (with many contributors, hopefully). It is just better, more accurate naming: most of the analytics are collected from somewhere, so it is more of a contribution than an invention with authorship.

references

Previously it was just a link, and it was ugly.
Now it will be a meaningful, human-readable reference name with the actual link behind it in all exported documents (markdown/confluence):

references:
  - text: 'MicrosoftDocs: 4688(S): A new process has been created'
    link: 'https://github.com/MicrosoftDocs/windows-itpro-docs/blob/master/windows/security/threat-protection/auditing/event-4688.md'

Thanks to the folks from OSSEM for the idea. I am not sure it was originally theirs, but that is where I first saw it.

fields

Previously it was a list of fields used by the Data Needed calculation function.
It was a big list, useless for users, that at the same time included redundant field names which ATC required for proper DN calculation:

fields:
  - EventID
  - Hostname            # redundant
  - SubjectUserSid
  - SubjectUserName
  - SubjectDomainName
  - SubjectLogonId
  - NewProcessId
  - NewProcessName
  - TokenElevationType
  - ProcessId
  - ProcessPid
  - TargetUserSid
  - TargetUserName
  - TargetDomainName
  - TargetLogonId
  - ParentProcessName
  - MandatoryLabel
  - ProcessName         # redundant
  - Image               # redundant

So it was just a mess.

Now it will be a very useful dictionary with the original field name, a description of the field, a sample value, and the field's name in different Data Models:

fields:

  - original_name: EventID                # original value from the data source
    description: The event identifier.    # a good source of the description is the Elastic *beats "fields.yml" file; download required: https://www.elastic.co/downloads/beats
    sample_value: '4688'
    elastic_ecs_name: winlog.event_id     # Elastic Common Schema name; source: https://github.com/elastic/ecs/blob/master/generated/csv/fields.csv ; if not present, check the Elastic *beats "fields.yml" file
    splunk_cim_name: EventCode            # Splunk CIM name; source: https://docs.splunk.com/Documentation/CIM/4.18.0/User/CIMfields
    otr_ossem_name: event_id              # OTR OSSEM name; source: https://github.com/OTRF/OSSEM

It will help in cases when a user needs to:

  • know the original field name
  • find the most important fields (via mappings to Detection Rules and Response Actions, which we will add)
  • know the meaning of each and every field
  • choose a Data Model (everything is in one place) for corporate needs
  • identify how a field in one Data Model is named in another (when working in a new environment, or helping other organizations that have a different Data Model implemented)
  • migrate from one data model to another (backlog for 2021)
  • migrate to a custom (internal) data model (backlog for 2021)
  • automatically generate type mappings for an ES index (see the sketch after this list)
  • apply custom modifications to the fields (generate ES Pipelines)
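As an illustration of the type-mapping idea, here is a minimal sketch of what could be generated from the fields entry above, written in YAML for readability (Elasticsearch itself expects JSON); the keyword type is an assumption that would in practice be looked up in the ECS fields.csv or a *beats "fields.yml" file:

# hypothetical generated ES index mapping fragment for winlog.event_id
mappings:
  properties:
    winlog:
      properties:
        event_id:
          type: keyword   # assumed type, to be resolved from ECS / *beats definitions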

Next year we will make this data truly "actionable".
There is a PoC for pipeline generation based on Google Santa logs, but it makes sense to add this feature later in 2021; for now we are focusing on the migration.

Backlog/Tracking

Here is a PR with the backlog.
If you are reading this, you are welcome to join the development (:
