From 0a5b47d29947a3e712b50bb591a67f7d14b38254 Mon Sep 17 00:00:00 2001
From: Sourin Paul <82123779+SourinPaul@users.noreply.github.com>
Date: Fri, 7 Jul 2023 08:59:44 -0700
Subject: [PATCH] [RFC] Asset Integrations & Entity Store RFC - Stage 0 (#2215)

---------

Co-authored-by: Eric Beahan
---
 rfcs/text/0041-asset-integration.md | 249 ++++++++++++++++++++++++++++
 1 file changed, 249 insertions(+)
 create mode 100644 rfcs/text/0041-asset-integration.md

diff --git a/rfcs/text/0041-asset-integration.md b/rfcs/text/0041-asset-integration.md
new file mode 100644
index 0000000000..7fe8ee9b31
--- /dev/null
+++ b/rfcs/text/0041-asset-integration.md
@@ -0,0 +1,249 @@
+# 0041: Asset Integration
+
+
+- Stage: **0 (strawperson)**
+- Date: **2023-07-07**
+
+
+
+
+
+This proposal extends the existing ECS field set to store inventory metadata for hosts and users from external application repositories. Using ECS to store such fields will improve metadata querying and retrieval across various use cases.
+
+Terminology:
+The `Entity Analytics` initiative within Security refers to hosts and users as `entities`. Other generic security and observability use cases may refer to hosts and users as `assets`. Certain directory services or asset management applications use the term 'device' when referring to a host. In this RFC, I have simplified this terminology to `users` and `hosts`, which stand in for all the neighboring terms.
+
+This proposal includes the following:
+* Additional fields in the `user` and `os` objects.
+* A new field set called `asset`.
+
+
+This proposal will also facilitate storing host and user inventory within the security solution (the entity store).
+
+
+
+
+
+
+## Fields
+
+
+
+### Proposed New Fields for User object
+
+Field | Type | Example | Description
+--- | --- | --- | ---
+user.profile.id | keyword | 1234 | User ID from the identity datasource.
+user.profile.type | keyword | Employee | Type of user account.
+user.profile.status | keyword | On board | Status of the user account.
+user.profile.first_name | keyword | First | First name of the user.
+user.profile.last_name | keyword | Last | Last name of the user.
+user.profile.other_identities | keyword, text | first.last@elk.elastic.co | Array of additional user identities (usually email addresses).
+user.profile.manager | keyword | John Doe | Assigned manager for the user account.
+user.profile.employee_type | keyword | Regular | Further classification type for the user account.
+user.profile.job_family | keyword | 65-Sales | Job family associated with the user account.
+user.profile.job_family_group | keyword | GTM | Job family group associated with the user account.
+user.profile.management_level | keyword | Individual Contributor | Whether the user account is identified as a manager or an individual contributor.
+user.profile.job_title | keyword | Field Sales | Job title assigned to the user account.
+user.profile.department | keyword | x256 | Department name associated with the user account.
+user.profile.organization | keyword | Elasticsearch Inc. | Organization name associated with the account.
+user.profile.location | keyword | US - Washington - Distributed | Assigned location for the user account.
+user.profile.mobile_phone | keyword | 222-222-2222 | Mobile phone number associated with the user account.
+user.profile.primaryPhone | keyword | 222-222-2222 | Primary phone number associated with the user account.
+user.profile.secondEmail | keyword | first.l@elastic.co | Additional email addresses associated with the user account.
+user.profile.sup_org_id | keyword | SUP-ORG-75 | Primary organization ID for the user account.
+user.profile.supervisory_Org | keyword | Field Sales | Primary organization name for the user account.
+user.profile.assigned_mdm_id | keyword | 2950 | The primary host identifier (usually the `asset.id` value) assigned to the user. This field acts as a correlation identifier for the host event document.
+user.account.create_date | date | June 5, 2023 @ 18:25:57.000 | Date the account was created.
+user.account.activated_date | date | June 5, 2023 @ 18:25:57.000 | Date the account was activated.
+user.account.change_date | date | June 5, 2023 @ 18:25:57.000 | Date the user account record was last updated at the source.
+user.account.status.recovery | boolean | true/false | A flag indicating whether the account is in recovery.
+user.account.status.locked_out | boolean | true/false | A flag indicating whether the account is currently locked out.
+user.account.status.suspended | boolean | true/false | A flag indicating whether the account has been suspended.
+user.account.isAdmin | boolean | true/false | A flag indicating whether the account is an admin account.
+user.account.isDelegatedAdmin | boolean | true/false | A flag indicating whether the account has delegated admin rights.
+user.account.isPriviledged | boolean | true/false | A flag indicating whether the account is a privileged account.
+user.account.status.password_expired | boolean | true/false | A flag indicating whether the account password has expired.
+user.account.status.deprovisioned | boolean | true/false | A flag indicating whether the account has been deprovisioned.
+user.account.password_change_date | date | June 5, 2023 @ 18:25:57.000 | Last date/time when the account password was updated.
+
+### Proposed New Fields for Asset object
+
+Field | Type | Generic Example | User Entity Example | Host Entity Example | Description
+--- | --- | --- | --- | --- | ---
+asset.category | keyword | - | Null | hardware | A further classification of the asset type beyond event.category. For example, for host assets {hardware, virtual, container, node}. For user assets {NULL?}.
+asset.type | keyword | - | Null | workstation | A sub-classification of the asset. For host assets {workstation, S3, Compute}. For user assets {NULL?}.
+asset.id | keyword | - | 00uhs72c27s6PiK7x1t7 | 2950 | A unique ID for the asset. For inventory integrations, it is the ID generated by the inventory data source.
+asset.name | keyword | - | Sourin Paul | Sourin Paul Macbook Pro | A common name for the asset.
+asset.vendor | keyword | - | - | Apple | Used primarily for 'Host' entities, the vendor name or brand associated with the asset.
+asset.product | keyword | - | - | MacBook Pro | Used primarily for 'Host' entities, the product name associated with the asset.
+asset.model | keyword | - | - | TBD | Used primarily for 'Host' entities, the model name or number associated with this asset.
+asset.version | keyword | - | - | TBD | Used primarily for 'Host' entities, the version or year associated with the asset.
+asset.owner | keyword | - | - | sourin.paul@elastic.co | The primary user entity identifier (usually an email address) who owns the 'Host' asset.
+asset.priority | keyword | Priority 1 | - | - | A priority classification for the asset obtained from outside the solution, such as from some external CMDB or Directory service.
+asset.criticality | keyword | Critical | - | - | A criticality classification obtained from outside the solution, such as from some external CMDB or Directory service.
+asset.business_unit | keyword | Analyst Experience | - | - | Business Unit associated with the asset (user or host).
+asset.costCenter | keyword | Security - Protections | - | - | Cost Center associated with the asset (user or host).
+asset.cost_center_hierarchy | keyword | Engineering | - | - | Additional cost center information associated with the asset (user or host).
+asset.status | keyword | ACTIVE | - | - | Current status of the asset in the inventory datasource.
+asset.last_status_change_date | date | June 5, 2023 @ 18:25:57.000 | - | - | The most recent date/time when the asset.status was updated.
+asset.create_date | date | June 5, 2023 @ 18:25:57.001 | - | - | For users, it's the hire date. For other assets, it's the in-service date.
+asset.end_date | date | June 5, 2023 @ 18:25:57.002 | - | - | For users, it's the termination date; for other assets, it's the out-of-service date.
+asset.first_seen | date | June 5, 2023 @ 18:25:57.003 | - | - | The first date/time the directory service or the security solution observed this asset.
+asset.last_seen | date | June 5, 2023 @ 18:25:57.004 | - | - | The most recent date/time the directory service or the security solution observed this asset.
+asset.last_updated | date | June 5, 2023 @ 18:25:57.005 | - | - | The most recent date/time this asset was updated in directory services.
+asset.serial_number | keyword | C02FG1G1MD6T | - | - | Serial number of the asset.
+asset.tags | keyword | watch, mdmaccess | - | - | Tags assigned at the MDM.
+asset.assigned_users | keyword | user1@email.com, user2@email.com | - | - | List of user IDs (usually email addresses) assigned to the asset. The value from the `asset.owner` field should always be included.
+asset.assigned_users_are_admin | boolean | TRUE | - | - | Flag to identify whether the assigned users have admin privileges.
+asset.is_managed | boolean | TRUE | - | - | Whether the asset is managed by the organization.
+asset.last_enrolled_date | date | June 5, 2023 @ 18:25:57.005 | - | - | The most recent date/time the asset checked in with MDM.
+asset.data_classification | keyword | restricted | - | - | Data classification tier for the asset.
+asset.installed_extensions | keyword | Nested objects | - | - | List of installed extensions along with their metadata.
+asset.installed_applications | keyword | Nested objects | - | - | List of installed applications along with their metadata.
+
+#### Nesting of existing risk.* fields under asset object
+* ECS already has a set of risk.* fields that can be further nested under the asset.* object. See the [Risk RFC](https://github.com/elastic/ecs/blob/main/rfcs/text/0031-risk-fields.md).
+
+
+
+### Proposed New Fields for os.* object
+Field | Type | Example | Description
+--- | --- | --- | ---
+os.build | keyword | 22F66 | Host OS build information.
+
+
+
+
+
+## Usage
+
+
+
+* As part of Entity Analytics, we are ingesting metadata about users and hosts from various external vendor applications. We are storing all ingested metadata in Elasticsearch. After we map these fields to ECS, we will enrich these ingested events for risk-scoring scenarios (e.g., context enrichments) and advanced analytics (UBA) detection use cases.
+
+* This schema will persist `Observed` (queried) entities from the ingested security log dataset in an entity store. This entity store can be further extended to meet broader Asset Management needs.
+
+* Additional enrichment use cases for existing prebuilt detection rules will leverage these ECS fields.
+
+
+## Source data
+
+
+
+There are many sources of asset inventory repositories. In the mid-term, we are planning to ingest data from the following application providers:
+
+### User (Identity) repository sources:
+* Azure Active Directory
+* Active Directory DS
+* Okta
+* Workday
+* GSuite
+* GitHub
+
+### Host repository sources:
+* Azure Active Directory
+* Jamf
+* Active Directory DS
+* MS Intune
+* ServiceNow Asset CMDB
+
+
+
+
+
+## Scope of impact
+
+
+
+* Ingestion mechanisms: Entity Analytics fleet integrations are the primary ingestion mechanism for this dataset.
+
+* Usage mechanism: Elastic Security solution (Entity Analytics & Threat Hunting workflows) will be the primary user of the proposed ECS fields and values.
+
+
+
+## Concerns
+
+
+
+* We have a couple of fleet integrations under development. We want them to use the proposed ECS fields before they are released.
+* The schema and field sets defined here focus on asset inventory data sources. Additional fields may need to be appended (ideally within this RFC lifecycle) to support the entity store needs.
+* Due diligence is needed to avoid the proliferation of field sets and validate business requirements.
+* In stage 1, @jasonrhodes identified fields from o11y use cases and a potential conflict: https://github.com/elastic/ecs/pull/2215#pullrequestreview-1498781860
+
+
+
+
+
+## People
+
+The following are the people who consulted on the contents of this RFC.
+
+* @sourinpaul | author
+* @andrewkroh | subject matter expert
+* @jamiehynds | subject matter expert
+* @lauravoicu | subject matter expert
+* @MikePaquette | subject matter expert
+* @sourinpaul | sponsor
+* ?
+
+
+
+
+## References
+
+
+
+### RFC Pull Requests
+
+
+
+* Stage 0: https://github.com/elastic/ecs/pull/2215
+
+
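
As a reading aid for the field tables in this patch, the following Python sketch shows how a host document and a user document might look once the proposed dotted field names are expanded into nested objects. It is only an illustration under stated assumptions: the helper names `nest_fields` and `is_assigned_host` are hypothetical and not part of the RFC or of any Elastic tooling, and the example values are copied from the tables rather than taken from a real integration.

```python
from typing import Any


def nest_fields(flat: dict[str, Any]) -> dict[str, Any]:
    """Expand dotted field names (e.g. "asset.id") into nested objects."""
    nested: dict[str, Any] = {}
    for dotted_name, value in flat.items():
        *parents, leaf = dotted_name.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Illustrative host document; values come from the "Host Entity Example"
# column and the os.build example. Not output from a real integration.
host_doc = nest_fields({
    "asset.category": "hardware",
    "asset.type": "workstation",
    "asset.id": "2950",
    "asset.vendor": "Apple",
    "asset.product": "MacBook Pro",
    "asset.owner": "sourin.paul@elastic.co",
    # The RFC says asset.owner should always be listed in assigned_users.
    "asset.assigned_users": ["sourin.paul@elastic.co"],
    "asset.is_managed": True,
    "os.build": "22F66",
})

# Illustrative user document; assigned_mdm_id points at the host's asset.id.
user_doc = nest_fields({
    "asset.id": "00uhs72c27s6PiK7x1t7",
    "user.profile.job_title": "Field Sales",
    "user.profile.assigned_mdm_id": "2950",
    "user.account.status.locked_out": False,
})


def is_assigned_host(user: dict[str, Any], host: dict[str, Any]) -> bool:
    """Hypothetical correlation check: assigned_mdm_id equals the host's asset.id."""
    return user["user"]["profile"]["assigned_mdm_id"] == host["asset"]["id"]
```

Calling `is_assigned_host(user_doc, host_doc)` on these two documents compares `user.profile.assigned_mdm_id` with `asset.id`, which is the correlation the RFC describes for linking a user to the primary host event document.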