diff --git a/docs/img/correlation.png b/docs/img/correlation.png new file mode 100644 index 000000000..2d01ddc7d Binary files /dev/null and b/docs/img/correlation.png differ diff --git a/docs/schema/observability/README.md b/docs/schema/observability/README.md index 5f101d853..3089a6381 100644 --- a/docs/schema/observability/README.md +++ b/docs/schema/observability/README.md @@ -22,6 +22,63 @@ In many occasions, correlation between the logs, traces and metrics is mandatory For such correlation to be possible the industry has formulated several protocols ([OTEL](https://github.com/open-telemetry), [ECS](https://github.com/elastic/ecs), [OpenMetrics](https://github.com/OpenObservability/OpenMetrics)) for communicating these signals - the Observability schemas. +## Data Correlation + +To correlate information across different signals (stored in different indices), we introduced the notion of correlation into the schema. +This information is represented explicitly in both the declarative schema file and the physical mapping file. + +It lets this knowledge be projected and allows an analytic engine to produce join queries that take advantage of these relationships. +The correlation metadata is exported in the following way: + +### Observability Correlation Example: + +### Schema related +In JSON Schema, there is no built-in way to represent relationships directly between multiple schemas, as you would find in a relational database. However, you can establish relationships indirectly by using a combination of `$id`, `$ref`, and consistent property naming across your schemas. +For example, the [`logs.schema`](../../../src/main/resources/schema/observability/logs/logs.schema) file contains the following `$ref` references for the `traceId` and `spanId` fields, which belong to `traces.schema`. + +```json5 + ... 
+ "traceId": { + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/traceId" + }, + "spanId": { + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/spanId" + }, + ... +``` + +We can observe that the `traceId` field is defined by referencing the [Span](../../../src/main/resources/schema/observability/traces/traceGroups.schema) schema, specifically the `#/properties/traceId` location. + +### Mapping related +Each mapping template contains the foreign schemas that are referenced in that specific mapping file. For example, the [`logs.mapping`](../../../src/main/resources/schema/observability/logs/logs.schema) file contains the following correlation object in the mapping `_meta` section: + +```json5 + "_meta": { + "description": "Simple Schema For Observability", + "catalog": "observability", + "type": "logs", + "correlations": [ + { + "field": "spanId", + "foreign-schema": "traces", + "foreign-field": "spanId" + }, + { + "field": "traceId", + "foreign-schema": "traces", + "foreign-field": "traceId" + } + ] + } + +``` + +Each entry in `correlations` contains the foreign-key field name (`spanId`), the referenced schema (`traces`), and the source field name in that schema (`spanId`). + +This information can be used to generate the correct join queries on a contextual basis. + +![](../../img/correlation.png) + --- ## Schema Aware Components diff --git a/docs/schema/observability/logs/README.md b/docs/schema/observability/logs/README.md index 13db0171e..57f33f583 100644 --- a/docs/schema/observability/logs/README.md +++ b/docs/schema/observability/logs/README.md @@ -143,29 +143,29 @@ This field is expected to appear in any future integration or Observability reso _Inspired by [ECS - http](https://www.elastic.co/guide/en/ecs/current/ecs-http.html), [OTEL - http](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md)_ - - `method` - HTTP request method. 
`GET; POST; HEAD` (OTEL driven) Correspond with `request.method` (ECS driven) - - `status_code` - [Http response code](https://tools.ietf.org/html/rfc7231#section-6) (OTEL driven) Correspond with `response.status_code` (ECS driven) - - `flavor` - Kind of HTTP protocol used. (OTEL driven) Correspond with `version` (ECS driven) - - `user_agent` - Value of the HTTP User-Agent header sent by the client. (OTEL driven) - - - `request.id` - A unique identifier for each HTTP request to correlate logs between clients and servers in transactions. (ECS driven) - - `request_content_length` - The size of the request payload body in bytes. (OTEL driven) Correspond with `request.bytes` (ECS driven) - - `request.body.content` - The full HTTP request body. (ECS driven) - - `request.referrer` - Referrer for this HTTP request. (ECS driven) - - `request.header` - HTTP request headers key/value object (OTEL driven) - - `request.mime_type` - Mime type of the body of the response. (ECS driven) - - - `response_content_length` - The size of the response payload body in bytes. (OTEL driven) Correspond with `response.bytes` (ECS driven) - - `response.body.content` - The full HTTP response body. (ECS driven) - - `response.header` - HTTP response headers key/value object (OTEL driven) - - - `url` - Full HTTP request URL in the form scheme://host[:port]/path?query[#fragment] (OTEL driven) - - `resend_count` - The ordinal number of request resending attempt (OTEL driven) - - - `scheme` - The URI scheme identifying the used protocol. (OTEL driven) - - `target` - The full request target as passed in a HTTP request line or equivalent. (OTEL driven) - - `route` - The matched route (path template in the format used by the respective server framework) (OTEL driven) - - `client_ip` - The IP address of the original client behind all proxies (OTEL driven) `client.ip` - IP address of the client (IPv4 or IPv6). (ECS driven) +- `method` - HTTP request method. 
`GET; POST; HEAD` (OTEL driven) Corresponds to `request.method` (ECS driven) +- `status_code` - [HTTP response code](https://tools.ietf.org/html/rfc7231#section-6) (OTEL driven) Corresponds to `response.status_code` (ECS driven) +- `flavor` - Kind of HTTP protocol used. (OTEL driven) Corresponds to `version` (ECS driven) +- `user_agent` - Value of the HTTP User-Agent header sent by the client. (OTEL driven) + +- `request.id` - A unique identifier for each HTTP request to correlate logs between clients and servers in transactions. (ECS driven) +- `request_content_length` - The size of the request payload body in bytes. (OTEL driven) Corresponds to `request.bytes` (ECS driven) +- `request.body.content` - The full HTTP request body. (ECS driven) +- `request.referrer` - Referrer for this HTTP request. (ECS driven) +- `request.header` - HTTP request headers key/value object (OTEL driven) +- `request.mime_type` - Mime type of the body of the request. (ECS driven) + +- `response_content_length` - The size of the response payload body in bytes. (OTEL driven) Corresponds to `response.bytes` (ECS driven) +- `response.body.content` - The full HTTP response body. (ECS driven) +- `response.header` - HTTP response headers key/value object (OTEL driven) + +- `url` - Full HTTP request URL in the form scheme://host[:port]/path?query[#fragment] (OTEL driven) +- `resend_count` - The ordinal number of the request resend attempt (OTEL driven) + +- `scheme` - The URI scheme identifying the used protocol. (OTEL driven) +- `target` - The full request target as passed in a HTTP request line or equivalent. (OTEL driven) +- `route` - The matched route (path template in the format used by the respective server framework) (OTEL driven) +- `client_ip` - The IP address of the original client behind all proxies (OTEL driven) `client.ip` - IP address of the client (IPv4 or IPv6). 
(ECS driven) #### Communication Includes client / server part of the communication diff --git a/docs/schema/observability/logs/sample/aws/aws_elb-log.json b/docs/schema/observability/logs/sample/aws/aws_elb-log.json new file mode 100644 index 000000000..7171cc7e6 --- /dev/null +++ b/docs/schema/observability/logs/sample/aws/aws_elb-log.json @@ -0,0 +1,66 @@ +{ + "@timestamp": "2018-07-02T22:23:00.186Z", + "aws": { + "elb": { + "backend": { + "http": { + "response": { + "status_code": 200 + } + }, + "ip": "10.0.0.1", + "port": "80" + }, + "backend_processing_time": { + "sec": 0.001 + }, + "matched_rule_priority": "0", + "name": "app/my-loadbalancer/50dc6c495c0c9188", + "protocol": "http", + "request_processing_time": { + "sec": 0 + }, + "response_processing_time": { + "sec": 0 + }, + "target_group": { + "arn": "arn:aws:elasticloadbalancing:us-east-2:123456789012:targetgroup/my-targets/73e2d6bc24d8a067" + }, + "target_port": [ + "10.0.0.1:80" + ], + "target_status_code": [ + "200" + ], + "traceId": "Root=1-58337262-36d228ad5d99923122bbe354", + "type": "http" + } + }, + "cloud": { + "provider": "aws" + }, + "http": { + "request": { + "body": { + "bytes": 34 + }, + "method": "GET" + }, + "response": { + "body": { + "bytes": 366 + }, + "status_code": 200 + }, + "url": "http://www.example.com:80/", + "schema": "http" + }, + "communication": { + "source": { + "address": "192.168.131.39", + "ip": "192.168.131.39", + "port": 2817 + } + }, + "traceId": "Root=1-58337262-36d228ad5d99923122bbe354" +} \ No newline at end of file diff --git a/docs/schema/observability/logs/sample/cloud.json b/docs/schema/observability/logs/sample/cloud.json new file mode 100644 index 000000000..164e95441 --- /dev/null +++ b/docs/schema/observability/logs/sample/cloud.json @@ -0,0 +1,5 @@ +{ + "cloud": { + "provider": "aws" + } +} \ No newline at end of file diff --git a/docs/schema/observability/logs/sample/communication.json b/docs/schema/observability/logs/sample/communication.json new file 
mode 100644 index 000000000..e18826bca --- /dev/null +++ b/docs/schema/observability/logs/sample/communication.json @@ -0,0 +1,9 @@ +{ + "communication": { + "source": { + "address": "192.168.131.39", + "ip": "192.168.131.39", + "port": 2817 + } + } +} \ No newline at end of file diff --git a/docs/schema/observability/logs/sample/container.json b/docs/schema/observability/logs/sample/container.json new file mode 100644 index 000000000..8d289c3b2 --- /dev/null +++ b/docs/schema/observability/logs/sample/container.json @@ -0,0 +1,6 @@ +{ + "container": { + "image": "EC2" + } + +} \ No newline at end of file diff --git a/docs/schema/observability/logs/sample/http/http.json b/docs/schema/observability/logs/sample/http/http.json new file mode 100644 index 000000000..319a1a26a --- /dev/null +++ b/docs/schema/observability/logs/sample/http/http.json @@ -0,0 +1,18 @@ +{ + "http": { + "request": { + "body": { + "bytes": 34 + }, + "method": "GET" + }, + "response": { + "body": { + "bytes": 366 + }, + "status_code": 200 + }, + "url": "http://www.example.com:80/", + "schema": "http" + } +} \ No newline at end of file diff --git a/docs/schema/observability/logs/sample/http_client-log.json b/docs/schema/observability/logs/sample/http/http_client-log.json similarity index 100% rename from docs/schema/observability/logs/sample/http_client-log.json rename to docs/schema/observability/logs/sample/http/http_client-log.json diff --git a/docs/schema/observability/logs/sample/http_server-log.json b/docs/schema/observability/logs/sample/http/http_server-log.json similarity index 100% rename from docs/schema/observability/logs/sample/http_server-log.json rename to docs/schema/observability/logs/sample/http/http_server-log.json diff --git a/docs/schema/observability/traces/samples/load_samples.md b/docs/schema/observability/traces/samples/load_samples.md index 9536522c3..792adfc75 100644 --- a/docs/schema/observability/traces/samples/load_samples.md +++ 
b/docs/schema/observability/traces/samples/load_samples.md @@ -5,11 +5,31 @@ For loading the given samples run the next request once the Opensearch cluster i `PUT sso_traces-default-namespace/_bulk` ```json { "create":{ } } -{"traceId":"4fa04f117be100f476b175e41096e736","spanId":"e275ac9d21929e9b","traceState":[],"parentSpanId":"","name":"client_checkout","kind":"INTERNAL","@timestamp":"2021-11-13T20:20:39+00:00","endTime":"2021-11-14T20:10:41+00:00","droppedAttributesCount":0,"droppedEventsCount":0,"droppedLinksCount":0,"resource":{"telemetry@sdk@name":"opentelemetry","telemetry@sdk@language":"python","telemetry@sdk@version":"0.14b0","service@name":"frontend-client","host@hostname":"ip-172-31-10-8.us-west-2.compute.internal"},"status":{"code":0}} +{"traceId":"4fa04f117be100f476b175e41096e736","spanId":"e275ac9d21929e9b","traceState":[],"parentSpanId":"","name":"client_checkout","kind":"INTERNAL","@timestamp":"2021-11-13T20:20:39+00:00","startTime":"2021-11-13T20:20:39+00:00","endTime":"2021-11-14T20:10:41+00:00","droppedAttributesCount":0,"droppedEventsCount":0,"droppedLinksCount":0,"resource":{"telemetry@sdk@name":"opentelemetry","telemetry@sdk@language":"python","telemetry@sdk@version":"0.14b0","service@name":"frontend-client","host@hostname":"ip-172-31-10-8.us-west-2.compute.internal"},"status":{"code":0},"attributes": {"serviceName":"frontend"}} { "create":{ } } -{"traceId":"15d30e4d211d79e10fcaeab97015c90d","spanId":"5bcca8ba513bb54a","traceState":[],"parentSpanId":"","name":"mysql","kind":"CLIENT","@timestamp":"2021-11-13T20:20:39+00:00","endTime":"2021-11-14T20:10:41+00:00","events":[{"@timestamp":"2021-03-25T17:21:03.044+00:00","name":"exception","attributes":{"exception@message":"1050 %2842S01%29: Table %27User_Carts%27 already exists","exception@type":"ProgrammingError","exception@stacktrace":"Traceback %28most recent call last :File /usr/lib/python3.6/site-packages/opentelemetry/sdk/trace/__init__.py, line 804, in use_span yield spanFile 
/usr/lib/python3.6/site-packages/opentelemetry/instrumentation/dbapi/__init__.py, line 354, in traced_executionraise exFile /usr/lib/python3.6/site-packages/opentelemetry/instrumentation/dbapi/__init__.py, line 345, in traced_executionresult = query_method%28%2Aargs, %2A%2Akwargs%29File /usr/lib/python3.6/site-packages/mysql/connector/cursor.py"},"droppedAttributesCount":0}],"links":[],"droppedAttributesCount":0,"droppedEventsCount":0,"droppedLinksCount":0,"status":{"message":"1050 %2842S01%29: Table %27User_Carts%27 already exists","code":2},"attributes":{"data_stream":{"type":"span","dataset":"mysql"},"component":"mysql","db@user":"root","net@peer@name":"localhost","db@type":"sql","net@peer@port":3306,"db@instance":"","db@statement":"CREATE TABLE `User_Carts` %28 `ItemId` varchar%2816%29 NOT NULL, `TotalQty` int%2811%29 NOT NULL, PRIMARY KEY %28`ItemId`%29%29 ENGINE=InnoDB"},"resource":{"telemetry@sdk@language":"python","service@name":"database","telemetry@sdk@version":"0.14b0","service@instance@id":"140307275923408","telemetry@sdk@name":"opentelemetry","host@hostname":"ip-172-31-10-8.us-west-2.compute.internal"}} +{"traceId":"15d30e4d211d79e10fcaeab97015c90d","spanId":"5bcca8ba513bb54a","traceState":[],"parentSpanId":"","name":"mysql","kind":"CLIENT","@timestamp":"2021-11-13T20:20:39+00:00","startTime":"2021-11-13T20:20:39+00:00","endTime":"2021-11-14T20:10:41+00:00","events":[{"@timestamp":"2021-03-25T17:21:03.044+00:00","name":"exception","attributes":{"exception@message":"1050 %2842S01%29: Table %27User_Carts%27 already exists","exception@type":"ProgrammingError","exception@stacktrace":"Traceback %28most recent call last :File /usr/lib/python3.6/site-packages/opentelemetry/sdk/trace/__init__.py, line 804, in use_span yield spanFile /usr/lib/python3.6/site-packages/opentelemetry/instrumentation/dbapi/__init__.py, line 354, in traced_executionraise exFile /usr/lib/python3.6/site-packages/opentelemetry/instrumentation/dbapi/__init__.py, line 345, in 
traced_executionresult = query_method%28%2Aargs, %2A%2Akwargs%29File /usr/lib/python3.6/site-packages/mysql/connector/cursor.py"},"droppedAttributesCount":0}],"links":[],"droppedAttributesCount":0,"droppedEventsCount":0,"droppedLinksCount":0,"status":{"message":"1050 %2842S01%29: Table %27User_Carts%27 already exists","code":2},"attributes":{"serviceName":"database","data_stream":{"type":"span","dataset":"mysql"},"component":"mysql","db@user":"root","net@peer@name":"localhost","db@type":"sql","net@peer@port":3306,"db@instance":"","db@statement":"CREATE TABLE `User_Carts` %28 `ItemId` varchar%2816%29 NOT NULL, `TotalQty` int%2811%29 NOT NULL, PRIMARY KEY %28`ItemId`%29%29 ENGINE=InnoDB"},"resource":{"telemetry@sdk@language":"python","service@name":"mysql","telemetry@sdk@version":"0.14b0","service@instance@id":"140307275923408","telemetry@sdk@name":"opentelemetry","host@hostname":"ip-172-31-10-8.us-west-2.compute.internal"}} { "create":{ } } -{"traceId":"c1d985bd02e1dbb85b444011f19a1ecc","spanId":"55a698828fe06a42","traceState":[],"parentSpanId":"","name":"mysql","kind":"CLIENT","@timestamp":"2021-11-13T20:20:39+00:00","endTime":"2021-11-14T20:10:41+00:00","events":[{"@timestamp":"2021-03-25T17:21:03+00:00","name":"exception","attributes":{"exception@message":"1050 %2842S01%29: Table Inventory_Items already exists","exception@type":"ProgrammingError","exception@stacktrace":"Traceback most recent call last"},"droppedAttributesCount":0}],"links":[{"traceId":"c1d985bd02e1dbb85b444011f19a1ecc","spanId":"55a698828fe06a42w2","traceState":[],"attributes":{"db@user":"root","net@peer@name":"localhost","component":"mysql","db@type":"sql","net@peer@port":3306,"db@instance":"","db@statement":"CREATE TABLE `Inventory_Items` %28 `ItemId` varchar%2816%29 NOT NULL, `TotalQty` int%2811%29 NOT NULL, PRIMARY KEY %28`ItemId`%29%29 
ENGINE=InnoDB"},"droppedAttributesCount":0}],"droppedAttributesCount":0,"droppedEventsCount":0,"droppedLinksCount":0,"resource":{"telemetry@sdk@language":"python","telemetry@sdk@version":"0.14b0","service@instance@id":"140307275923408","service@name":"database","telemetry@sdk@name":"opentelemetry","host@hostname":"ip-172-31-10-8.us-west-2.compute.internal"},"status":{"code":2,"message":"1050 %2842S01%29: Table %27Inventory_Items%27 already exists"},"attributes":{"data_stream":{"type":"span","namespace":"exceptions","dataset":"mysql"},"db@user":"root","net@peer@name":"localhost","component":"mysql","db@type":"sql","net@peer@port":3306,"db@instance":"","db@statement":"CREATE TABLE `Inventory_Items` %28 `ItemId` varchar%2816%29 NOT NULL, `TotalQty` int%2811%29 NOT NULL, PRIMARY KEY %28`ItemId`%29%29 ENGINE=InnoDB"}} +{"traceId":"c1d985bd02e1dbb85b444011f19a1ecc","spanId":"55a698828fe06a42","traceState":[],"parentSpanId":"","name":"mysql","kind":"CLIENT","@timestamp":"2021-11-13T20:20:39+00:00","startTime":"2021-11-13T20:20:39+00:00","endTime":"2021-11-14T20:10:41+00:00","events":[{"@timestamp":"2021-03-25T17:21:03+00:00","name":"exception","attributes":{"exception@message":"1050 %2842S01%29: Table Inventory_Items already exists","exception@type":"ProgrammingError","exception@stacktrace":"Traceback most recent call last"},"droppedAttributesCount":0}],"links":[{"traceId":"c1d985bd02e1dbb85b444011f19a1ecc","spanId":"55a698828fe06a42w2","traceState":[],"attributes":{"db@user":"root","net@peer@name":"localhost","component":"mysql","db@type":"sql","net@peer@port":3306,"db@instance":"","db@statement":"CREATE TABLE `Inventory_Items` %28 `ItemId` varchar%2816%29 NOT NULL, `TotalQty` int%2811%29 NOT NULL, PRIMARY KEY %28`ItemId`%29%29 
ENGINE=InnoDB"},"droppedAttributesCount":0}],"droppedAttributesCount":0,"droppedEventsCount":0,"droppedLinksCount":0,"resource":{"telemetry@sdk@language":"python","telemetry@sdk@version":"0.14b0","service@instance@id":"140307275923408","service@name":"database","telemetry@sdk@name":"opentelemetry","host@hostname":"ip-172-31-10-8.us-west-2.compute.internal"},"status":{"code":2,"message":"1050 %2842S01%29: Table %27Inventory_Items%27 already exists"},"attributes":{"serviceName":"database","data_stream":{"type":"span","namespace":"exceptions","dataset":"mysql"},"db@user":"root","net@peer@name":"localhost","component":"mysql","db@type":"sql","net@peer@port":3306,"db@instance":"","db@statement":"CREATE TABLE `Inventory_Items` %28 `ItemId` varchar%2816%29 NOT NULL, `TotalQty` int%2811%29 NOT NULL, PRIMARY KEY %28`ItemId`%29%29 ENGINE=InnoDB"}} +``` + +`PUT sso_services-default-namespace/_bulk` +```json +{ "create":{ } } +{"serviceName":"customer","kind":"SPAN_KIND_SERVER","destination":{"resource":"SQL SELECT","domain":"mysql"},"target":null,"traceGroupName":"HTTP GET /dispatch","hashId":"OP/8YTM/rui5D131Dyl3uw=="} +{ "create":{ } } +{"serviceName":"openSearch","kind":"SPAN_KIND_CLIENT","destination":null,"target":{"resource":"OpenSearch","domain":"openSearch"},"traceGroupName":"HTTP GET /dispatch","hashId":"NI5NKDfGj0WxtmvIkr7cZQ=="} +{ "create":{ } } +{"serviceName":"customer","kind":"SPAN_KIND_SERVER","destination":null,"target":{"resource":"HTTP GET /customer","domain":"customer"},"traceGroupName":"HTTP GET /dispatch","hashId":"4sQ0k4k7V5vsvCf1iJzdqQ=="} +{ "create":{ } } +{"serviceName":"search","kind":"SPAN_KIND_SERVER","destination":{"resource":"OpenSearch","domain":"openSearch"},"target":null,"traceGroupName":"HTTP GET /dispatch","hashId":"wfGk6I4tPfRuvTNYkP9dFA=="} +{ "create":{ } } +{"serviceName":"database","kind":"SPAN_KIND_CLIENT","destination":null,"target":{"resource":"SQL SELECT","domain":"mysql"},"traceGroupName":"HTTP GET 
/dispatch","hashId":"hifEI5Vndn1Hrcttnbf0Ig=="} +{ "create":{ } } +{"serviceName":"frontend","kind":"SPAN_KIND_CLIENT","destination":{"resource":"HTTP GET /customer","domain":"customer"},"target":null,"traceGroupName":"HTTP GET /dispatch","hashId":"u2t8FF1YHF4t/5Qa68XINw=="} +{ "create":{ } } +{"serviceName":"search","kind":"SPAN_KIND_SERVER","destination":null,"target":{"resource":"HTTP GET /search","domain":"search"},"traceGroupName":"HTTP GET /dispatch","hashId":"Zg3QONUPtZIHzS7cN1Yo7Q=="} +{ "create":{ } } +{"serviceName":"frontend","kind":"SPAN_KIND_CLIENT","destination":{"resource":"HTTP GET /search","domain":"search"},"target":null,"traceGroupName":"HTTP GET /dispatch","hashId":"AhcxPYfbDX42HAywX7kimQ=="} ``` Run the next query to get the Spans kind CLIENT: @@ -25,4 +45,19 @@ Run the next query to get the Spans kind CLIENT: } } } +``` + +Run the next query to get the services by name : + +- `GET sso_services-default-namespace/_search` +```json +{ + "query":{ + "term": { + "serviceName":{ + "value":"customer" + } + } + } +} ``` \ No newline at end of file diff --git a/docs/schema/observability/traces/samples/serviceA.json b/docs/schema/observability/traces/samples/serviceA.json new file mode 100644 index 000000000..b4558bbc0 --- /dev/null +++ b/docs/schema/observability/traces/samples/serviceA.json @@ -0,0 +1,11 @@ +{ + "serviceName": "customer", + "kind": "SPAN_KIND_SERVER", + "destination": { + "resource": "SQL SELECT", + "domain": "mysql" + }, + "target": null, + "traceGroupName": "HTTP GET /dispatch", + "hashId": "OP/8YTM/rui5D131Dyl3uw==" +} \ No newline at end of file diff --git a/docs/schema/observability/traces/samples/serviceB.json b/docs/schema/observability/traces/samples/serviceB.json new file mode 100644 index 000000000..64accd8fd --- /dev/null +++ b/docs/schema/observability/traces/samples/serviceB.json @@ -0,0 +1,11 @@ +{ + "serviceName": "customer", + "kind": "SPAN_KIND_SERVER", + "destination": null, + "target": { + "resource": "HTTP GET 
/customer", + "domain": "customer" + }, + "traceGroupName": "HTTP GET /dispatch", + "hashId": "4sQ0k4k7V5vsvCf1iJzdqQ==" +} \ No newline at end of file diff --git a/docs/schema/observability/traces/samples/serviceC.json b/docs/schema/observability/traces/samples/serviceC.json new file mode 100644 index 000000000..85320e10c --- /dev/null +++ b/docs/schema/observability/traces/samples/serviceC.json @@ -0,0 +1,11 @@ +{ + "serviceName": "frontend", + "kind": "SPAN_KIND_CLIENT", + "destination": { + "resource": "HTTP GET /customer", + "domain": "customer" + }, + "target": null, + "traceGroupName": "HTTP GET /dispatch", + "hashId": "u2t8FF1YHF4t/5Qa68XINw==" +} \ No newline at end of file diff --git a/docs/schema/observability/traces/samples/traceA.json b/docs/schema/observability/traces/samples/traceA.json index 451ae8249..8d1d88fbe 100644 --- a/docs/schema/observability/traces/samples/traceA.json +++ b/docs/schema/observability/traces/samples/traceA.json @@ -5,6 +5,7 @@ "parentSpanId": "", "name": "client_checkout", "kind": "INTERNAL", + "startTime": "2021-11-13T20:20:39+00:00", "@timestamp": "2021-11-13T20:20:39+00:00", "endTime": "2021-11-14T20:10:41+00:00", "droppedAttributesCount": 0, @@ -19,5 +20,8 @@ }, "status": { "code": 0 + }, + "attributes": { + "serviceName": "customer" } } \ No newline at end of file diff --git a/docs/schema/observability/traces/samples/traceB.json b/docs/schema/observability/traces/samples/traceB.json index cf5ab4979..af2df8545 100644 --- a/docs/schema/observability/traces/samples/traceB.json +++ b/docs/schema/observability/traces/samples/traceB.json @@ -6,6 +6,7 @@ "name": "mysql", "kind": "CLIENT", "@timestamp": "2021-11-13T20:20:39+00:00", + "startTime": "2021-11-13T20:20:39+00:00", "endTime": "2021-11-14T20:10:41+00:00", "events": [ { @@ -28,6 +29,7 @@ "code": 2 }, "attributes": { + "serviceName": "database", "data_stream": { "type": "span", "dataset": "mysql" diff --git a/docs/schema/observability/traces/samples/traceC.json 
b/docs/schema/observability/traces/samples/traceC.json index 7da4fd7a7..8dad03bbf 100644 --- a/docs/schema/observability/traces/samples/traceC.json +++ b/docs/schema/observability/traces/samples/traceC.json @@ -6,6 +6,7 @@ "name": "mysql", "kind": "CLIENT", "@timestamp": "2021-11-13T20:20:39+00:00", + "startTime": "2021-11-13T20:20:39+00:00", "endTime": "2021-11-14T20:10:41+00:00", "events": [ { @@ -52,6 +53,7 @@ "message": "1050 %2842S01%29: Table %27Inventory_Items%27 already exists" }, "attributes": { + "serviceName": "database", "data_stream": { "type": "span", "namespace": "exceptions", diff --git a/docs/schema/observability/traces/samples/traceGroups.json b/docs/schema/observability/traces/samples/traceGroups.json new file mode 100644 index 000000000..46fe39a8d --- /dev/null +++ b/docs/schema/observability/traces/samples/traceGroups.json @@ -0,0 +1,8 @@ +{ + "traceGroupFields": { + "endTime": "2023-03-17T04:40:15.666104Z", + "durationInNanos": 312022000, + "statusCode": 2 + }, + "traceGroup": "HTTP GET /dispatch" +} \ No newline at end of file diff --git a/docs/schema/system/README.md b/docs/schema/system/README.md index bad8cabc7..861b2b15f 100644 --- a/docs/schema/system/README.md +++ b/docs/schema/system/README.md @@ -14,27 +14,27 @@ This folder contains internal representation of assets that are stored in the sy [Application](https://opensearch.org/docs/2.5/observing-your-data/app-analytics/) enables creation of custom observability display to view the availability status of your systems, where you can combine log events with trace and metric data into a single view of overall system health. This lets you quickly pivot between logs, traces, and metrics to dig into the source of any issues. 
- - [Schema](application.schema) + - [Schema](../../../src/main/resources/schema/system/application.schema) - [Sample](samples/application.json) ### Datasource [Data-source](https://opensearch.org/docs/2.4/dashboards/discover/multi-data-sources/) Enables adding multiple data sources to a single dashboard. OpenSearch Dashboards allows you to dynamically manage data sources, create index patterns based on those data sources, and execute queries against a specific data source and then combine visualizations in one dashboard. - - [Schema](datasource.schema) + - [Schema](../../../src/main/resources/schema/system/datasource.schema) - [Sample](samples/datasource.json) ### Index-Pattern An Index Pattern allows to access data that you want to explore. An index pattern selects the data to use. An index pattern may point to multiple indices, data stream, or index aliases. - - [Schema](index-pattern.schema) + - [Schema](../../../src/main/resources/schema/system/index-pattern.schema) - [Sample](samples/index-pattern.json) ### Integration Integration is a schematized and categorized bundle of assets grouped together to allow simple and coherent way to view, analyze and investigate different aspects of your data. Integrations allow pre-defining dashboards, visualizations, index-templates, saved-queries and additional assets so that they provide a complete meaningful user experience. - - [Schema](integration.schema) + - [Schema](../../../src/main/resources/schema/system/integration.schema) - [Sample](samples/integration.json) ### Notebook @@ -42,24 +42,24 @@ Integrations allow pre-defining dashboards, visualizations, index-templates, sav Choose multiple timelines to compare and contrast visualizations. You can also generate reports directly from your notebooks. Common use cases include creating postmortem reports, designing runbooks, building live infrastructure reports, and writing documentation. 
- - [Schema](notebook.schema) + - [Schema](../../../src/main/resources/schema/system/notebook.schema) - [Sample](samples/notebook.json) ### Operational-Panel [Operational Panels](https://opensearch.org/docs/2.5/observing-your-data/operational-panels/) in OpenSearch Dashboards are collections of visualizations generated using Piped Processing Language (PPL) queries. - - [Schema](operational-panel.schema) + - [Schema](../../../src/main/resources/schema/system/operational-panel.schema) - [Sample](samples/operationalPanel.json) ### Saved-Query A saved query (saved search) allows to reuse a search created in a dashboard for other dashboards. - - [Schema](saved-query.schema) + - [Schema](../../../src/main/resources/schema/system/saved-query.schema) - [Sample](samples/savedQuery.json) ### Visualization [Visualization](https://opensearch.org/docs/2.5/dashboards/visualize/viz-index/) allows translation of complex, high-volume, or numerical data into a visual representation that is easier to process. OpenSearch Dashboards gives you data visualization tools to improve and automate the visual communication process. By using visual elements like charts, graphs, or maps to represent data, you can advance business intelligence and support data-driven decision-making and strategic planning. 
- - [Schema](visualization.schema) + - [Schema](../../../src/main/resources/schema/system/visualization.schema) - [Sample](samples/visualization.json) diff --git a/docs/schema/system/samples/integrations-fields-list.json b/docs/schema/system/samples/integrations-fields-list.json new file mode 100644 index 000000000..09893f8d3 --- /dev/null +++ b/docs/schema/system/samples/integrations-fields-list.json @@ -0,0 +1,578 @@ +{ + "template-name": "nginx", + "version": "1.0.0", + "description": "Nginx HTTP server collector", + "catalog": "observability", + "collections": [ + { + "category": "logs", + "components": [ + { + "source": "logs.mapping", + "container": true, + "fields": { + "severity": { + "properties": { + "number": { + "type": "long" + }, + "text": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + } + } + }, + "attributes": { + "type": "object", + "properties": { + "data_stream": { + "properties": { + "dataset": { + "ignore_above": 128, + "type": "keyword" + }, + "namespace": { + "ignore_above": 128, + "type": "keyword" + }, + "type": { + "ignore_above": 56, + "type": "keyword" + } + } + } + } + }, + "body": { + "type": "text" + }, + "@timestamp": { + "type": "date" + }, + "observedTimestamp": { + "type": "date" + }, + "traceId": { + "ignore_above": 256, + "type": "keyword" + }, + "spanId": { + "ignore_above": 256, + "type": "keyword" + }, + "schemaUrl": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "instrumentationScope": { + "properties": { + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 128 + } + } + }, + "version": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "dropped_attributes_count": { + "type": "integer" + }, + "schemaUrl": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + } + } + 
}, + "event": { + "properties": { + "domain": { + "ignore_above": 256, + "type": "keyword" + }, + "name": { + "ignore_above": 256, + "type": "keyword" + }, + "category": { + "ignore_above": 256, + "type": "keyword" + }, + "type": { + "ignore_above": 256, + "type": "keyword" + }, + "kind": { + "ignore_above": 256, + "type": "keyword" + }, + "result": { + "ignore_above": 256, + "type": "keyword" + }, + "exception": { + "properties": { + "message": { + "ignore_above": 1024, + "type": "keyword" + }, + "type": { + "ignore_above": 256, + "type": "keyword" + }, + "stacktrace": { + "type": "text" + } + } + } + } + } + } + }, + { + "source": "http.mapping", + "container": false, + "fields": { + "http": { + "properties": { + "flavor": { + "type": "keyword", + "ignore_above": 256 + }, + "user_agent": { + "type": "keyword", + "ignore_above": 2048 + }, + "url": { + "type": "keyword", + "ignore_above": 2048 + }, + "schema": { + "type": "keyword", + "ignore_above": 1024 + }, + "target": { + "type": "keyword", + "ignore_above": 1024 + }, + "route": { + "type": "keyword", + "ignore_above": 1024 + }, + "client.ip": { + "type": "ip" + }, + "resent_count": { + "type": "integer" + }, + "request": { + "type": "object", + "properties": { + "id": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "body.content": { + "type": "text" + }, + "bytes": { + "type": "long" + }, + "method": { + "type": "keyword", + "ignore_above": 256 + }, + "referrer": { + "type": "keyword", + "ignore_above": 1024 + }, + "mime_type": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "response": { + "type": "object", + "properties": { + "id": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "body.content": { + "type": "text" + }, + "bytes": { + "type": "long" + }, + "status_code": { + "type": "integer" + } + } + } + } + } + } + }, + { + "source": "communication.mapping", + "fields": { + 
"communication": { + "properties": { + "sock.family": { + "type": "keyword", + "ignore_above": 256 + }, + "source": { + "type": "object", + "properties": { + "address": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "domain": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "bytes": { + "type": "long" + }, + "ip": { + "type": "ip" + }, + "port": { + "type": "long" + }, + "mac": { + "type": "keyword", + "ignore_above": 1024 + }, + "packets": { + "type": "long" + } + } + }, + "destination": { + "type": "object", + "properties": { + "address": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "domain": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 1024 + } + } + }, + "bytes": { + "type": "long" + }, + "ip": { + "type": "ip" + }, + "port": { + "type": "long" + }, + "mac": { + "type": "keyword", + "ignore_above": 1024 + }, + "packets": { + "type": "long" + } + } + } + } + } + } + } + ] + }, + { + "category": "metrics", + "components": [ + { + "source": "metrics.mapping", + "container": true, + "fields": { + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "attributes": { + "type": "object", + "properties": { + "data_stream": { + "properties": { + "dataset": { + "ignore_above": 128, + "type": "keyword" + }, + "namespace": { + "ignore_above": 128, + "type": "keyword" + }, + "type": { + "ignore_above": 56, + "type": "keyword" + } + } + } + } + }, + "description": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "unit": { + "type": "keyword", + "ignore_above": 128 + }, + "kind": { + "type": "keyword", + "ignore_above": 128 + }, + "aggregationTemporality": { + "type": "keyword", + "ignore_above": 128 + }, + "monotonic": { + "type": 
"boolean" + }, + "startTime": { + "type": "date" + }, + "@timestamp": { + "type": "date" + }, + "observedTimestamp": { + "type": "date_nanos" + }, + "value": { + "properties": { + "int": { + "type": "integer" + }, + "double": { + "type": "double" + } + } + }, + "buckets": { + "properties": { + "count": { + "type": "long" + }, + "sum": { + "type": "double" + }, + "max": { + "type": "float" + }, + "min": { + "type": "float" + } + } + }, + "bucketCount": { + "type": "long" + }, + "bucketCountsList": { + "type": "long" + }, + "explicitBoundsList": { + "type": "float" + }, + "explicitBoundsCount": { + "type": "float" + }, + "quantiles": { + "properties": { + "quantile": { + "type": "double" + }, + "value": { + "type": "double" + } + } + }, + "quantileValuesCount": { + "type": "long" + }, + "positiveBuckets": { + "properties": { + "count": { + "type": "long" + }, + "max": { + "type": "float" + }, + "min": { + "type": "float" + } + } + }, + "negativeBuckets": { + "properties": { + "count": { + "type": "long" + }, + "max": { + "type": "float" + }, + "min": { + "type": "float" + } + } + }, + "negativeOffset": { + "type": "integer" + }, + "positiveOffset": { + "type": "integer" + }, + "zeroCount": { + "type": "long" + }, + "scale": { + "type": "long" + }, + "max": { + "type": "float" + }, + "min": { + "type": "float" + }, + "sum": { + "type": "float" + }, + "count": { + "type": "long" + }, + "exemplar": { + "properties": { + "time": { + "type": "date" + }, + "traceId": { + "ignore_above": 256, + "type": "keyword" + }, + "spanId": { + "ignore_above": 256, + "type": "keyword" + } + } + }, + "instrumentationScope": { + "properties": { + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 128 + } + } + }, + "version": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "droppedAttributesCount": { + "type": "integer" + }, + "schemaUrl": { + "type": "text", + "fields": { + "keyword": 
{ + "type": "keyword", + "ignore_above": 256 + } + } + } + } + }, + "schemaUrl": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + } + } + } + ] + } + ] +} diff --git a/src/main/resources/schema/README.md b/src/main/resources/schema/README.md new file mode 100644 index 000000000..cdebae05a --- /dev/null +++ b/src/main/resources/schema/README.md @@ -0,0 +1,136 @@ +# Simple Schema for Open Search + +## Background +Simple Schema for OpenSearch brings the concept of organized and structured catalog data. +A catalog of schemas is a comprehensive collection of all the possible data schemas or structures that can be used to represent information. + +It provides a standardized way of organizing and describing the structure of data, making it easier to analyze, compare, and share data across different systems and applications. +Structured data refers to any data that is organized in a specific format or schema - for example Observability data or security data... + +By using a catalog of schemas, data analysts and scientists can easily identify and understand the different structures of data, allowing them to correlate and analyze information more effectively. +One of the key benefits of a catalog of schemas is that it promotes interoperability between different systems and applications. +By using standardized schema descriptions, data can be shared and exchanged more easily, regardless of the system or application being used. + +The use of a catalog of schemas can also improve data quality by ensuring that data is consistent and accurate. This is because the schema provides a clear definition of the data structure and the rules for how data should be entered, validated, and stored. 
+ +## OpenSearch Schemas + +OpenSearch supports the following schemas out of the box: + - [Observability](observability/README.md) + - [Security](security/README.md) + - [System](system/README.md) + +### Observability +Simple Schema for [Observability](https://github.com/opensearch-project/observability) allows ingestion of both (OTEL/ECS) formats and internally consolidates them to the best of its capabilities for presenting a unified Observability platform. + +### Security +OpenSearch Security is a [plugin](https://github.com/opensearch-project/security) for OpenSearch that offers encryption, authentication and authorization. When combined with OpenSearch Security-Advanced Modules, it supports authentication via Active Directory, LDAP, Kerberos, JSON web tokens, SAML, OpenID and more. It includes fine grained role-based access control to indices, documents and fields. It also provides multi-tenancy support in OpenSearch Dashboards. + +### System +System represents the internal structure of OpenSearch's `.dashboard` entities such as `dashboard`, `notebook`, `save-query`, `integration`. Every saved object may declare itself in the system +catalog folder and publish a corresponding schema. This capability will allow validation of these objects and also simplify structure evolution using this schema. + +### Catalog Loading +While loading, the Integration plugin goes over all of OpenSearch's supported schema catalogs and generates the appropriate templates representing them. +This allows any future Integration to use these catalogs without explicitly defining them, thus maintaining a unified common schema. + +Each catalog may support semantic versioning so that it may evolve its schema as needed. +In the future, the catalog will make it possible to associate domains with catalogs and to import external catalogs into OpenSearch for additional collaboration.
+ +### Catalog Structure +A catalog is structured in the following way: + + - Catalog named folder: `Observability` + - Categories named folder : `Logs`, `Traces`, `Metrics` + - Component named file : `http` , `communication` , `traces` , `metrics` + +Each level encapsulates additional internal structure that allows a greater level of composability and agility. +The details of each catalog structure are described in the [catalog.json](system/samples/catalog.json) file that resides in the root level of each catalog folder. + +**Component** +The component is the leaf level definition of the catalog hierarchy; it details the actual building blocks of the catalog's types and fields. + +Each component has two flavours: + + - `$component.mapping` - describes how the type is physically stored in the underlying index + - `$component.schema` - describes the actual JSON schema for this component type + +A component may be classified as a `container` which has the ability to group / combine multiple components inside. + +For example, we can examine the [`logs`](observability/logs/logs.mapping) component that has the capacity to combine additional components (such as `http`, `communication` and more) +```json5 + ... + "composed_of": [ + "http_template", + "communication_template" + ], + ... +``` +A component also has a list of `tags`, which are aliases for the component name that an integration's components list can use to reference it directly. + +```json5 + ... + { + "component": "communication", + "version": "1.0", + "url": "https://github.com/opensearch-project/observability/tree/2.x/schema/observability/logs/communication", + "tags": ["web"], + "container": false + } + ... +``` + +## Data Correlation + +In order to be able to correlate information across different signals (represented in different indices) we introduced the notion of correlation into the schema.
+This information is represented explicitly in both the declarative schema file and the physical mapping file. + +This information will enable the knowledge to be projected and allow an analytic engine to produce a join query that will take advantage of these relationships. +The correlation metadata info is exported in the following way: + +### Observability Correlation Example: + +### Schema related +In JSON Schema, there is no built-in way to represent relationships directly between multiple schemas, like you would find in a relational database. However, you can establish relationships indirectly by using a combination of `$id`, `$ref`, and consistent property naming across your schemas. +For example the [`logs.schema`](observability/logs/logs.schema) file contains the following `$ref` references for the `traceId` & `spanId` fields that belong to the `traces.schema`. + +```json5 + ... + "traceId": { + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/traceId" + }, + "spanId": { + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/spanId" + }, + ... +``` + +We can observe that the `traceId` field is defined by referencing the [Span](observability/traces/traceGroups.schema) schema and explicitly the `#/properties/traceId` field reference location. + +### Mapping related +Each mapping template will contain the foreign schemas that are referenced in that specific mapping file.
For example the [`logs.mapping`](observability/logs/logs.mapping) file will contain the following correlation object in the mapping `_meta` section: + +```json5 + "_meta": { + "description": "Simple Schema For Observability", + "catalog": "observability", + "type": "logs", + "correlations": [ + { + "field": "spanId", + "foreign-schema": "traces", + "foreign-field": "spanId" + }, + { + "field": "traceId", + "foreign-schema": "traces", + "foreign-field": "traceId" + } + ] + } + +``` + +Each `correlations` entry contains the foreign-key (F.K) field name - `spanId`, the referenced schema - `traces`, and the source field name in that schema - `spanId`. + +This information can be used to generate the correct join queries on a contextual basis. diff --git a/src/main/resources/schema/observability/README.md b/src/main/resources/schema/observability/README.md new file mode 100644 index 000000000..51c941db1 --- /dev/null +++ b/src/main/resources/schema/observability/README.md @@ -0,0 +1,163 @@ +# Simple Schema for Observability + +## Background +Observability is the ability to measure a system’s current state based on the data it generates, such as logs, metrics, and traces. Observability relies on telemetry derived from instrumentation that comes from the endpoints and services. + +Observability telemetry signals (logs, metrics, traces) arriving from the system would contain all the necessary information needed to observe and monitor. + +Modern applications can have a complicated distributed architecture that combines cloud native and microservices layers. Each layer produces telemetry signals that may have different structure and information. + +Using Simple Schema's Observability telemetry schema we can organize, correlate and investigate system behavior in a standard and well-defined manner. + +Observability telemetry schema defines the following components - **logs, traces and metrics**.
+ +**Logs** provide comprehensive system details, such as a fault and the specific time when the fault occurred. By analyzing the logs, one can troubleshoot code and identify where and why the error occurred. + +**Traces** represent the entire journey of a request or action as it moves through all the layers of a distributed system. Traces allow you to profile and observe systems, especially containerized applications, serverless architectures, or microservices architecture. + +**Metrics** provide a numerical representation of data that can be used to determine a service or component’s overall behaviour over time. + + +In many occasions, correlation between the logs, traces and metrics is mandatory to be able to monitor and understand how the system is behaving. In addition, the distributed nature of the application produces multiple formats of telemetry signals arriving from different components ( network router, web server, database) + +For such correlation to be possible the industry has formulated several protocols ([OTEL](https://github.com/open-telemetry), [ECS](https://github.com/elastic/ecs), [OpenMetrics](https://github.com/OpenObservability/OpenMetrics)) for communicating these signals - the Observability schemas. + +--- +## Schema Aware Components + +The role of the Observability [plugin](https://github.com/opensearch-project/observability) is intended to allow maximum flexibility and not imposing a strict Index structure of the data source. Nevertheless, the modern nature of distributed application and the vast amount of telemetry producers is changing this perception. + +Today many of the Observability solutions (splunk, datadog, dynatrace) recommend using a consolidated schema to represent the entire variance of log/trace/metrics producers. + +This allows monitoring, incidents investigation and corrections process to become simpler, maintainable and reproducible. 
+ + +A Schema-Aware visualization component is a component which assumes the existence of specific index/indices name patterns and expects these indices to have a specific structure - a schema. + +As an example we can see that **Trace-Analytics** is a schema-aware visual component since it directly assumes the traces & serviceMap indices exist and expects them to follow a specific structure. + +This definition doesn’t change the existing status of visualization components which are not “Schema Aware”; it only regulates which visual components would benefit from using a schema and which will be agnostic of its content. + +Operation Panels, for example, are not “Schema Aware” since they don’t assume in advance the existence of a specific index nor do they expect the index they display to have a specific structure. + +## Data Model + +Simple Schema for Observability allows ingestion of both (OTEL/ECS) formats and internally consolidates them to the best of its capabilities for presenting a unified Observability platform. + +## Observability index naming + +The Observability indices follow the recommended pattern for immutable data stream ingestion, using the [data_stream concepts](https://opensearch.org/docs/latest/opensearch/data-streams/) + +Index pattern will follow the following naming template `sso_{type}`-`{dataset}`-`{namespace}` + +**type** +- indicates the observability high level types "logs", "metrics", "traces" (prefixed by the `sso_` schema convention ) + +**dataset** +- The field can contain anything that classifies the source of the data - such as `nginx.access` + +**namespace** +- A user defined namespace - mainly useful to allow grouping of data such as production grade, geography classification + +This strategy allows two degrees of naming freedom: dataset and namespace.
For example a customer may want to route the nginx logs from two geographical areas into two different indices: + + - `sso_logs-nginx-us` + - `sso_logs-nginx-eu` + +This type of distinction also allows for creation of crosscutting queries by setting the index query pattern `sso_logs-nginx-*` or by using a geographic based crosscutting query `sso_logs-*-eu`. + +## Data index routing + +The [ingestion component](https://github.com/opensearch-project/data-prepper), which ingests the Observability signals, is responsible for routing the data into the relevant indices. + +The `sso_{type}-{dataset}-{namespace}` combination dictates the target index; `{type}` is prefixed with the `sso_` prefix and resolves to one of the supported types: + + - Traces - `sso_traces` + - Metrics - `sso_metrics` + - Logs - `sso_logs` + +For example, if the ingested signal contains the following section: +```json5 +{ + ... + "attributes": { + "data_stream": { + "type": "span", + "dataset": "mysql", + "namespace": "prod" + } + } +} +``` +This indicates that the target index for this observability signal should be the `sso_traces`-`mysql`-`prod` index, which uses the traces schema mapping. + +## Observability Index templates + +With the expectation of multiple Observability data providers and the need to consolidate all to a single common schema - the Observability plugin will take the following responsibilities : + + - Define and create all the signals **index templates** upon loading + - Publish a versioned schema file (Json Schema) for each signal type for general validation usage by any 3rd party + +## Observability Ingestion pipeline +The responsibility of an **Observability-ingestion-pipeline** is to create the actual `data_stream` into which it expects to ingest.
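
The naming and routing convention described above can be sketched in a few lines of Python. This is only an illustration, not the ingestion component's actual logic: the `target_index` helper and the `TYPE_ALIASES` table are hypothetical, and the alias from `span` to `traces` is an assumption inferred from the routing example, where a `data_stream.type` of `span` resolves to the `sso_traces` prefix.

```python
# Hypothetical alias table: the routing example maps a data_stream type of
# "span" to the traces index, so a small set of aliases is assumed here.
TYPE_ALIASES = {"span": "traces", "trace": "traces", "log": "logs", "metric": "metrics"}
SUPPORTED_TYPES = {"traces", "metrics", "logs"}


def target_index(document: dict) -> str:
    """Build the `sso_{type}-{dataset}-{namespace}` name for a signal document."""
    data_stream = document["attributes"]["data_stream"]
    # Normalize the declared type to one of the supported signal types.
    signal_type = TYPE_ALIASES.get(data_stream["type"], data_stream["type"])
    if signal_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported signal type: {data_stream['type']!r}")
    return f"sso_{signal_type}-{data_stream['dataset']}-{data_stream['namespace']}"


signal = {
    "attributes": {
        "data_stream": {"type": "span", "dataset": "mysql", "namespace": "prod"}
    }
}
print(target_index(signal))  # sso_traces-mysql-prod
```

Keeping the name construction in one small function makes it easy to reject unknown signal types before any `data_stream` is created.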
+ +This `data_stream` will use one of the Observability ready-made index templates (Metrics, Traces and Logs) and conform with the above naming pattern (`sso_{type}`-`{dataset}`-`{namespace}`) + +**If the ingesting party has a need to update the template default index setting (shards, replicas ) it may do so before the actual creation of the data_stream.** + +## Observability Signals Correlation + +In order to be able to correlate information across different signals (represented in different indices) we introduced the notion of correlation into the schema. +This information is represented explicitly in both the declarative schema file (for example [`logs.schema`](logs/logs.schema)) and the physical mapping file ([`logs.mapping`](logs/logs.mapping)) + +This information will enable the knowledge to be projected and allow an analytic engine to produce a join query that will take advantage of these relationships. +The correlation metadata info is exported in the following way: + +### Schema related +In JSON Schema, there is no built-in way to represent relationships directly between multiple schemas, like you would find in a relational database. However, you can establish relationships indirectly by using a combination of `$id`, `$ref`, and consistent property naming across your schemas. +For example the [`logs.schema`](logs/logs.schema) file contains the following `$ref` references for the `traceId` & `spanId` fields that belong to the [`traces.schema`](traces/traceGroups.schema). + +```json5 + ... + "traceId": { + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/traceId" + }, + "spanId": { + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/spanId" + }, + ... +``` + +We can observe that the `traceId` field is defined by referencing the [Span](traces/traceGroups.schema) schema and explicitly the `#/properties/traceId` field reference location.
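
As a rough illustration of how the `$ref` entries above could be read back as correlation metadata, the following Python sketch walks a schema's `properties` and collects every reference that points outside the schema's own `$id`. The `foreign_references` helper is hypothetical, and the `$id` given to the logs schema here is an assumed value for the example, not taken from the actual `logs.schema` file.

```python
from urllib.parse import urldefrag


def foreign_references(schema: dict) -> list[dict]:
    """Collect properties whose `$ref` targets a schema other than this one."""
    own_id = schema.get("$id", "")
    found = []
    for field, definition in schema.get("properties", {}).items():
        ref = definition.get("$ref") if isinstance(definition, dict) else None
        if not ref:
            continue
        # Split "https://.../Span#/properties/traceId" into the schema id and
        # the JSON pointer that follows the '#'.
        target, pointer = urldefrag(ref)
        if target and target != own_id:
            found.append({
                "field": field,
                "foreign-schema": target,
                "foreign-field": pointer.rsplit("/", 1)[-1],
            })
    return found


logs_schema = {
    # Assumed `$id`, used only for this example.
    "$id": "https://opensearch.org/schemas/observability/Log",
    "properties": {
        "body": {"type": "string"},
        "traceId": {"$ref": "https://opensearch.org/schemas/observability/Span#/properties/traceId"},
        "spanId": {"$ref": "https://opensearch.org/schemas/observability/Span#/properties/spanId"},
    },
}

for reference in foreign_references(logs_schema):
    print(reference)
```

The result has the same shape as the `correlations` entries kept in the mapping `_meta` section, except that the foreign schema is identified by its `$id` URL rather than by a short name such as `traces`.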
+ +### Mapping related +Each mapping template will contain the foreign schemas that are referenced to in that specific mapping file. For example the [`logs.mapping`](logs/logs.mapping) file will contain the next correlation object in the mapping `_meta` section: + +```json5 + "_meta": { + "description": "Simple Schema For Observability", + "catalog": "observability", + "type": "logs", + "correlations": [ + { + "field": "spanId", + "foreign-schema": "traces", + "foreign-field": "spanId" + }, + { + "field": "traceId", + "foreign-schema": "traces", + "foreign-field": "traceId" + } + ] + } + +``` + +Each `correlations` field contains the F.K (foreign-key) field name - `spanId` , the referenced schema - `traces` and the source field name in that schema `spanId` + +This information can be used to generate the correct join queries on a contextual basis. + +### Note +It is important to mention that these new capabilities would not change or prevent existing customer usage of the system and continue to allow proprietary usage. + diff --git a/src/main/resources/schema/observability/catalog.json b/src/main/resources/schema/observability/catalog.json index 66f52e705..909ba9379 100644 --- a/src/main/resources/schema/observability/catalog.json +++ b/src/main/resources/schema/observability/catalog.json @@ -31,9 +31,28 @@ "description": "Observability communication covers the host and destination information in the network fields. They contain a set of metadata that provide contextual information about the host, destination and network environment where a system is running. 
They include information such as the host/destination name, IP address, operating system version, and network interface details.", "version": "1.0", "url": "https://github.com/opensearch-project/observability/tree/2.x/schema/observability/logs/communication", - "tags": ["web"], + "tags": [ + "web" + ], "container": false - }] + }, + { + "component": "cloud", + "description": "Observability cloud fields definition contains a set of standardized fields to capture metadata about the cloud environment where an event or metric is generated. This includes information about the cloud provider, region, availability zone, machine type, account details, service name, and instance details. These fields allow for the efficient organization and analysis of data across different cloud platforms, making it easier to monitor and manage resources in a multi-cloud environment.", + "version": "1.0", + "url": "https://github.com/opensearch-project/observability/tree/2.x/schema/observability/logs/cloud", + "container": false, + "tags": [] + }, + { + "component": "container", + "description": "Observability container The container fields definition provides a structured way to store metadata and metrics about containers, which can be used to correlate data across container runtimes. This includes information like the container's unique ID, image name, tags, resource usage (CPU, memory, disk, and network), and runtime management system, allowing users to efficiently monitor and analyze their containerized applications.", + "version": "1.0", + "url": "https://github.com/opensearch-project/observability/tree/2.x/schema/observability/logs/container", + "container": false, + "tags": [] + } + ] }, { "category": "traces", @@ -50,13 +69,22 @@ "container": true }, { - "component": "service", + "component": "services", "description": "Observability services are representations of the dependencies and interactions between various components of a software system. 
They provide a high-level view of how different parts of the system are connected and how data flows between them. By using service maps, engineers and developers can gain a better understanding of how changes in one part of the system can affect other parts, helping them make more informed decisions about troubleshooting and optimization including root cause analysis", "version": "1.0", "url": "https://github.com/opensearch-project/observability/tree/2.x/schema/observability/traces/services", "tags": [], + "container": true + }, + { + "component": "traceGroups", + "description": "Observability trace groups fields are a set of derived fields that are calculated for a trace's root span - they are copied to all the spans in that trace", + "version": "1.0", + "url": "https://github.com/opensearch-project/observability/tree/2.x/schema/observability/traces/traceGroups", + "tags": [], "container": false - }] + } + ] }, { "category": "metrics", @@ -71,7 +99,8 @@ "url": "https://github.com/opensearch-project/observability/tree/2.x/schema/observability/metrics/metrics", "tags": [], "container": true - }] + } + ] } ] } \ No newline at end of file diff --git a/src/main/resources/schema/observability/logs/README.md b/src/main/resources/schema/observability/logs/README.md new file mode 100644 index 000000000..9fcaad705 --- /dev/null +++ b/src/main/resources/schema/observability/logs/README.md @@ -0,0 +1,234 @@ +# Logs Schema Support + +Observability refers to the ability to monitor and diagnose systems and applications in real-time, in order to understand how they are behaving and identify potential issues. +logs serve as a primary source of information for understanding and debugging complex systems. +Logs provide a record of events, errors, and performance that help developers and administrators identify and resolve issues, monitor system behavior, and improve reliability. + +Logs can also be used to gain insight into user behavior and facilitate auditing and compliance. 
+By analyzing logs, one can detect patterns, anomalies, and correlations that can inform decisions and facilitate problem-solving. +Logs help to ensure the visibility, reliability, and stability of systems. + +## Details +The next section provides the Simple Schema for Observability support which conforms with the OTEL specification. + +- logs.mapping presents the template mapping for creating the Simple Schema for Observability index +- logs.schema presents the JSON schema used to verify that a logs document conforms to the mapping structure + +## Logs +See [OTEL Logs convention](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md) +See [OTEL logs protobuf](https://github.com/open-telemetry/opentelemetry-proto/tree/main/opentelemetry/proto/logs/v1) +See [ECS logs](https://github.com/elastic/ecs) + +Simple Schema for Observability conforms with the OTEL logs protocol and was also greatly inspired by the Elastic Common Schema (ECS). + +Simple Schema for Observability defines the following data model: + +## General Fields + +### Event Fields +Within a particular domain, the ```event.name``` attribute identifies the event. Events with the same domain and name are structurally similar to one another. +For example, some domains could have a well-defined schema for their events based on event names. (OTEL driven) + +`event.kind` gives high-level information about what type of information the event contains, without being specific to the contents of the event. (ECS driven) +**Possible values** + - **alert** - indicates an alerting type event which can be triggered by any alerting mechanism + - **enrichment** - indicates an enriched typed event that adds additional context to the original event + - **event** - the default type of the event + - **metric** - indicates that the event describes a numeric measurement + +The `event.domain` attribute is used to logically separate events from different systems.
For example, to record Events from `browser` apps, `mobile` apps and `Kubernetes`, we could use browser, device and k8s as the domain for their Events. +This provides a clean separation of semantics for events in each of the domains. (OTEL driven) + +`event.category` gives categorical-level information about what type of information the event contains; this field is an array. (ECS driven) +**Possible values** + - authentication - events that are part of a challenge and response process by any system that has such responsibilities + - configuration - events related to the configuration of a system or an application + - database - events that are generated as part of the storage system (SQL RDBMS and such) + - driver - events related to the O/S device driver + - email - events related to email messages, email attachments and such + - file - events related to a file that has been created on, or has existed on, a filesystem + - host - events related to host inventory or lifecycle events + - iam - Identity & Access Management log types + - network - events relating to network activities (connection / traffic and such) + - package - events indicating the installation of software packages on hosts + - process - events related to O/S process information + - registry - events related to O/S registry events + - session - events related to a persistent connection between different network components + - web - events related to web server activity + +`event.category` corresponds with `event.domain` + +`event.type` gives fine-grained details of the event's category, including the phase that it is part of. +This will allow proper categorization of some events that fall in multiple event types; this field is an array.
(ECS driven) + +**Possible values** + - access - indication that this event has accessed some resource + - admin - indication that this event is related to the admin context + - allowed - indication that this event was subsequently allowed by some authority system + - change - indication that this event is related to something that has changed + - connection - indication that this event is related to network traffic with indication of connection activity + - creation - indication that this event is related to resources being created + - deletion - indication that this event is related to resources being deleted + - denied - indication that this event is related to resources being denied access + - error - indication of an error related event + - group - indication of events that are related to group objects + - info - indication of events which are informative without other distinct classification + - installation - indication that this event is related to resources being installed + - protocol - indication that the event is related to specific knowledge of protocol info + - end - indication that the event is related to some termination state + - start - indication that the event is related to some initiation state + - user - indication that the event is related to specific knowledge of a user resource + + +`event.result` gives a success or a failure indication from the perspective of the entity that produced the event. (ECS driven) + +**Possible values** + + - failure + - success + - pending + - undetermined + + +#### Exception Fields +This field encapsulates the exception that should appear under the event section `event.exception` (OTEL driven) + + - `message`: The exception message. + - `stacktrace`: A stacktrace as a string in the natural representation for the language runtime. The representation is to be determined and documented by each language SIG. + - `type`: The type of the exception (its fully-qualified class name, if applicable).
The dynamic type of the exception should be preferred over the static type in languages that support it. + +--- + +### Overview + +In observability, Logs are typically unstructured data that is generated by applications or systems as a record of events or messages. +Logs can contain any information that the emitting application or system wants to include, and they often contain free-form text rather than structured data. +This makes logs difficult to process and analyze automatically, but it also provides a lot of flexibility and versatility in terms of what information can be captured. + +According to [ECS](https://github.com/elastic/ecs) and the most recent (experimental) [OTEL](https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/trace/semantic_conventions) definitions, we formalized a unified +log schema. This schema can be used for working with a well-structured set of typed logs arriving from categorical sources. +These sources are expected to report information in a specific way that will simplify future correlations and consolidate similar concerns. + +### data-stream +[data-stream](https://opensearch.org/docs/latest/opensearch/data-streams/) Data streams simplify this process and enforce a setup that best suits time-series data, such as being designed primarily for append-only data and ensuring that each document has a timestamp field. +A data stream is internally composed of multiple backing indices. Search requests are routed to all the backing indices, while indexing requests are routed to the latest write index. + +As part of the Observability naming scheme, the value of the data stream fields combine to the name of the actual data stream : + +`{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`. +This means the fields can only contain characters that are valid as part of names of data streams. 
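The naming pattern above can be sketched as a small Python helper. This is only an illustration, not part of the schema: the function name `data_stream_name` and the set of rejected characters are assumptions made here, and the character check is an illustrative subset of the index-name restrictions rather than an authoritative list.

```python
import re

# Assumption: an illustrative subset of characters that are not accepted
# in index / data stream names (backslash, slash, wildcard, quotes, etc.).
_FORBIDDEN = re.compile(r'[\\/*?"<>|,#:\s]')


def data_stream_name(type_: str, dataset: str, namespace: str) -> str:
    """Combine the three data_stream fields into the data stream name."""
    parts = {"type": type_, "dataset": dataset, "namespace": namespace}
    for field, value in parts.items():
        if not value:
            raise ValueError(f"data_stream.{field} must not be empty")
        if value != value.lower():
            raise ValueError(f"data_stream.{field} must be lowercase: {value!r}")
        if _FORBIDDEN.search(value):
            raise ValueError(f"data_stream.{field} has an invalid character: {value!r}")
    # {data_stream.type}-{data_stream.dataset}-{data_stream.namespace}
    return f"{type_}-{dataset}-{namespace}"
```

For instance, `data_stream_name("sso_logs", "dataset", "test1")` yields `sso_logs-dataset-test1`, while a value containing a space or an uppercase letter is rejected before a name is produced.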
+ + - **type** conforms to one of the supported Observability signals (Traces, Logs, Metrics, Alerts) + - **dataset** user defined field that can mainly be utilized for describing the origin of the signal + - **namespace** user custom field that can be used to describe any customer domain specific classification + +#### Timestamp field +As part of the data-stream definition the `@timestamp` field is mandatory. If the field is not present in the original signal, use `ObservedTimestamp` as the value for this field. + +### Instrumentation scope +This is a logical unit of the application with which the emitted telemetry can be associated. It is typically the developer’s choice to decide what denotes a reasonable instrumentation scope. +The most common approach is to use the instrumentation library as the scope, however other scopes are also common, e.g. a module, a package, or a class can be chosen as the instrumentation scope. + +The instrumentation scope may have zero or more additional attributes that provide additional information about the scope. As an example, the field +`instrumentationScope.attributes.identification` is used to determine the resource origin of the signal and can be used to filter accordingly. + +For example, in the sample [nginx_access-log.json](sample/nginx_access-log.json) this value equals `nginx`, indicating the signal source. + +This field is expected to appear in any future integration or Observability resources into OpenSearch. + + +### Logs Classifications + +#### HTTP +_Inspired by [ECS - http](https://www.elastic.co/guide/en/ecs/current/ecs-http.html), [OTEL - http](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md)_ + + - `method` - HTTP request method.
`GET; POST; HEAD` (OTEL driven) Correspond with `request.method` (ECS driven) + - `status_code` - [Http response code](https://tools.ietf.org/html/rfc7231#section-6) (OTEL driven) Correspond with `response.status_code` (ECS driven) + - `flavor` - Kind of HTTP protocol used. (OTEL driven) Correspond with `version` (ECS driven) + - `user_agent` - Value of the HTTP User-Agent header sent by the client. (OTEL driven) + + - `request.id` - A unique identifier for each HTTP request to correlate logs between clients and servers in transactions. (ECS driven) + - `request_content_length` - The size of the request payload body in bytes. (OTEL driven) Correspond with `request.bytes` (ECS driven) + - `request.body.content` - The full HTTP request body. (ECS driven) + - `request.referrer` - Referrer for this HTTP request. (ECS driven) + - `request.header` - HTTP request headers key/value object (OTEL driven) + - `request.mime_type` - Mime type of the body of the response. (ECS driven) + + - `response_content_length` - The size of the response payload body in bytes. (OTEL driven) Correspond with `response.bytes` (ECS driven) + - `response.body.content` - The full HTTP response body. (ECS driven) + - `response.header` - HTTP response headers key/value object (OTEL driven) + + - `url` - Full HTTP request URL in the form scheme://host[:port]/path?query[#fragment] (OTEL driven) + - `resend_count` - The ordinal number of request resending attempt (OTEL driven) + + - `scheme` - The URI scheme identifying the used protocol. (OTEL driven) + - `target` - The full request target as passed in a HTTP request line or equivalent. (OTEL driven) + - `route` - The matched route (path template in the format used by the respective server framework) (OTEL driven) + - `client_ip` - The IP address of the original client behind all proxies (OTEL driven) `client.ip` - IP address of the client (IPv4 or IPv6). 
(ECS driven) + +#### Communication +Includes client / server part of the communication + + **_Inspired by_** : + - [ECS - client](https://www.elastic.co/guide/en/ecs/8.6/ecs-client.html) + - [ECS - Server](https://www.elastic.co/guide/en/ecs/8.6/ecs-server.html#ecs-server) + - [ECS - Source](https://www.elastic.co/guide/en/ecs/8.6/ecs-source.html) + - [ECS - Destination](https://www.elastic.co/guide/en/ecs/8.6/ecs-destination.html) + - [OTEL - network](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md) + + + - `sock.family` - Protocol address family which is used for communication. (OTEL driven) + + - `client.domain` - The domain name of the client system. (ECS driven) Correspond with `source.domain`, (ECS driven) + - `client.bytes` - Bytes sent from the client to the server. (ECS driven) Correspond with `source.bytes` (ECS driven) + - `client.ip` - IP address of the client (IPv4 or IPv6). (ECS driven) Correspond with `source.ip` (ECS driven) + - `client.mac` - MAC address of the client. (IPv4 or IPv6). (ECS driven) Correspond with `source.mac` (ECS driven) + - `client.packets` - Packets sent from the client to the server. (ECS driven) Correspond with `source.packets` (ECS driven) + + - `sock.host.addr` - Local socket address. (OTEL driven) Correspond with `client.address`,`source.address` (ECS driven) + - `sock.host.port` - Local socket port number. (OTEL driven) Corresponds with `client.port`,`source.port` (ECS driven) + + - `server.domain` - The domain name of the server system. (ECS driven) Correspond with `destination.domain` (ECS driven) + - `server.bytes` - Bytes sent from the server to the client. (ECS driven) Correspond with `destination.bytes` (ECS driven) + - `server.ip` - IP address of the server (IPv4 or IPv6). (ECS driven) Correspond with `destination.ip` (ECS driven) + - `server.mac` - MAC address of the server. (IPv4 or IPv6). 
(ECS driven) Correspond with `destination.mac` (ECS driven) + - `server.packets` - Packets sent from the server to the client.. (ECS driven) Correspond with `destination.packets` (ECS driven) + + - `sock.peer.addr` - Remote socket peer address: IPv4 or IPv6 (OTEL driven) Correspond with `server.address`, `destination.address` (ECS driven) + - `sock.peer.name` - Remote socket peer name. (OTEL driven) Correspond with `destination.domain` (ECS driven) + - `sock.peer.port` - Remote socket peer port. (OTEL driven) (OTEL driven) Corresponds with `server.port`,`destination.port` (ECS driven) + +--- + +```text + + ___________________ + | _______________ | + | |XXXXXXXXXXXXX| | + | |XXXXXXXXXXXXX| | + | |XXXXXXXXXXXXX| | + | |XXXXXXXXXXXXX| | + | |XXXXXXXXXXXXX| | + |_________________| + _[_______]_ + ___[___________]___ + | [_____] []|__ + | [_____] []| \__ + L___________________J \ \___\/ + ___________________ /\ + /###################\ (__) + +``` + +--- + +### References + - https://github.com/opensearch-project/observability/issues/1413 + - https://github.com/opensearch-project/observability/issues/1405 + - https://github.com/opensearch-project/observability/issues/1411 + - https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/semantic_conventions/events.md + - https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md + - https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/http.md + - https://www.elastic.co/guide/en/ecs/8.6/ecs-destination.html + - https://www.elastic.co/guide/en/ecs/8.6/ecs-client.html + - https://www.elastic.co/guide/en/ecs/8.6/ecs-server.html \ No newline at end of file diff --git a/src/main/resources/schema/observability/logs/Usage.md b/src/main/resources/schema/observability/logs/Usage.md new file mode 100644 index 000000000..03fa0c5bb --- /dev/null +++ 
b/src/main/resources/schema/observability/logs/Usage.md @@ -0,0 +1,117 @@ +# Actual Index setup and usage +This document describes how to manually define and initialize the observability schema indices; it assumes the OpenSearch cluster +is up and running. + +### Setting Up the Logs Mapping +Start the OpenSearch cluster and follow these steps to manually set up the Logs mapping template: + +`>> PUT _component_template/http_template` + +Copy the http.mapping content [here](http.mapping) + +`>> PUT _component_template/communication_template` + +Copy the communication.mapping content [here](communication.mapping) + +`>> PUT _index_template/logs` + +Copy the logs.mapping content [here](logs.mapping) + +Now you can create a data-stream index (following the logs index pattern) that has the supported schema: + +`>> PUT _data_stream/sso_logs-dataset-test1` + +You can also directly start ingesting data without creating a data stream. +Because we have a matching index template with a data_stream object, OpenSearch automatically creates the data stream: + +`POST sso_logs-dataset-test1/_doc` +```json +{ + "body": "login attempt failed", + "@timestamp": "2013-03-01T00:00:00", + ...
+} + +``` + +To see information about that data stream: +`GET _data_stream/sso_logs-dataset-test1` + +The response looks like the following: +```json +{ + "data_streams" : [ + { + "name" : "sso_logs-dataset-test1", + "timestamp_field" : { + "name" : "@timestamp" + }, + "indices" : [ + { + "index_name" : ".ds-sso_logs-dataset-test1-000001", + "index_uuid" : "-VhmuhrQQ6ipYCmBhn6vLw" + } + ], + "generation" : 1, + "status" : "GREEN", + "template" : "sso_logs-*-*" + } + ] +} +``` + +To see more insights about the data stream, use the `_stats` endpoint: +`GET _data_stream/sso_logs-dataset-test1/_stats` +The response looks like the following: +```json +{ + "_shards" : { + "total" : 1, + "successful" : 1, + "failed" : 0 + }, + "data_stream_count" : 1, + "backing_indices" : 1, + "total_store_size_bytes" : 208, + "data_streams" : [ + { + "data_stream" : "sso_logs-dataset-test1", + "backing_indices" : 1, + "store_size_bytes" : 208, + "maximum_timestamp" : 0 + } + ] +} +``` +### Ingestion +To ingest data into a data stream, you can use the regular indexing APIs. Make sure every document that you index has a timestamp field. + +`POST sso_logs-dataset-test1/_doc` +```json +{ + "body": "login attempt failed", + "@timestamp": "2013-03-01T00:00:00", + ... +} + +``` +You can search a data stream just like you search a regular index or an index alias. The search operation applies to all of the backing indices (all data present in the stream). + +`GET sso_logs-dataset-test1/_search` +```json +{ + "query": { + "match": { + ... + } + } +} +``` + +### Manage data-streams in OpenSearch + +To manage data streams from OpenSearch Dashboards, open OpenSearch Dashboards, choose Index Management, then select Indices or Policy managed indices.
+ +see additional information: + - https://opensearch.org/docs/latest/opensearch/data-streams/#step-6-manage-data-streams-in-opensearch-dashboards + - https://opensearch.org/docs/latest/opensearch/data-streams/#step-5-rollover-a-data-stream \ No newline at end of file diff --git a/src/main/resources/schema/observability/logs/aws_alb.mapping b/src/main/resources/schema/observability/logs/aws_alb.mapping new file mode 100644 index 000000000..a89ae1225 --- /dev/null +++ b/src/main/resources/schema/observability/logs/aws_alb.mapping @@ -0,0 +1,147 @@ +{ + "template": { + "mappings": { + "_meta": { + "version": "1.0.0", + "catalog": "observability", + "type": "logs", + "component": "aws_alb" + }, + "aws": { + "type": "object", + "elb": { + "properties": { + "name": { + "type": "keyword" + }, + "type": { + "type": "keyword" + }, + "target_group": { + "properties": { + "arn": { + "type": "keyword" + } + } + }, + "listener": { + "type": "keyword" + }, + "protocol": { + "type": "keyword" + }, + "request_processing_time": { + "properties": { + "sec": { + "type": "float" + } + } + }, + "backend_processing_time": { + "properties": { + "sec": { + "type": "float" + } + } + }, + "response_processing_time": { + "properties": { + "sec": { + "type": "float" + } + } + }, + "connection_time": { + "properties": { + "ms": { + "type": "long" + } + } + }, + "tls_handshake_time": { + "properties": { + "ms": { + "type": "long" + } + } + }, + "backend": { + "properties": { + "ip": { + "type": "keyword" + }, + "port": { + "type": "keyword" + }, + "http": { + "properties": { + "response": { + "properties": { + "status_code": { + "type": "long" + } + } + } + } + } + } + }, + "ssl_cipher": { + "type": "keyword" + }, + "ssl_protocol": { + "type": "keyword" + }, + "chosen_cert": { + "properties": { + "arn": { + "type": "keyword" + }, + "serial": { + "type": "keyword" + } + } + }, + "incoming_tls_alert": { + "type": "keyword" + }, + "tls_named_group": { + "type": "keyword" + }, + "trace_id": { + 
"type": "keyword" + }, + "matched_rule_priority": { + "type": "keyword" + }, + "action_executed": { + "type": "keyword" + }, + "redirect_url": { + "type": "keyword" + }, + "error": { + "properties": { + "reason": { + "type": "keyword" + } + } + }, + "target_port": { + "type": "keyword" + }, + "target_status_code": { + "type": "keyword" + }, + "classification": { + "type": "keyword" + }, + "classification_reason": { + "type": "keyword" + } + } + } + } + } + } +} diff --git a/src/main/resources/schema/observability/logs/aws_alb.schema b/src/main/resources/schema/observability/logs/aws_alb.schema new file mode 100644 index 000000000..025be426d --- /dev/null +++ b/src/main/resources/schema/observability/logs/aws_alb.schema @@ -0,0 +1,259 @@ +{ + "$schema": "http://json-schema.org/draft-04/schema#", + "$id": "https://opensearch.org/schemas/observability/AWS_ALB", + "type": "object", + "properties": { + "@timestamp": { + "type": "string" + }, + "aws": { + "type": "object", + "properties": { + "elb": { + "type": "object", + "properties": { + "backend": { + "type": "object", + "properties": { + "http": { + "type": "object", + "properties": { + "response": { + "type": "object", + "properties": { + "status_code": { + "type": "integer" + } + }, + "required": [ + "status_code" + ] + } + }, + "required": [ + "response" + ] + }, + "ip": { + "type": "string" + }, + "port": { + "type": "string" + } + }, + "required": [ + "http", + "ip", + "port" + ] + }, + "backend_processing_time": { + "type": "object", + "properties": { + "sec": { + "type": "number" + } + }, + "required": [ + "sec" + ] + }, + "matched_rule_priority": { + "type": "string" + }, + "name": { + "type": "string" + }, + "protocol": { + "type": "string" + }, + "request_processing_time": { + "type": "object", + "properties": { + "sec": { + "type": "integer" + } + }, + "required": [ + "sec" + ] + }, + "response_processing_time": { + "type": "object", + "properties": { + "sec": { + "type": "integer" + } + }, + "required": 
[ + "sec" + ] + }, + "target_group": { + "type": "object", + "properties": { + "arn": { + "type": "string" + } + }, + "required": [ + "arn" + ] + }, + "target_port": { + "type": "array", + "items": [ + { + "type": "string" + } + ] + }, + "target_status_code": { + "type": "array", + "items": [ + { + "type": "string" + } + ] + }, + "traceId": { + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/traceId" + }, + "type": { + "type": "string" + } + }, + "required": [ + "backend", + "backend_processing_time", + "matched_rule_priority", + "name", + "protocol", + "request_processing_time", + "response_processing_time", + "target_group", + "target_port", + "target_status_code", + "traceId", + "type" + ] + } + }, + "required": [ + "elb" + ] + }, + "cloud": { + "type": "object", + "properties": { + "provider": { + "type": "string" + } + }, + "required": [ + "provider" + ] + }, + "http": { + "type": "object", + "properties": { + "request": { + "type": "object", + "properties": { + "body": { + "type": "object", + "properties": { + "bytes": { + "type": "integer" + } + }, + "required": [ + "bytes" + ] + }, + "method": { + "type": "string" + } + }, + "required": [ + "body", + "method" + ] + }, + "response": { + "type": "object", + "properties": { + "body": { + "type": "object", + "properties": { + "bytes": { + "type": "integer" + } + }, + "required": [ + "bytes" + ] + }, + "status_code": { + "type": "integer" + } + }, + "required": [ + "body", + "status_code" + ] + }, + "url": { + "type": "string" + }, + "schema": { + "type": "string" + } + }, + "required": [ + "request", + "response", + "url", + "schema" + ] + }, + "communication": { + "type": "object", + "properties": { + "source": { + "type": "object", + "properties": { + "address": { + "type": "string" + }, + "ip": { + "type": "string" + }, + "port": { + "type": "integer" + } + }, + "required": [ + "address", + "ip", + "port" + ] + } + }, + "required": [ + "source" + ] + }, + "traceId": { + "type": 
"string" + } + }, + "required": [ + "@timestamp", + "aws", + "cloud", + "http", + "communication", + "traceId" + ] +} \ No newline at end of file diff --git a/src/main/resources/schema/observability/logs/cloud.mapping b/src/main/resources/schema/observability/logs/cloud.mapping new file mode 100644 index 000000000..ece2a2c1f --- /dev/null +++ b/src/main/resources/schema/observability/logs/cloud.mapping @@ -0,0 +1,76 @@ +{ + "template": { + "mappings": { + "_meta": { + "version": "1.0.0", + "catalog": "observability", + "type": "logs", + "component": "cloud" + }, + "properties": { + "cloud": { + "properties": { + "provider": { + "type": "keyword" + }, + "availability_zone": { + "type": "keyword" + }, + "region": { + "type": "keyword" + }, + "machine": { + "type": "object", + "properties": { + "type": { + "type": "keyword" + } + } + }, + "account": { + "type": "object", + "properties": { + "id": { + "type": "keyword" + }, + "name": { + "type": "keyword" + } + } + }, + "service": { + "type": "object", + "properties": { + "name": { + "type": "keyword" + } + } + }, + "project": { + "type": "object", + "properties": { + "id": { + "type": "keyword" + }, + "name": { + "type": "keyword" + } + } + }, + "instance": { + "type": "object", + "properties": { + "id": { + "type": "keyword" + }, + "name": { + "type": "keyword" + } + } + } + } + } + } + } + } +} \ No newline at end of file diff --git a/src/main/resources/schema/observability/logs/communication.mapping b/src/main/resources/schema/observability/logs/communication.mapping index f4325b4b2..73cb945b9 100644 --- a/src/main/resources/schema/observability/logs/communication.mapping +++ b/src/main/resources/schema/observability/logs/communication.mapping @@ -1,6 +1,12 @@ { "template": { "mappings": { + "_meta": { + "version": "1.0.0", + "catalog": "observability", + "type": "logs", + "component": "communication" + }, "properties": { "communication": { "properties": { diff --git 
a/src/main/resources/schema/observability/logs/communication.schema b/src/main/resources/schema/observability/logs/communication.schema index 44caf523b..76f0535eb 100644 --- a/src/main/resources/schema/observability/logs/communication.schema +++ b/src/main/resources/schema/observability/logs/communication.schema @@ -1,6 +1,6 @@ { "$schema": "http://json-schema.org/draft-07/schema#", - "$id": "https://opensearch.org/schemas/Communication", + "$id": "https://opensearch.org/schemas/observability/Communication", "title": "Communication", "type": "object", "properties": { @@ -11,10 +11,10 @@ "type": "string" }, "source": { - "$ref": "/schemas/Source" + "$ref": "#/definitions/Source" }, "destination": { - "$ref": "/schemas/Destination" + "$ref": "#/definitions/Destination" } } }, @@ -22,66 +22,67 @@ "type": "object", "properties": { } - }, - "$defs": { - "Source": { - "$id": "/schemas/Source", - "type": "object", - "additionalProperties": true, - "properties": { - "address": { - "type": "string" - }, - "domain": { - "type": "string" - }, - "bytes": { - "type": "integer" - }, - "ip": { - "type": "string" - }, - "port": { - "type": "integer" - }, - "mac": { - "type": "string" - }, - "packets": { - "type": "integer" - } + } + }, + "definitions": { + "Source": { + "$id": "#/definitions/Source", + "type": "object", + "additionalProperties": true, + "properties": { + "address": { + "type": "string" + }, + "domain": { + "type": "string" + }, + "bytes": { + "type": "integer" }, - "title": "Source" + "ip": { + "type": "string" + }, + "port": { + "type": "integer" + }, + "mac": { + "type": "string" + }, + "packets": { + "type": "integer" + } }, - "Destination": { - "$id": "/schemas/Destination", - "type": "object", - "additionalProperties": true, - "properties": { - "address": { - "type": "string" - }, - "domain": { - "type": "string" - }, - "bytes": { - "type": "integer" - }, - "ip": { - "type": "string" - }, - "port": { - "type": "integer" - }, - "mac": { - "type": "string" - }, 
- "packets": { - "type": "integer" - } + "title": "Source" + }, + "Destination": { + "$id": "#/definitions/Destination", + "type": "object", + "additionalProperties": true, + "properties": { + "address": { + "type": "string" }, - "title": "Destination" - } + "domain": { + "type": "string" + }, + "bytes": { + "type": "integer" + }, + "ip": { + "type": "string" + }, + "port": { + "type": "integer" + }, + "mac": { + "type": "string" + }, + "packets": { + "type": "integer" + } + }, + "title": "Destination" } } + } diff --git a/src/main/resources/schema/observability/logs/container.mapping b/src/main/resources/schema/observability/logs/container.mapping new file mode 100644 index 000000000..edcff9897 --- /dev/null +++ b/src/main/resources/schema/observability/logs/container.mapping @@ -0,0 +1,67 @@ +{ + "template": { + "mappings": { + "_meta": { + "version": "1.0.0", + "catalog": "observability", + "type": "logs", + "component" : "container" + }, + "properties": { + "container": { + "properties": { + "image": { + "type": "object", + "properties": { + "name": { + "type": "keyword" + }, + "tag": { + "type": "keyword" + }, + "hash": { + "type": "keyword" + } + } + }, + "id": { + "type": "keyword" + }, + "name": { + "type": "keyword" + }, + "labels": { + "type": "keyword" + }, + "runtime": { + "type": "keyword" + }, + "memory.usage": { + "type": "float" + }, + "network": { + "type": "object", + "properties": { + "ingress.bytes": { + "type": "long" + }, + "egress.bytes": { + "type": "long" + } + } + }, + "cpu.usage": { + "type": "float" + }, + "disk.read.bytes": { + "type": "long" + }, + "disk.write.bytes": { + "type": "long" + } + } + } + } + } + } +} \ No newline at end of file diff --git a/src/main/resources/schema/observability/logs/http.mapping b/src/main/resources/schema/observability/logs/http.mapping index f55b3a5f0..cf70dac93 100644 --- a/src/main/resources/schema/observability/logs/http.mapping +++ b/src/main/resources/schema/observability/logs/http.mapping @@ -1,6 
+1,12 @@ { "template": { "mappings": { + "_meta": { + "version": "1.0.0", + "catalog": "observability", + "type": "logs", + "component": "http" + }, "dynamic_templates": [ { "request_header_map": { diff --git a/src/main/resources/schema/observability/logs/http.schema b/src/main/resources/schema/observability/logs/http.schema index 85ace4283..b25fa81fc 100644 --- a/src/main/resources/schema/observability/logs/http.schema +++ b/src/main/resources/schema/observability/logs/http.schema @@ -1,14 +1,14 @@ { "$schema": "http://json-schema.org/draft-07/schema#", - "$id": "https://opensearch.org/schemas/Http", + "$id": "https://opensearch.org/schemas/observability/Http", "title": "Http", "type": "object", "properties": { "request": { - "$ref": "/schemas/Request" + "$ref": "#/definitions/Request" }, "response": { - "$ref": "/schemas/Response" + "$ref": "#/definitions/Response" }, "flavor": { "type": "string" @@ -35,9 +35,9 @@ "type": "integer" } }, - "$defs": { + "definitions": { "Request": { - "$id": "/schemas/Request", + "$id": "#/definitions/Request", "type": "object", "additionalProperties": true, "properties": { @@ -66,7 +66,7 @@ "title": "Request" }, "Response": { - "$id": "/schemas/Response", + "$id": "#/definitions/Response", "type": "object", "additionalProperties": true, "properties": { diff --git a/src/main/resources/schema/observability/logs/logs.mapping b/src/main/resources/schema/observability/logs/logs.mapping index 2fdac21f5..ad87bd4ad 100644 --- a/src/main/resources/schema/observability/logs/logs.mapping +++ b/src/main/resources/schema/observability/logs/logs.mapping @@ -6,7 +6,22 @@ "template": { "mappings": { "_meta": { - "version": "1.0.0" + "version": "1.0.0", + "catalog": "observability", + "type": "logs", + "component": "log", + "correlations": [ + { + "field": "spanId", + "foreign-schema": "traces", + "foreign-field": "spanId" + }, + { + "field": "traceId", + "foreign-schema": "traces", + "foreign-field": "traceId" + } + ] }, "_source": { "enabled": 
true @@ -199,6 +214,18 @@ "_meta": { "description": "Simple Schema For Observability", "catalog": "observability", - "type": "logs" + "type": "logs", + "correlations": [ + { + "field": "spanId", + "foreign-schema": "traces", + "foreign-field": "spanId" + }, + { + "field": "traceId", + "foreign-schema": "traces", + "foreign-field": "traceId" + } + ] } } \ No newline at end of file diff --git a/src/main/resources/schema/observability/logs/logs.schema b/src/main/resources/schema/observability/logs/logs.schema index c014c12ac..10b10e647 100644 --- a/src/main/resources/schema/observability/logs/logs.schema +++ b/src/main/resources/schema/observability/logs/logs.schema @@ -1,6 +1,6 @@ { "$schema": "http://json-schema.org/draft-07/schema#", - "$id": "https://opensearch.org#/definitions/Logs", + "$id": "https://opensearch.org/schema/observability/Logs", "title": "OpenTelemetry Logs", "type": "object", "properties": { @@ -25,10 +25,10 @@ "format": "date-time" }, "traceId": { - "type": "string" + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/traceId" }, "spanId": { - "type": "string" + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/spanId" }, "schemaUrl": { "type": "string" @@ -165,7 +165,7 @@ "event", "metric", "state", - "pipeline_error", + "error", "signal" ] }, diff --git a/src/main/resources/schema/observability/metrics/README.md b/src/main/resources/schema/observability/metrics/README.md new file mode 100644 index 000000000..78e940f51 --- /dev/null +++ b/src/main/resources/schema/observability/metrics/README.md @@ -0,0 +1,105 @@ +# Metrics Schema Support + +Observability refers to the ability to monitor and diagnose systems and applications in real-time, in order to understand how they are behaving and identify potential issues. +Metrics present a critical component of observability, providing quantifiable data about the performance and behavior of systems and applications. 
+The importance of supporting metrics structured schema lies in the fact that it enables better analysis and understanding of system behavior. + +A structured schema provides a clear, consistent format, making it easier for observability tools to process and aggregate the data. +This in turn makes it easier for engineers to understand the performance and behavior of their systems, and quickly identify potential issues. + +When metrics are unstructured, it can be difficult for observability tools to extract meaningful information from them. +For example, if the data for a particular metric is not consistently recorded in the same format, it can be difficult to compare and analyze performance data over time. +Similarly, if metrics are not consistently named or categorized, it can be difficult to understand their context and significance. + +With a structured schema in place, observability tools can automatically extract and aggregate data, making it easier to understand system behavior at a high level. +This can help teams quickly identify performance bottlenecks, track changes in system behavior over time, and make informed decisions about system performance optimization. + +## Details +The next section provides the Simple Schema for Observability support which conforms with the OTEL specification. 
+ +- metrics.mapping presents the template mapping for creating the Simple Schema for Observability index +- metrics.schema presents the json schema validation for verification of a metrics document conforms to the mapping structure + +## Metrics +see [OTEL metrics convention](https://opentelemetry.io/docs/reference/specification/metrics/) +see [OTEL metrics protobuf](https://github.com/open-telemetry/opentelemetry-proto/tree/main/opentelemetry/proto/metrics/v1) + +Simple Schema for Observability conforms with OTEL metrics protocol which defines the next data model: + +#### Timestamp field +As part of the data-stream definition the `@timestamp` is mandatory, if the field is not present in the original signal populate this field using `ObservedTimestamp` as value. + +### Instrumentation scope +This is a logical unit of the application with which the emitted telemetry can be associated. It is typically the developer’s choice to decide what denotes a reasonable instrumentation scope. +The most common approach is to use the instrumentation library as the scope, however other scopes are also common, e.g. a module, a package, or a class can be chosen as the instrumentation scope. + +The instrumentation scope may have zero or more additional attributes that provide additional information about the scope. As an example the field +`instrumentationScope.attributes.identification` is presented will be used to determine the resource origin of the signal and can be used to filter accordingly + +### Overview +Metrics are a specific kind of telemetry data. They represent a snapshot of the current state for a set of data. +Metrics are distinct from logs or events, which focus on records or information about individual events. + +Metrics expresses all system states as numerical values; counts, current values and such. 
+Metrics tend to aggregate data temporally. While this can lose information, the reduction in overhead is an engineering trade-off commonly chosen in many modern monitoring systems. + +Time series are a record of changing information over time. While time series can support arbitrary strings or binary data, only numeric data is in our scope. +Common examples of metric time series would be network interface counters, device temperatures, BGP connection states, and alert states. + +### Metric streams +In a similar way to the data_stream attribute field representing the category of a trace, the metric streams are grouped into individual Metric objects, identified by: + + - The originating Resource attributes + - The instrumentation Scope (e.g., instrumentation library name, version) + - The metric stream’s name + +### Metrics +A Metric object is defined by the following properties: + + - The data point type (e.g. Sum, Gauge, Histogram, ExponentialHistogram, Summary) + - The metric stream’s unit + - The data point properties, where applicable: AggregationTemporality, Monotonic + +The description is also present in the metrics object but is not part of the identification fields +_- The metric stream’s description_ + + +### Data Types + +**Values:** Metric values MUST be either floating-point numbers or integers. + +**Attributes:** Labels are key-value pairs consisting of strings as keys and Any type as values (strings, object, array) + +**MetricPoint:** Each MetricPoint consists of a set of values, depending on the MetricFamily type. + +**Metric** Metrics are defined by a unique set of attributes (dimensions) within a MetricFamily. + +--- + +Metrics MUST contain a list of one or more MetricPoints. Metrics with the same name for a given MetricFamily SHOULD have the same set of label names in their LabelSet. + +* Metrics.name: String value representation of the metric's purpose +* Metrics.type: Valid values are "gauge", "counter", "histogram", and "summary".
+* Metrics.unit: Specifies the MetricFamily units. + +## Metric Types + +### Gauge +Gauges are current measurements, such as bytes of memory currently used or the number of items in a queue. For gauges, the absolute value is what is of interest to a user. +**_A MetricPoint in a Metric with the type gauge MUST have a single value._** +Gauges MAY increase, decrease, or stay constant over time. Even if they only ever go in one direction, they might still be gauges and not counters. + +### Counter +Counters measure discrete events. Common examples are the number of HTTP requests received, CPU seconds spent, or bytes sent. For counters, how quickly they are increasing over time is what is of interest to a user. +**_A MetricPoint in a Metric with the type Counter MUST have one value called Total._** + +### Histogram / Exponential-Histogram +Histograms measure distributions of discrete events. Common examples are the latency of HTTP requests, function runtimes, or I/O request sizes. +**_A Histogram MetricPoint MUST contain at least one bucket_**, and SHOULD contain Sum and Created values. Every bucket MUST have a threshold and a value. + +### Summary +Summaries also measure distributions of discrete events and MAY be used when Histograms are too expensive and/or an average event size is sufficient. +**_A Summary MetricPoint MAY consist of a Count, Sum, Created, and a set of quantiles._** +Semantically, Count and Sum values are counters and MUST be integers.
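The per-type rules above can be sketched as a small check. This is a minimal illustration in Python, not part of the schema: the function name `check_metric_point` and the point keys (`value`, `total`, `buckets`, `threshold`, `count`) are hypothetical and would need to be mapped onto the actual field names in `metrics.schema`. For Summary it only checks that Count, when present, is an integer.

```python
from typing import Any, Dict, List


def check_metric_point(metric_type: str, point: Dict[str, Any]) -> List[str]:
    """Return the rule violations for one MetricPoint (empty list = no violations).

    The key names used here are hypothetical, chosen only for illustration.
    """
    errors: List[str] = []
    kind = metric_type.lower()

    if kind == "gauge":
        # A gauge point MUST have a single value.
        if "value" not in point:
            errors.append("gauge: missing value")
    elif kind == "counter":
        # A counter point MUST have one value called Total.
        if "total" not in point:
            errors.append("counter: missing total")
    elif kind == "histogram":
        # A histogram point MUST contain at least one bucket,
        # and every bucket MUST have a threshold and a value.
        buckets = point.get("buckets") or []
        if not buckets:
            errors.append("histogram: at least one bucket is required")
        for index, bucket in enumerate(buckets):
            for key in ("threshold", "value"):
                if key not in bucket:
                    errors.append(f"histogram: bucket {index} is missing {key}")
    elif kind == "summary":
        # All summary fields are optional; Count, when present, must be an integer.
        count = point.get("count")
        if count is not None and (isinstance(count, bool) or not isinstance(count, int)):
            errors.append("summary: count must be an integer")
    else:
        errors.append(f"unknown metric type: {metric_type}")

    return errors
```

A caller could run this before indexing a document and reject or log any point for which the returned list is not empty.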
+ + diff --git a/src/main/resources/schema/observability/metrics/metrics.mapping b/src/main/resources/schema/observability/metrics/metrics.mapping index 0e66087a0..f45a9d8f9 100644 --- a/src/main/resources/schema/observability/metrics/metrics.mapping +++ b/src/main/resources/schema/observability/metrics/metrics.mapping @@ -6,7 +6,22 @@ "template": { "mappings": { "_meta": { - "version": "1.0.0" + "version": "1.0.0", + "catalog": "observability", + "type": "metrics", + "component": "metrics", + "correlations" : [ + { + "field": "spanId", + "foreign-schema" : "traces", + "foreign-field" : "spanId" + }, + { + "field": "traceId", + "foreign-schema" : "traces", + "foreign-field" : "traceId" + } + ] }, "_source": { "enabled": true @@ -285,6 +300,18 @@ "_meta": { "description": "Observability Metrics Mapping Template", "catalog": "observability", - "type": "metrics" + "type": "metrics", + "correlations" : [ + { + "field": "spanId", + "foreign-schema" : "traces", + "foreign-field" : "spanId" + }, + { + "field": "traceId", + "foreign-schema" : "traces", + "foreign-field" : "traceId" + } + ] } } \ No newline at end of file diff --git a/src/main/resources/schema/observability/metrics/metrics.schema b/src/main/resources/schema/observability/metrics/metrics.schema index 27a8cd044..83d504a43 100644 --- a/src/main/resources/schema/observability/metrics/metrics.schema +++ b/src/main/resources/schema/observability/metrics/metrics.schema @@ -1,6 +1,6 @@ { "$schema": "http://json-schema.org/draft-07/schema#", - "$id": "https://opensearch.org#/definitions/Metrics", + "$id": "https://opensearch.org/schema/observability/Metrics", "title": "OpenTelemetry Metrics", "type": "object", "properties": { @@ -227,10 +227,10 @@ "format": "date-time" }, "spanId": { - "type": "string" + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/spanId" }, "traceId": { - "type": "string" + "$ref": "https://opensearch.org/schemas/observability/Span#/properties/traceId" } }, "required": [ 
@@ -243,6 +243,9 @@ "type": "object", "additionalProperties": true, "properties": { + "serviceName": { + "type": "string" + }, "data_stream": { "$ref": "#/definitions/Dataflow" } diff --git a/src/main/resources/schema/observability/traces/README.md b/src/main/resources/schema/observability/traces/README.md new file mode 100644 index 000000000..dcf16adcb --- /dev/null +++ b/src/main/resources/schema/observability/traces/README.md @@ -0,0 +1,153 @@ +# Traces Schema Support +Observability in the software industry is the ability to monitor and diagnose systems and applications in real-time, in order to understand how they are behaving and identify potential issues. +Traces are a critical component of observability, providing detailed information about the flow of requests through a system, including timing information and any relevant contextual data. + +The importance of supporting traces schema lies in the fact that it enables better analysis and understanding of system behavior. +A structured schema provides a clear, consistent format for traces, making it easier for observability tools to process and aggregate the data. +This in turn makes it easier for engineers to understand the performance and behavior of their systems, and quickly identify potential issues. + +When traces are unstructured, it can be difficult for observability tools to extract meaningful information from them - For example, if the timing information for a particular request is not consistently represented in the same format, +it can be difficult to compare and analyze performance data over time. Similarly, if contextual data is not consistently recorded, it can be difficult to understand the context in which a particular request was executed. + +With a structured schema in place, observability tools can automatically extract and aggregate data, making it easier to understand system behavior at a high level. 
+This can help teams quickly identify performance bottlenecks, track the root cause of errors, and resolve issues more efficiently. + +## Details +This section describes the Simple Schema for Observability support for traces, which conforms to the OTEL specification. + +- traces.mapping is the template mapping used to create the Simple Schema for Observability index +- traces.schema is the JSON schema used to verify that a trace document conforms to the mapping structure + +### data-stream +[Data streams](https://opensearch.org/docs/latest/opensearch/data-streams/) simplify the management of time-series indices and enforce a setup that best suits time-series data, such as being designed primarily for append-only data and ensuring that each document has a timestamp field. +A data stream is internally composed of multiple backing indices. Search requests are routed to all the backing indices, while indexing requests are routed to the latest write index. + +As part of the Observability naming scheme, the values of the data stream fields combine into the name of the actual data stream: + +`{data_stream.type}-{data_stream.dataset}-{data_stream.namespace}`. +This means the fields can only contain characters that are valid in data stream names. + +- **type** conforms to one of the supported Observability signals (Traces, Logs, Metrics, Alerts) +- **dataset** is a user-defined field, mainly used to describe the origin of the signal +- **namespace** is a user-defined field that can be used to describe any customer domain-specific classification + +#### Timestamp field +As part of the data-stream definition, the `@timestamp` field is mandatory. If the field is not present to begin with, use the value of `observedTimestamp` for this field. +**Note** - the `@timestamp` value is the time the signal actually occurred, and `observedTimestamp` is the time the exporter reads the actual event record.
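The naming scheme and the timestamp fallback described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the helper names `data_stream_name` and `ensure_timestamp` are hypothetical, the field is assumed to be spelled `observedTimestamp` in the document, and rejecting `-` inside a part is an assumption made here to keep the composed name unambiguous, not a rule taken from this schema.

```python
from typing import Any, Dict


def data_stream_name(data_stream: Dict[str, str]) -> str:
    """Compose {type}-{dataset}-{namespace} from the data_stream fields."""
    parts = [data_stream[key] for key in ("type", "dataset", "namespace")]
    for part in parts:
        # Assumption: a part must be non-empty and must not contain the
        # "-" separator, otherwise the name cannot be split back into parts.
        if not part or "-" in part:
            raise ValueError(f"invalid data_stream part: {part!r}")
    return "-".join(parts)


def ensure_timestamp(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc whose @timestamp falls back to observedTimestamp."""
    result = dict(doc)
    if "@timestamp" not in result:
        if "observedTimestamp" not in result:
            raise ValueError("document has neither @timestamp nor observedTimestamp")
        result["@timestamp"] = result["observedTimestamp"]
    return result
```

An ingestion step could call `ensure_timestamp` on each document and `data_stream_name` on its `data_stream` attributes to decide which stream the document is written to.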
+ +### Instrumentation scope +This is a logical unit of the application with which the emitted telemetry can be associated. It is typically the developer’s choice to decide what denotes a reasonable instrumentation scope. +The most common approach is to use the instrumentation library as the scope, however other scopes are also common, e.g. a module, a package, or a class can be chosen as the instrumentation scope. + +The instrumentation scope may have zero or more additional attributes that provide additional information about the scope. As an example the field +`instrumentationScope.attributes.identification` is presented will be used to determine the resource origin of the signal and can be used to filter accordingly + +## Traces +see [OTEL traces convention](https://github.com/open-telemetry/opentelemetry-specification/tree/main/semantic_conventions/trace) + +Traces are defined implicitly by their Spans - In particular, a Trace can be thought of as a directed acyclic graph (DAG) of Spans, where the edges between Spans are defined as parent/child relationship. + +## Spans +A span represents an operation within a transaction. Each Span encapsulates the following state: + +* An operation name +* start and finish timestamp +* Attributes list of key-value pairs.
+* Set of Events, each of which is itself a tuple (timestamp, name, Attributes) +* Parent's Span identifier. +* Links to causally-related Spans (via the SpanContext of those related Spans). +* SpanContext information required to reference a Span. + +### SpanContext +Represents all the information that identifies a Span in the Trace and is propagated to child Spans and across process boundaries. +A **SpanContext** contains the tracing identifiers and the options that are propagated from parent to child Spans. + +* `TraceId` - The identifier for a trace, globally unique with practically sufficient probability because it is made of 16 randomly generated bytes. It is used to group all spans for a specific trace together across all processes. +* `SpanId` - The identifier for a span, globally unique with practically sufficient probability because it is made of 8 randomly generated bytes. When passed to a child Span, this identifier becomes the parent span id for the child Span. +* `Tracestate` - Carries tracing-system-specific context in a list of key-value pairs. Tracestate allows different vendors to propagate additional information and inter-operate with their legacy Id formats. + +Additional fields can be supported via the Attributes key/value store, see [traces](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/README.md) + +### Structure +The default fields that are supported by the traces are: +- **TraceId** : The identifier for a trace, globally unique with practically sufficient probability because it is made of 16 randomly generated bytes. It is used to group all spans for a specific trace together across all processes. +- **SpanId** : The identifier for a span, globally unique with practically sufficient probability because it is made of 8 randomly generated bytes. When passed to a child Span, this identifier becomes the parent span id for the child Span. +- **ParentId** : The identifier for a span's parent span.
+- **TraceState** : carries tracing-system specific context in a list of key value pairs. Tracestate allows different vendors propagate additional information and inter-operate with their legacy Id formats. + +- **Name** : String representing the span's name +- **Kind** + - SpanKind.CLIENT + - SpanKind.SERVER + - SpanKind.CONSUMER + - SpanKind.PRODUCER + - SpanKind.INTERNAL + +- **StartTime** : Start time of the event +- **EndTime** : End time of the event +- **Attributes** + - An Attribute is a key-value pair, which has the following structure [Attributes](https://github.com/open-telemetry/opentelemetry-specification/blob/b00980832b4b823155001df56dbf9203d4e53f98/specification/common/README.md#attribute) + +- **DroppedAttributesCount** : Integer counting the dropped attributes +- **Events** : A set of the next tuples (timestamp, name, Attributes) +- **DroppedEventsCount** : Integer counting the dropped events +- **Links** : links to causally-related Spans +- **DroppedLinksCount** : Integer counting the dropped links +- **Status** - + + _status code is the int value + status message is the text representation_ + + - `UNSET = 0` : The default status. + - `OK = 1` : The operation has been validated by an Application developer or Operator to have completed successfully. + - `ERROR = 2` : The operation contains an error. \ No newline at end of file diff --git a/src/main/resources/schema/observability/traces/Usage.md b/src/main/resources/schema/observability/traces/Usage.md new file mode 100644 index 000000000..31fe0b9b3 --- /dev/null +++ b/src/main/resources/schema/observability/traces/Usage.md @@ -0,0 +1,113 @@ +# Actual Index setup and usage +The next document describes the practicality of manually defining and initiating of the observability schema indices, it assumes the OpenSearch cluster +is up and running. 
+ +### Setting Up the Mapping +Start the OpenSearch cluster and follow the next steps for manually setup of the Log mapping template: + +`>> PUT _component_template/tracegroups_template` + +Copy the [traceGroups.mapping](traceGroups.mapping) + +`>> PUT _index_template/traces` + +Copy the [traces.mapping](traces.mapping) + +Now you can create an data-stream index (following the logs index pattern) that has the supported schema: + +`>> PUT _data_stream/sso_traces-dataset-test` + +You can also directly start ingesting data without creating a data stream. +Because we have a matching index template with a data_stream object, OpenSearch automatically creates the data stream: + +`POST sso_traces-dataset-test/_doc` +```json +{ + "body": "login attempt failed", + "@timestamp": "2013-03-01T00:00:00", + ... +} + +``` + +To see information about a that data stream: +`GET _data_stream/sso_traces-dataset-test` + +Would respond the following: +```json +{ + "data_streams" : [ + { + "name" : "sso_traces-dataset-test", + "timestamp_field" : { + "name" : "@timestamp" + }, + "indices" : [ + { + "index_name" : ".ds-sso_traces-dataset-test-000001", + "index_uuid" : "-VhmuhrQQ6ipYCmBhn6vLw" + } + ], + "generation" : 1, + "status" : "GREEN", + "template" : "sso_traces-*-*" + } + ] +} +``` + +To see more insights about the data stream, use the `_stats` endpoint: +`GET _data_stream/sso_traces-dataset-test/_stats` +Would respond the following: +```json +{ + "_shards" : { + "total" : 1, + "successful" : 1, + "failed" : 0 + }, + "data_stream_count" : 1, + "backing_indices" : 1, + "total_store_size_bytes" : 208, + "data_streams" : [ + { + "data_stream" : "sso_traces-dataset-test", + "backing_indices" : 1, + "store_size_bytes" : 208, + "maximum_timestamp" : 0 + } + ] +} +``` +### Ingestion +To ingest data into a data stream, you can use the regular indexing APIs. Make sure every document that you index has a timestamp field. 
+ +`POST sso_traces-dataset-test/_doc` +```json +{ + "body": "login attempt failed", + "@timestamp": "2013-03-01T00:00:00", + ... +} + +``` +You can search a data stream just like you search a regular index or an index alias. The search operation applies to all of the backing indices (all data present in the stream). + +`GET sso_traces-dataset-test/_search` +```json +{ + "query": { + "match": { + ... + } + } +} +``` + +### Manage data-streams in OpenSearch + +To manage data streams from OpenSearch Dashboards, open OpenSearch Dashboards, choose Index Management, then select Indices or Policy managed indices. + +See additional information: + - https://opensearch.org/docs/latest/opensearch/data-streams/#step-6-manage-data-streams-in-opensearch-dashboards + - https://opensearch.org/docs/latest/opensearch/data-streams/#step-5-rollover-a-data-stream \ No newline at end of file diff --git a/src/main/resources/schema/observability/traces/services.mapping b/src/main/resources/schema/observability/traces/services.mapping index 06b9ab681..377596d7b 100644 --- a/src/main/resources/schema/observability/traces/services.mapping +++ b/src/main/resources/schema/observability/traces/services.mapping @@ -1,65 +1,102 @@ { + "index_patterns": [ + "sso_services-*-*" + ], "template": { "mappings": { - "mappings": { - "dynamic_templates": [ + "_meta": { + "version": "1.0.0", + "catalog": "observability", + "type": "traces", + "component": "services", + "correlations": [ { - "strings_as_keyword": { - "match_mapping_type": "string", - "mapping": { - "ignore_above": 1024, - "type": "keyword" - } - } + "field": "traceGroupName", + "foreign-schema": "traceGroups", + "foreign-field": "traceGroup" } - ], - "date_detection": false, - "properties": { - "services": { - "destination": { - "properties": { - "domain": { - "type": "keyword", - "ignore_above": 1024 - }, - "resource": { - "type": "keyword", - "ignore_above": 1024 - } - } + ] + }, + "_source": { + "enabled": true + }, + "dynamic_templates": [ +
{ + "attributes_map": { + "mapping": { + "type": "keyword" }, - "hashId": { + "path_match": "attributes.*" + } + } + ], + "properties": { + "destination": { + "properties": { + "domain": { "type": "keyword", "ignore_above": 1024 }, - "kind": { + "resource": { "type": "keyword", "ignore_above": 1024 - }, - "serviceName": { + } + } + }, + "hashId": { + "type": "keyword", + "ignore_above": 1024 + }, + "serviceName": { + "type": "keyword", + "ignore_above": 1024 + }, + "kind": { + "type": "keyword", + "ignore_above": 1024 + }, + "target": { + "properties": { + "domain": { "type": "keyword", "ignore_above": 1024 }, - "target": { - "properties": { - "domain": { - "type": "keyword", - "ignore_above": 1024 - }, - "resource": { - "type": "keyword", - "ignore_above": 1024 - } - } - }, - "traceGroupName": { + "resource": { "type": "keyword", "ignore_above": 1024 } } + }, + "traceGroupName": { + "type": "keyword", + "ignore_above": 1024 } } + }, + "settings": { + "index": { + "mapping": { + "total_fields": { + "limit": 10000 + } + }, + "refresh_interval": "5s" + } } + }, + "composed_of": [ + ], + "version": 1, + "_meta": { + "description": "Simple Schema For Observability Service", + "catalog": "observability", + "type": "services", + "correlations": [ + { + "field": "traceGroupName", + "foreign-schema": "traceGroups", + "foreign-field": "traceGroup" + } + ] } -} } \ No newline at end of file diff --git a/src/main/resources/schema/observability/traces/services.schema b/src/main/resources/schema/observability/traces/services.schema new file mode 100644 index 000000000..607181c75 --- /dev/null +++ b/src/main/resources/schema/observability/traces/services.schema @@ -0,0 +1,45 @@ +{ + "$schema": "http://json-schema.org/draft-04/schema#", + "$id": "https://opensearch.org/schemas/observability/Service", + "type": "object", + "properties": { + "serviceName": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "destination": { + "type": "object", + "properties": { + 
"resource": { + "type": "string" + }, + "domain": { + "type": "string" + } + }, + "required": [ + "resource", + "domain" + ] + }, + "target": { + "type": "null" + }, + "traceGroupName": { + "$ref": "https://opensearch.org/schemas/observability/TraceGroups#/properties/traceGroup" + }, + "hashId": { + "type": "string" + } + }, + "required": [ + "serviceName", + "kind", + "destination", + "target", + "traceGroupName", + "hashId" + ] +} diff --git a/src/main/resources/schema/observability/traces/traceGroups.mapping b/src/main/resources/schema/observability/traces/traceGroups.mapping new file mode 100644 index 000000000..6a564dff5 --- /dev/null +++ b/src/main/resources/schema/observability/traces/traceGroups.mapping @@ -0,0 +1,31 @@ +{ + "template": { + "mappings": { + "_meta": { + "version": "1.0.0", + "catalog": "observability", + "type": "traces", + "component": "traceGroups" + }, + "properties": { + "traceGroup": { + "ignore_above": 1024, + "type": "keyword" + }, + "traceGroupFields": { + "properties": { + "endTime": { + "type": "date_nanos" + }, + "durationInNanos": { + "type": "long" + }, + "statusCode": { + "type": "integer" + } + } + } + } + } + } +} \ No newline at end of file diff --git a/src/main/resources/schema/observability/traces/traceGroups.schema b/src/main/resources/schema/observability/traces/traceGroups.schema new file mode 100644 index 000000000..15b5416a0 --- /dev/null +++ b/src/main/resources/schema/observability/traces/traceGroups.schema @@ -0,0 +1,33 @@ +{ + "$schema": "http://json-schema.org/draft-04/schema#", + "$id": "https://opensearch.org/schemas/observability/TraceGroups", + "type": "object", + "properties": { + "traceGroupFields": { + "type": "object", + "properties": { + "endTime": { + "type": "string" + }, + "durationInNanos": { + "type": "integer" + }, + "statusCode": { + "type": "integer" + } + }, + "required": [ + "endTime", + "durationInNanos", + "statusCode" + ] + }, + "traceGroup": { + "type": "string" + } + }, + "required": [ + 
"traceGroupFields", + "traceGroup" + ] +} \ No newline at end of file diff --git a/src/main/resources/schema/observability/traces/traces.mapping b/src/main/resources/schema/observability/traces/traces.mapping index a6be2931f..eaeb0e71e 100644 --- a/src/main/resources/schema/observability/traces/traces.mapping +++ b/src/main/resources/schema/observability/traces/traces.mapping @@ -6,7 +6,17 @@ "template": { "mappings": { "_meta": { - "version": "1.0.0" + "version": "1.0.0", + "catalog": "observability", + "type": "traces", + "component": "trace", + "correlations": [ + { + "field": "serviceName", + "foreign-schema": "services", + "foreign-field": "spanId" + } + ] }, "dynamic_templates": [ { @@ -112,6 +122,9 @@ "attributes": { "type": "object", "properties": { + "serviceName": { + "type": "keyword" + }, "data_stream": { "properties": { "dataset": { @@ -190,11 +203,19 @@ } }, "composed_of": [ + "tracegroups_template" ], "version": 1, "_meta": { "description": "Observability Traces Mapping Template", "catalog": "observability", - "type": "traces" + "type": "traces", + "correlations": [ + { + "field": "serviceName", + "foreign-schema": "services", + "foreign-field": "spanId" + } + ] } } \ No newline at end of file diff --git a/src/main/resources/schema/observability/traces/traces.schema b/src/main/resources/schema/observability/traces/traces.schema index 6225c0ed4..518a81fbb 100644 --- a/src/main/resources/schema/observability/traces/traces.schema +++ b/src/main/resources/schema/observability/traces/traces.schema @@ -1,6 +1,6 @@ { "$schema": "http://json-schema.org/draft-06/schema#", - "$id": "https://opensearch.org/schemas/Span", + "$id": "https://opensearch.org/schemas/observability/Span", "type": "object", "additionalProperties": false, "properties": { @@ -21,7 +21,7 @@ "$ref": "#/definitions/Status" }, "parentSpanId": { - "type": "string" + "$ref": "#/properties/spanId" }, "name": { "type": "string" @@ -40,6 +40,10 @@ ] } }, + "@timestamp": { + "type": "string", + 
"format": "date-time" + }, "startTime": { "type": "string", "format": "date-time" @@ -67,10 +71,10 @@ "additionalProperties": false, "properties": { "traceId": { - "type": "string" + "$ref": "#/properties/traceId" }, "spanId": { - "type": "string" + "$ref": "#/properties/spanId" }, "traceState": { "type": "array", @@ -112,6 +116,7 @@ "required": [ "traceId", "spanId", + "@timestamp", "startTime", "endTime", "kind", @@ -188,6 +193,9 @@ "type": "object", "additionalProperties": true, "properties": { + "serviceName": { + "$ref": "https://opensearch.org/schemas/observability/Service#/properties/serviceName" + }, "data_stream": { "$ref": "#/definitions/Dataflow" } diff --git a/src/main/resources/schema/security/README.md b/src/main/resources/schema/security/README.md new file mode 100644 index 000000000..f5f87bf52 --- /dev/null +++ b/src/main/resources/schema/security/README.md @@ -0,0 +1,9 @@ +# Security Domain Schema + +OpenSearch Security is a [plugin](https://github.com/opensearch-project/security) for OpenSearch that offers encryption, authentication and authorization. When combined with OpenSearch Security-Advanced Modules, it supports authentication via Active Directory, LDAP, Kerberos, JSON web tokens, SAML, OpenID and more. It includes fine grained role-based access control to indices, documents and fields. It also provides multi-tenancy support in OpenSearch Dashboards. + +Another leading security specification if the [Open Cybersecurity Schema Framework](https://github.com/ocsf/ocsf-schema) +OCSF is a framework for creating schemas and it also delivers a cybersecurity event schema built with the framework. + +OCSF framework is made up of a set of categories, event classes, data types, and an attribute dictionary. The framework is not restricted to cybersecurity nor to events, however the initial focus of the framework has been a schema for cybersecurity events. A schema browser for the cybersecurity schema can be found at schema.ocsf.io. 
This is the recommended way to explore the schema. + diff --git a/src/main/resources/schema/system/README.md b/src/main/resources/schema/system/README.md new file mode 100644 index 000000000..20bf9b48a --- /dev/null +++ b/src/main/resources/schema/system/README.md @@ -0,0 +1,57 @@ +# Internal System Schema + +This folder contains internal representation of assets that are stored in the system indices of dashboard and integration. + - Application + - Datasource + - Index-Pattern + - Integration + - Notebook + - Operational-Panel + - SavedQuery + - Visualization + +### Application +[Application](https://opensearch.org/docs/2.5/observing-your-data/app-analytics/) enables creation of custom observability display to view the availability status of your systems, where you can combine log events with trace and metric data into a single view of overall system health. +This lets you quickly pivot between logs, traces, and metrics to dig into the source of any issues. + + - [Schema](application.schema) + +### Datasource +[Data-source](https://opensearch.org/docs/2.4/dashboards/discover/multi-data-sources/) Enables adding multiple data sources to a single dashboard. +OpenSearch Dashboards allows you to dynamically manage data sources, create index patterns based on those data sources, and execute queries against a specific data source and then combine visualizations in one dashboard. + + - [Schema](datasource.schema) + +### Index-Pattern +An Index Pattern allows to access data that you want to explore. An index pattern selects the data to use. An index pattern may point to multiple indices, data stream, or index aliases. + + - [Schema](index-pattern.schema) + +### Integration +Integration is a schematized and categorized bundle of assets grouped together to allow simple and coherent way to view, analyze and investigate different aspects of your data. 
+Integrations allow pre-defining dashboards, visualizations, index-templates, saved-queries and additional assets so that they provide a complete meaningful user experience. + + - [Schema](integration.schema) + +### Notebook +[Notebook](https://opensearch.org/docs/2.5/observing-your-data/notebooks/) A notebook is a document composed of two elements: code blocks (Markdown/SQL/PPL) and visualizations. +Choose multiple timelines to compare and contrast visualizations. +You can also generate reports directly from your notebooks. Common use cases include creating postmortem reports, designing runbooks, building live infrastructure reports, and writing documentation. + + - [Schema](notebook.schema) + +### Operational-Panel +[Operational Panels](https://opensearch.org/docs/2.5/observing-your-data/operational-panels/) in OpenSearch Dashboards are collections of visualizations generated using Piped Processing Language (PPL) queries. + + - [Schema](operational-panel.schema) + +### Saved-Query +A saved query (saved search) allows to reuse a search created in a dashboard for other dashboards. + + - [Schema](saved-query.schema) + +### Visualization +[Visualization](https://opensearch.org/docs/2.5/dashboards/visualize/viz-index/) allows translation of complex, high-volume, or numerical data into a visual representation that is easier to process. +OpenSearch Dashboards gives you data visualization tools to improve and automate the visual communication process. By using visual elements like charts, graphs, or maps to represent data, you can advance business intelligence and support data-driven decision-making and strategic planning. 
+ + - [Schema](visualization.schema) diff --git a/src/main/resources/schema/system/integration-fields-list.schema b/src/main/resources/schema/system/integration-fields-list.schema new file mode 100644 index 000000000..7ff0925da --- /dev/null +++ b/src/main/resources/schema/system/integration-fields-list.schema @@ -0,0 +1,66 @@ +{ + "$schema": "http://json-schema.org/draft-04/schema#", + "type": "object", + "properties": { + "template-name": { + "type": "string" + }, + "version": { + "type": "string" + }, + "description": { + "type": "string" + }, + "catalog": { + "type": "string" + }, + "collections": { + "type": "array", + "items": [ + { + "type": "object", + "properties": { + "category": { + "type": "string" + }, + "components": { + "type": "array", + "items": [ + { + "type": "object", + "properties": { + "source": { + "type": "string" + }, + "container": { + "type": "boolean" + }, + "fields": { + "type": "object" + } + }, + "required": [ + "source", + "container", + "fields" + ] + } + ] + } + }, + "required": [ + "category", + "components" + ] + } + ] + } + }, + "required": [ + "template-name", + "version", + "description", + "catalog", + "collections" + ] +} \ No newline at end of file