[8.x] [Logs Data Telemetry] Create background job to collect and send logs data telemetry (#189380) (#193877)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Logs Data Telemetry] Create background job to collect and send logs data telemetry (#189380)](#189380)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Backport metadata for the source commit:
- Author: Abdul Wahab Zahid <[email protected]>
- Source commit: `13736730d18a323da676a7a619f675038eb2c0bf`, committed 2024-09-17T14:58:34Z
- Source pull request: [#189380](https://github.com/elastic/kibana/pull/189380), state `MERGED`, source branch `main` (label `v9.0.0`)
- Labels: `release_note:skip`, `v9.0.0`, `backport:prev-minor`, `ci:project-deploy-observability`
- Branch label mapping: `^v9.0.0$` → `main`, `^v8.16.0$` → `8.x`, `^v(\d+).(\d+).\d+$` → `$1.$2`
- Suggested target branches: none

Source commit message:

[Logs Data Telemetry] Create background job to collect and send logs data telemetry (#189380)

## Summary

The PR creates a service which runs in the background as a Kibana Task and
lazily collects and processes logs data telemetry events. This
implementation collects the data by reading indices info and prepares
the telemetry events. These events will be reported to stack telemetry
in follow-up PRs.

The service groups the stats per
[pattern_name](https://github.com/elastic/kibana/blob/1116ac6daaef45785dad05ad09f78a93768e591d/src/plugins/telemetry/server/telemetry_collection/get_data_telemetry/constants.ts#L42)
and gathers the following information:
- Docs and indices count (regular and failure)
- Count of unique namespaces found in data streams matching a pattern
- Size of the documents (regular only)
- Meta information (managed by, package name, and whether beat information is
found in mappings)
- Total fields count and count of individual log-centric fields
- Count of docs corresponding to each structure level

The service gathers the data stream information and mappings and
generates events in the following manner:
```yml
[
  {
    "pattern_name": "heartbeat",
    "shipper": "heartbeat",
    "doc_count": 9239,
    "structure_level": {
      "5": 9239
    },
    "index_count": 1,
    "failure_store_doc_count": 9239,
    "failure_store_index_count": 1,
    "namespace_count": 0,
    "field_count": 1508,
    "field_existence": {
      "container.id": 9239,
      "log.level": 9239,
      "container.name": 9239,
      "host.name": 9239,
      "host.hostname": 9239,
      "kubernetes.pod.name": 9239,
      "kubernetes.pod.uid": 9239,
      "cloud.provider": 9239,
      "agent.type": 9239,
      "event.dataset": 9239,
      "event.category": 9239,
      "event.module": 9239,
      "service.name": 9239,
      "service.type": 9239,
      "service.version": 9239,
      "message": 9239,
      "event.original": 9239,
      "error.message": 9239,
      "@timestamp": 9239,
      "data_stream.dataset": 9239,
      "data_stream.namespace": 9239,
      "data_stream.type": 9239
    },
    "size_in_bytes": 12382655,
    "managed_by": [],
    "package_name": [],
    "beat": [
      "heartbeat"
    ]
  },
  {
    "pattern_name": "nginx",
    "doc_count": 10080,
    "structure_level": {
      "6": 10080
    },
    "index_count": 1,
    "failure_store_doc_count": 0,
    "failure_store_index_count": 0,
    "namespace_count": 1,
    "field_count": 1562,
    "field_existence": {
      "container.id": 10080,
      "log.level": 10080,
      "host.name": 10080,
      "kubernetes.pod.uid": 10080,
      "cloud.provider": 10080,
      "event.dataset": 10080,
      "service.name": 10080,
      "message": 10080,
      "@timestamp": 10080,
      "data_stream.dataset": 10080,
      "data_stream.namespace": 10080,
      "data_stream.type": 10080
    },
    "size_in_bytes": 12098071,
    "managed_by": [],
    "package_name": [],
    "beat": []
  },
  {
    "pattern_name": "apache",
    "doc_count": 1439,
    "structure_level": {
      "6": 1439
    },
    "index_count": 1,
    "failure_store_doc_count": 0,
    "failure_store_index_count": 0,
    "namespace_count": 2,
    "field_count": 1562,
    "field_existence": {
      "container.id": 1439,
      "log.level": 1439,
      "host.name": 1439,
      "kubernetes.pod.uid": 1439,
      "cloud.provider": 1439,
      "event.dataset": 1439,
      "service.name": 1439,
      "message": 1439,
      "@timestamp": 1439,
      "data_stream.dataset": 1439,
      "data_stream.namespace": 1439,
      "data_stream.type": 1439
    },
    "size_in_bytes": 4425502,
    "managed_by": [],
    "package_name": [],
    "beat": []
  },
  {
    "pattern_name": "generic-logs",
    "doc_count": 106659,
    "structure_level": {
      "2": 100907,
      "3": 5752
    },
    "index_count": 6,
    "failure_store_doc_count": 0,
    "failure_store_index_count": 0,
    "namespace_count": 2,
    "field_count": 1581,
    "field_existence": {
      "log.level": 106659,
      "host.name": 106659,
      "service.name": 106659,
      "@timestamp": 106659,
      "data_stream.dataset": 106659,
      "data_stream.namespace": 106659,
      "data_stream.type": 106659,
      "container.id": 5752,
      "kubernetes.pod.uid": 5752,
      "cloud.provider": 5752,
      "event.dataset": 5752,
      "message": 5752
    },
    "size_in_bytes": 29752097,
    "managed_by": [],
    "package_name": [],
    "beat": []
  }
]
```

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>

Co-authored-by: Abdul Wahab Zahid <[email protected]>
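The per-pattern grouping described in the summary can be sketched roughly as follows. This is a minimal TypeScript illustration, not the PR's implementation: the `IndexStats` input type, the `groupByPattern` function, and the assumption that each index contributes all of its documents to a single structure level are hypothetical and introduced only for this example. The output keys mirror a subset of the event fields shown in the sample events, and the input numbers are made up.

```typescript
// Hypothetical per-index input. The actual service derives its data from
// index stats and mappings; this shape is an assumption for illustration.
interface IndexStats {
  patternName: string;
  namespace?: string; // undefined when no data stream namespace is known
  docCount: number;
  sizeInBytes: number;
  structureLevel: number; // assumed: one structure level per index
  fieldDocCounts: Record<string, number>; // docs containing each field
}

// Subset of the event shape shown in the commit message.
interface PatternEvent {
  pattern_name: string;
  doc_count: number;
  index_count: number;
  namespace_count: number;
  size_in_bytes: number;
  structure_level: Record<string, number>;
  field_existence: Record<string, number>;
}

function groupByPattern(indices: IndexStats[]): PatternEvent[] {
  const events = new Map<string, PatternEvent>();
  const namespaces = new Map<string, Set<string>>();

  for (const index of indices) {
    let event = events.get(index.patternName);
    if (!event) {
      event = {
        pattern_name: index.patternName,
        doc_count: 0,
        index_count: 0,
        namespace_count: 0,
        size_in_bytes: 0,
        structure_level: {},
        field_existence: {},
      };
      events.set(index.patternName, event);
      namespaces.set(index.patternName, new Set<string>());
    }

    // Sum document counts and sizes, and count indices per pattern.
    event.doc_count += index.docCount;
    event.index_count += 1;
    event.size_in_bytes += index.sizeInBytes;

    // Attribute the index's docs to its structure level.
    const level = String(index.structureLevel);
    event.structure_level[level] =
      (event.structure_level[level] ?? 0) + index.docCount;

    // Accumulate how many docs contain each field.
    for (const field of Object.keys(index.fieldDocCounts)) {
      event.field_existence[field] =
        (event.field_existence[field] ?? 0) + index.fieldDocCounts[field];
    }

    // Track unique namespaces seen for this pattern.
    const seen = namespaces.get(index.patternName)!;
    if (index.namespace !== undefined) {
      seen.add(index.namespace);
    }
    event.namespace_count = seen.size;
  }

  return Array.from(events.values());
}

// Usage with made-up numbers: two indices matching one pattern.
const sampleEvents = groupByPattern([
  { patternName: "generic-logs", namespace: "default", docCount: 40,
    sizeInBytes: 4000, structureLevel: 2, fieldDocCounts: { "log.level": 40 } },
  { patternName: "generic-logs", namespace: "prod", docCount: 10,
    sizeInBytes: 1000, structureLevel: 3, fieldDocCounts: { "log.level": 10, message: 10 } },
]);
console.log(JSON.stringify(sampleEvents, null, 2));
```

Keeping the namespace sets outside the event objects lets the emitted event carry only a count, which matches the `namespace_count` field in the sample events.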