[UII] Advanced agent monitoring options UI for HTTP endpoint and diagnostics #193361

Merged
merged 26 commits into from
Sep 22, 2024

Conversation

jen-huang
Contributor

@jen-huang jen-huang commented Sep 18, 2024

Summary

Resolves #153950.

This PR implements a UI to configure advanced Elastic Agent monitoring options under agent policy settings. These advanced options include enabling HTTP monitoring endpoints and various options for agent diagnostics. They are shown behind a toggle below the existing agent monitoring logs and metrics collection options:

image

If the base HTTP monitoring endpoint is not enabled, the rest of the HTTP options are disabled:

image

The following new fields are added to agent policy schema to support this:

monitoring_http
monitoring_pprof_enabled
monitoring_diagnostics

This work supersedes the previous HTTP monitoring endpoint options under Advanced Settings at the bottom of the page. Any previous configuration under an agent policy's advanced_settings.agent_monitoring_http saved object field is migrated to the new monitoring_http field, and the old field is deleted. See the migration function backfillAgentPolicyToV4.
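For readers who want a feel for what such a migration does, here is a minimal sketch. It is not the actual `backfillAgentPolicyToV4` code: the `AgentPolicyAttributes` and `MonitoringHttp` types and the `migrateMonitoringHttp` name are hypothetical, and the field shape is assumed from the YAML example in this description.

```typescript
// Hypothetical, simplified shape of the HTTP monitoring settings.
interface MonitoringHttp {
  enabled?: boolean;
  host?: string;
  port?: number;
}

// Hypothetical, simplified agent policy saved object attributes.
interface AgentPolicyAttributes {
  advanced_settings?: Record<string, unknown>;
  monitoring_http?: MonitoringHttp;
  [key: string]: unknown;
}

// Moves advanced_settings.agent_monitoring_http to monitoring_http and
// drops the old field. Returns the input unchanged when there is nothing
// to migrate; otherwise returns a new object without mutating the input.
function migrateMonitoringHttp(attrs: AgentPolicyAttributes): AgentPolicyAttributes {
  const advanced = attrs.advanced_settings;
  if (!advanced || !('agent_monitoring_http' in advanced)) {
    return attrs;
  }
  const { agent_monitoring_http: legacy, ...rest } = advanced;
  return {
    ...attrs,
    // Assumption: the legacy value already has the new field's shape.
    monitoring_http: legacy as MonitoringHttp,
    advanced_settings: rest,
  };
}
```

Other keys under `advanced_settings` are carried over untouched, so only the superseded field disappears.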

These new options are compiled to agent yaml like this:

agent:
  monitoring:
    enabled: true
    use_output: default
    logs: true
    metrics: true
    traces: true
    namespace: default
    pprof:
      enabled: true
    http:
      enabled: true
      host: localhost
      port: 6791
    diagnostics:
      limit:
        interval: 1m
        burst: 1
      uploader:
        max_retries: 10
        init_dur: 1s
        max_dur: 10m
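As a rough illustration of how the three new policy fields could map onto the YAML above, here is a minimal sketch. The `PolicyMonitoringFields` and `MonitoringSection` types and the `buildMonitoringSection` function are hypothetical; the nested field shapes are inferred from the YAML example, not taken from the actual Fleet schema or compilation code.

```typescript
// Hypothetical shapes for the new agent policy fields, inferred from the YAML example.
interface PolicyMonitoringFields {
  monitoring_http?: { enabled: boolean; host?: string; port?: number };
  monitoring_pprof_enabled?: boolean;
  monitoring_diagnostics?: {
    limit?: { interval?: string; burst?: number };
    uploader?: { max_retries?: number; init_dur?: string; max_dur?: string };
  };
}

// The part of agent.monitoring that the new fields contribute to.
interface MonitoringSection {
  pprof?: { enabled: boolean };
  http?: { enabled: boolean; host?: string; port?: number };
  diagnostics?: PolicyMonitoringFields['monitoring_diagnostics'];
}

// Builds the pprof/http/diagnostics keys of agent.monitoring.
// Keys are only emitted for fields that are set on the policy.
function buildMonitoringSection(policy: PolicyMonitoringFields): MonitoringSection {
  const section: MonitoringSection = {};
  if (policy.monitoring_pprof_enabled !== undefined) {
    section.pprof = { enabled: policy.monitoring_pprof_enabled };
  }
  if (policy.monitoring_http) {
    section.http = { ...policy.monitoring_http };
  }
  if (policy.monitoring_diagnostics) {
    section.diagnostics = { ...policy.monitoring_diagnostics };
  }
  return section;
}
```

Omitting unset fields keeps the compiled policy free of empty keys, which is one plausible design; the real implementation may emit defaults instead.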

Summarize your PR. If it involves visual changes include a screenshot or gif.

To-do

  • API integration tests
  • Full manual test of SO migration
  • Full manual test with agent using these settings

Checklist

Delete any items that are not applicable to this PR.

@jen-huang jen-huang self-assigned this Sep 18, 2024
@obltmachine

🤖 GitHub comments

Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@jen-huang jen-huang added Team:Fleet Team label for Observability Data Collection Fleet team release_note:feature Makes this part of the condensed release notes backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) labels Sep 18, 2024
@jen-huang jen-huang marked this pull request as ready for review September 18, 2024 21:22
@jen-huang jen-huang requested review from a team as code owners September 18, 2024 21:22
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@jen-huang jen-huang changed the title [UII] Agent policy advanced monitoring options UI for HTTP endpoint and diagnostics [UII] Advanced agent monitoring options UI for HTTP endpoint and diagnostics Sep 18, 2024
@kc13greiner kc13greiner self-requested a review September 18, 2024 21:44
@elastic-vault-github-plugin-prod elastic-vault-github-plugin-prod bot requested a review from a team as a code owner September 18, 2024 21:58
@juliaElastic
Contributor

The code changes look great! It would be nice to test that these options are actually applied on the agent side correctly.

Contributor

@kc13greiner kc13greiner left a comment

LGTM!

@jen-huang
Contributor Author

jen-huang commented Sep 19, 2024

The code changes look great! It would be nice to test that these options are actually applied on the agent side correctly.

@juliaElastic I spent some time testing today:

  1. Verified that all config options are passed to agent correctly using inspect
  2. Verified that HTTP endpoint /liveness, host, and port config options are respected
  3. The /buffer endpoint, set with http.buffer.enabled, doesn't work. After discussion with Craig I pulled it out; see this comment for more info
  4. The /debug/pprof endpoint, set with pprof.enabled, doesn't work. There is a PR to fix this bug: Add pprof endpoints to the monitoring server if enabled elastic-agent#5562
  5. Filed "Diagnostics rate limiting config doesn't seem to work" elastic-agent#5570 for issues with the diagnostics.limit config options and for help confirming that the diagnostics.uploader options are applied correctly
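To repeat the check in step 2, one needs the URL that the configured host and port resolve to. The helper below is a minimal sketch for that; `MonitoringHttpConfig` and `livenessUrl` are hypothetical names, and the fallback values `localhost` and `6791` are simply taken from the YAML example in the PR description, not from confirmed agent defaults.

```typescript
// Hypothetical shape of the agent.monitoring.http settings.
interface MonitoringHttpConfig {
  enabled: boolean;
  host?: string;
  port?: number;
}

// Returns the /liveness URL for the given settings, or undefined when the
// HTTP monitoring endpoint is disabled (nothing is expected to be served).
function livenessUrl(http: MonitoringHttpConfig): string | undefined {
  if (!http.enabled) {
    return undefined;
  }
  // Assumption: fall back to the values shown in the example YAML.
  const host = http.host ?? 'localhost';
  const port = http.port ?? 6791;
  return `http://${host}:${port}/liveness`;
}
```

The resulting URL can then be requested with any HTTP client on the agent host to confirm the endpoint responds.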

Contributor

@juliaElastic juliaElastic left a comment

Thanks for testing! LGTM

Contributor

@kilfoyle kilfoyle left a comment

LGTM for the new UI docs link. 👍

@jen-huang jen-huang enabled auto-merge (squash) September 20, 2024 18:52
@kibana-ci
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Fleet Cypress Tests #4 / View agents list Agent status filter should filter on healthy (16 result)
  • [job] [logs] Fleet Cypress Tests #4 / View agents list Bulk actions should allow to bulk upgrade agents and cancel that upgrade

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
fleet 1207 1208 +1

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
fleet 1238 1241 +3

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
aiAssistantManagementSelection 91.0KB 91.0KB +77.0B
fleet 1.7MB 1.7MB +12.3KB
lists 143.5KB 143.5KB +77.0B
total +12.4KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
core 454.5KB 454.6KB +77.0B

Unknown metric groups

API count

id before after diff
fleet 1361 1364 +3

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @jen-huang

Contributor

@gsoldevila gsoldevila left a comment

Core changes LGTM. Additive (compatible) mapping updates only ✅

@jen-huang jen-huang merged commit 87cdc2d into elastic:main Sep 22, 2024
25 checks passed
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Sep 22, 2024
…nostics (elastic#193361)

## Summary

Resolves elastic#153950.

This PR implements a UI to configure advanced Elastic Agent monitoring
options under agent policy settings. These advanced options include
enabling HTTP monitoring endpoints and various options for agent
diagnostics. They are shown behind a toggle below the existing agent
monitoring logs and metrics collection options:

<img width="1326" alt="image"
src="https://github.com/user-attachments/assets/ac8cbe00-d838-4c9a-8a35-3dbf31222dc9">

If the base HTTP monitoring endpoint is not enabled, the rest of the
HTTP options are disabled:

<img width="1328" alt="image"
src="https://github.com/user-attachments/assets/2eac787c-3055-4862-b3eb-2566a39ee86c">

The following new fields are added to agent policy schema to support
this:
```
monitoring_http
monitoring_pprof_enabled
monitoring_diagnostics
```

This work supersedes the previous `HTTP monitoring endpoint` options
under `Advanced Settings` at the bottom of the page. Any previous
configuration under an agent policy's
`advanced_settings.agent_monitoring_http` saved object field is
migrated to the new `monitoring_http` field, and the old field is
deleted. See the migration function `backfillAgentPolicyToV4`.

These new options are compiled to agent yaml like this:

```yml
agent:
  monitoring:
    enabled: true
    use_output: default
    logs: true
    metrics: true
    traces: true
    namespace: default
    pprof:
      enabled: true
    http:
      enabled: true
      host: localhost
      port: 6791
    diagnostics:
      limit:
        interval: 1m
        burst: 1
      uploader:
        max_retries: 10
        init_dur: 1s
        max_dur: 10m
```
Summarize your PR. If it involves visual changes include a screenshot or
gif.

### To-do
- [x] API integration tests
- [x] Full manual test of SO migration
- [x] Full manual test with agent using these settings

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
  - elastic/ingest-docs#1333
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit 87cdc2d)
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Sep 22, 2024
…d diagnostics (#193361) (#193658)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[UII] Advanced agent monitoring options UI for HTTP endpoint and
diagnostics (#193361)](#193361)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


---------

Co-authored-by: Jen Huang <[email protected]>
@jen-huang jen-huang deleted the feat/http-monitoring branch September 23, 2024 13:58
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) release_note:feature Makes this part of the condensed release notes Team:Fleet Team label for Observability Data Collection Fleet team v8.16.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Fleet] Add ability to enable and configure HTTP Monitoring
9 participants