[Discover][Observability] Set o11y recommended fields #198562

Open
flash1293 opened this issue Oct 31, 2024 · 8 comments
Labels
Project:OneDiscover Enrich Discover with contextual awareness Team:obs-ux-logs Observability Logs User Experience Team

Comments

@flash1293
Contributor

📓 Summary

In an o11y space, the extension point introduced in #192556 should be used to set suitable recommended fields.

Which fields should be recommended is still TBD.

✔ Acceptance Criteria

  • Set recommended fields for the o11y logs profile

❓ Open questions

  • What fields should be recommended?
@flash1293 flash1293 added Project:OneDiscover Enrich Discover with contextual awareness Team:obs-ux-logs Observability Logs User Experience Team labels Oct 31, 2024
@elasticmachine
Contributor

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

@flash1293
Contributor Author

@LucaWintergerst could you take a pass on recommended fields? Something that's not clear to me is what we should do with technology-specific fields like kubernetes - there are the agent integration fields, otel fields, ...

@LucaWintergerst
Contributor

As a first list of candidates I'd propose

event.dataset
log.level
service.name
host.hostname

We have new telemetry in 8.16 that gives us better insight here, so we can refine this list over time once we have more data.

@flash1293
Contributor Author

Agreed about event.dataset as it unifies integrations and old-school beat modules (at least that's my understanding - integrations always set data_stream.dataset and event.dataset).

About host.hostname - should this be host.name instead?

The difference is:

host.name:
It can contain what hostname returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host.
host.hostname:
It normally contains what the hostname command returns on the host machine.

In integrations, host.name is referenced a bit more often, and in general it would be preferable because it contains more information. E.g. on my Mac, the name returned by the hostname command is just Mac, but host.name is Joe's MacBook Pro.

@flash1293
Contributor Author

Actually, about event.dataset - it's not optimal to use it, because data_stream.dataset is mapped as constant_keyword. Filtering on data_stream.dataset will be much more performant than filtering on event.dataset, as constant_keyword fields allow the search to skip non-matching indices completely.
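
To illustrate the difference, here is a minimal sketch of the two filters as plain Elasticsearch query DSL objects. The helper name buildDatasetFilter and the dataset value nginx.access are made up for this example and are not part of Kibana; only the two field names come from this thread.

```typescript
// Hypothetical helper (not a Kibana API): builds a query DSL body that
// filters documents to a single dataset using a term filter.
type TermFilter = { term: Record<string, string> };

interface DatasetQuery {
  query: { bool: { filter: TermFilter[] } };
}

function buildDatasetFilter(
  dataset: string,
  // Defaults to the constant_keyword field discussed above.
  field: string = "data_stream.dataset"
): DatasetQuery {
  return { query: { bool: { filter: [{ term: { [field]: dataset } }] } } };
}

// Preferred: filter on the constant_keyword field.
const preferred = buildDatasetFilter("nginx.access");

// Same filter on event.dataset, shown only for comparison.
const fallback = buildDatasetFilter("nginx.access", "event.dataset");

console.log(JSON.stringify(preferred));
console.log(JSON.stringify(fallback));
```

Both bodies have the same shape; only the field name differs, so switching the recommended field would not change how the filter is built.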

@tonyghiani
Contributor

In the logs contextual resolution, we are resolving sub-profiles that match the following log sources:

  • Apache error logs
  • AWS S3 access logs
  • Kubernetes container logs
  • Nginx access logs
  • Nginx error logs
  • System logs
  • Windows logs

In these cases, it might be worth also considering specific dataset fields for a more curated experience. What do you think?

@LucaWintergerst
Contributor

@tonyghiani from what I recall our extension point would initially only be a single group of fields. We can explore the option of showing sub-profiles too, but we'd need to put them into the same group which may or may not be a good idea.

e.g. if we make the default

event.dataset
host.name
log.level
service.name

and then when selecting kubernetes we add k8s fields, we'd likely sort them alphabetically

event.dataset
host.name
kubernetes.cluster.name
kubernetes.node.name
kubernetes.pod.name
...
log.level
service.name
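
A rough sketch of that single-group merge, assuming the recommended fields are just an array of field names. The function name mergeRecommendedFields is hypothetical, and the Kubernetes list only contains the three fields named above; the real extension point may look different.

```typescript
// Default fields proposed for the logs profile in this thread.
const defaultLogFields: string[] = [
  "event.dataset",
  "host.name",
  "log.level",
  "service.name",
];

// Only the Kubernetes fields explicitly named above; the full list is open.
const kubernetesFields: string[] = [
  "kubernetes.cluster.name",
  "kubernetes.node.name",
  "kubernetes.pod.name",
];

// Hypothetical helper: merges several field lists into one group,
// removes duplicates, and sorts the result alphabetically.
function mergeRecommendedFields(...groups: string[][]): string[] {
  const merged = ([] as string[]).concat(...groups);
  return Array.from(new Set(merged)).sort();
}

const kubernetesProfileFields = mergeRecommendedFields(
  defaultLogFields,
  kubernetesFields
);
console.log(kubernetesProfileFields.join("\n"));
```

With a single group, the Kubernetes fields end up interleaved between host.name and log.level, which is the ordering shown in the list above.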

I think it could work, but ideally we'd first get a more flexible extension point so we could just do something like

<- logs fields ->
event.dataset
host.name
log.level
service.name

<- kubernetes fields ->
kubernetes.cluster.name
kubernetes.node.name
kubernetes.pod.name
...

<- all fields ->
...
...
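
As a sketch of what such a sectioned result could look like as data: the FieldGroup shape and buildFieldGroups function below are assumptions for illustration, not the actual extension point. This version also assumes that curated sections only show fields that exist in the current data, and that the final section lists every available field.

```typescript
// Hypothetical shape for one titled section of the field list.
interface FieldGroup {
  title: string;
  fields: string[];
}

// Hypothetical helper: keeps only curated fields that are actually available,
// drops sections that end up empty, and appends an "all fields" section.
function buildFieldGroups(
  availableFields: string[],
  curatedGroups: FieldGroup[]
): FieldGroup[] {
  const available = new Set(availableFields);
  const sections = curatedGroups
    .map((group) => ({
      title: group.title,
      fields: group.fields.filter((name) => available.has(name)),
    }))
    .filter((group) => group.fields.length > 0);
  return [
    ...sections,
    { title: "all fields", fields: [...availableFields].sort() },
  ];
}

const sections = buildFieldGroups(
  // Example field names present in the data; "message" is only a placeholder.
  ["message", "log.level", "host.name", "kubernetes.pod.name"],
  [
    {
      title: "logs fields",
      fields: ["event.dataset", "host.name", "log.level", "service.name"],
    },
    {
      title: "kubernetes fields",
      fields: [
        "kubernetes.cluster.name",
        "kubernetes.node.name",
        "kubernetes.pod.name",
      ],
    },
  ]
);
console.log(JSON.stringify(sections, null, 2));
```

Keeping each section as its own titled list avoids the interleaving that alphabetical sorting causes in a single group.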

@tonyghiani
Contributor

> and then when selecting kubernetes we add k8s fields, we'd likely sort them alphabetically

Exactly, that's what I meant. Since those profiles are already in place (sub-profiles is not the right term, my bad; they are just profiles with more specific resolution logic), I thought you would start with those definitions right away. But it makes sense to start small and then expand once the extension point supports multiple sections 👍
