Adding geolocation attrs to app_logs #11699

Merged
merged 12 commits into from
Sep 26, 2023
3 changes: 3 additions & 0 deletions apmpackage/apm/changelog.yml
@@ -1,5 +1,8 @@
- version: generated
  changes:
    - description: Adding client.geo* attrs to mobile events
      type: enhancement
      link: https://github.com/elastic/apm-server/pull/11699
    - description: Define data retentions to support DLM
      type: enhancement
      link: https://github.com/elastic/apm-server/pull/11539
@@ -9,3 +9,14 @@ processors:
      name: observer_ids
  - pipeline:
      name: remove_ecs_version
  - geoip:
      if: ctx.event?.category == 'device'
      field: client.ip
      target_field: client.geo
      ignore_missing: true
      database_file: GeoLite2-City.mmdb
      on_failure:
        - remove:
            field: client.ip
Member

Do we have to remove the field on failure?

Contributor Author

I don't see why we should, to be honest. The same is done in other places that use the geoip processor, which made me think there might be a reason for it, so I left it in for consistency with the other usages. I'm fine with removing this part if it's not needed.

Member

I don't have a strong opinion on this. Actually, it may be helpful to just use a predefined geo ip pipeline here.

  - pipeline:
      name: client_geoip

Contributor Author

I'm fine with using that predefined pipeline. However, doing so would append geo attrs to all logs, instead of only the ones with event.category set to device. I'm not sure if that could cause problems for the APM Server.

Member

@LikeTheSalad you can execute the pipeline conditionally as well:

  - pipeline:
      if: ...
      name: client_geoip

Member

I've just discussed this with Silvia. Unless there's a specific reason not to, we'd prefer running the pipeline without the if check. We already run the client_geoip ingest pipeline for most of the other data streams, so the fact that it is missing from the app_logs data stream looks like an oversight. Also, client.ip is only populated for swift / android otel agents AND capture_personal_data: true, so it should be fine to run client_geoip whenever client.ip exists.

Contributor Author

Cool! That makes things simpler, cheers!

Contributor Author

@LikeTheSalad, Sep 25, 2023

I finished creating the system test for this use case and I also applied the changes to use the existing client_geoip pipeline without conditions.

I found out a couple of things while I was working on the test:

  • The capture_personal_data flag seems to be enabled by default (at least for the system tests): when I was adding a new config inside systemtest/apmservertest/config.go, I printed some logs where the config is read and saw that the if body was always executed, whether or not I added the new config.
  • There were two reasons why I couldn't see geo attributes locally. First, agent.name has to be either android/java or iOS/swift for the client.ip attr to be populated (as mentioned by @carsonip). Second, the client.ip value the server received during the tests was 127.0.0.1, which the server can't translate into a location (which makes sense). So I overrode the local IP during testing by setting a real one via the X-Forwarded-For header, so that the server could find a location from the client.ip attr.

Member

  • You're right. capture_personal_data defaults to true for both standalone apm-server and the integration.
  • Overriding X-Forwarded-For sounds good.

ignore_failure: true
ignore_missing: true