sdk/log: Drop duplicated KeyValues #5086
Comments
The problem with dropping duplicates in the logger is that they can be passed not only by the Bridge API but also by the processor (which can add/set attributes and set a body). Therefore, the duplicates should be removed as far down the pipeline as possible. My proposal is that the OTLP exporters handle deduplication. We can have an option in the OTLP exporters to disable the deduplication to improve the OTLP exporter's performance. This looks to be acceptable:
Especially, by @tigrannajaryan:
I will make a second attempt and update the specification here to make it clearer. I plan to add something more or less like:
@open-telemetry/go-approvers Please leave 👍 if you agree; if so, I will close the issue and update the description of:
An alternative could be creating a processor that handles deduplication. I am leaning towards this approach, as I find that it would be more performant by default. It follows the behavior of all popular Go logging libraries. Finally, even if we did it by default, the user would still be able to bypass it and add duplicates at some point (e.g. by implementing a custom processor). Please leave 👍 if you agree; if so, I will close this issue and create a new one to add it. I propose open-telemetry/opentelemetry-specification#3987 to allow flexibility.
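To illustrate why the position of deduplication in the pipeline matters, here is a minimal sketch. It does not use the real SDK types: `Attr`, `Record`, `Processor`, `dedup`, `addEnv`, `run` and `hasDuplicates` are all hypothetical names, and the "last value wins" policy is an assumption. It only shows that a processor running after the deduplication step can reintroduce a duplicated key.

```go
package main

import "fmt"

// Attr and Record are hypothetical, simplified stand-ins for the SDK types.
type Attr struct{ Key, Value string }

type Record struct{ Attrs []Attr }

// Processor is a hypothetical one-method processor interface.
type Processor interface {
	OnEmit(r *Record)
}

// dedup drops duplicated keys, keeping the last value seen (assumed policy).
type dedup struct{}

func (dedup) OnEmit(r *Record) {
	pos := make(map[string]int, len(r.Attrs)) // key -> index in out
	out := make([]Attr, 0, len(r.Attrs))
	for _, a := range r.Attrs {
		if i, ok := pos[a.Key]; ok {
			out[i] = a // overwrite the earlier value
			continue
		}
		pos[a.Key] = len(out)
		out = append(out, a)
	}
	r.Attrs = out
}

// addEnv is a custom processor that appends an attribute without checking
// whether the key already exists.
type addEnv struct{}

func (addEnv) OnEmit(r *Record) {
	r.Attrs = append(r.Attrs, Attr{"env", "prod"})
}

// run passes the record through the processors in order.
func run(r *Record, ps ...Processor) {
	for _, p := range ps {
		p.OnEmit(r)
	}
}

// hasDuplicates reports whether any key occurs more than once.
func hasDuplicates(r *Record) bool {
	seen := make(map[string]bool, len(r.Attrs))
	for _, a := range r.Attrs {
		if seen[a.Key] {
			return true
		}
		seen[a.Key] = true
	}
	return false
}

func main() {
	early := &Record{Attrs: []Attr{{"env", "dev"}, {"env", "test"}}}
	run(early, dedup{}, addEnv{})
	fmt.Println(hasDuplicates(early)) // true: the duplicate was added after dedup ran

	late := &Record{Attrs: []Attr{{"env", "dev"}, {"env", "test"}}}
	run(late, addEnv{}, dedup{})
	fmt.Println(hasDuplicates(late)) // false: dedup ran last
}
```

In this sketch, only the ordering with deduplication as the final step guarantees that the exported record has unique keys.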
Another alternative could be adding deduplication to the simple and batch processors. That is the best place to add it if we would like to "enable deduplication by default". I assume most people would use the batch processor in production. We could still implement it via a Side note: The simple processor does not currently accept any options. I am not sure if we should add options or just focus on the batch processor, which is intended for production purposes. Please leave 👍 if you agree; if so, I will close this issue and create a new one to add
This makes it sound like the data passed to an exporter must already be deduplicated.
Is Tigran saying this can't be the default?
I am not sure if it is a suggestion or a requirement. Personally, I would start with no deduplication by default. We could introduce this behavior as the default in the batch processor later (#5086 (comment)) if needed. I have not heard about many problems caused by the lack of deduplication in .NET. They also prefer it as opt-in. Reference: open-telemetry/opentelemetry-dotnet#4324
I think we could add this afterwards. Given that OTLP seems to be the only place where de-duplication needs to be done, we could start there, and if a general processor is needed we could add that later.
I think we should add options (even though there are none) to PS. My gut tells me that we will have to go with #5086 (comment).
Based on open-telemetry/opentelemetry-specification#3931 (comment), it looks like my gut (previous comment) was right.
Partial implementation of attribute de-duplication: #5190
…map as opt-in (#3987) Fixes #3931 Per agreement: #3931 (comment) > The SDKs should handle the key-value deduplication by default. It is acceptable to add an option to disable deduplication. Previous PR: #3938 > I think it is fine to do the deduplication anywhere you want as long as externally observable data complies with this document. The main purpose of this PR is to reach agreement on the following questions (and update the specification to make it clearer): 1. Is the deduplication required for all log exporters or only OTLP log exporters? Answer: It is required for all exporters. 2. Can the key-value deduplication for log records be opt-in? Answer: Yes, it is OK as long as it is documented that it can cause problems in case maps with duplicated keys are exported. Related to: - open-telemetry/opentelemetry-go#5086 - open-telemetry/opentelemetry-dotnet#4324
Resolved by #5230
Why
From https://opentelemetry.io/docs/specs/otel/logs/data-model/#type-mapstring-any:
Also: open-telemetry/opentelemetry-specification#3931 (comment)
Adding an option to avoid deduplication (and allow duplicated keys) is tracked as a separate issue. #5133
What
Drop duplicated KeyValues in the simple and batch processors OR in the record.
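As a rough sketch of what dropping duplicated key-values could look like, the function below removes repeated keys from a slice. `KeyValue` and `dedupLastWins` are hypothetical names used only for illustration, not the SDK's types or API, and the choice to keep the last value at the position of the key's first occurrence is an assumption rather than something this issue specifies.

```go
package main

import "fmt"

// KeyValue is a hypothetical stand-in for the log attribute type;
// it is not the SDK's actual type.
type KeyValue struct {
	Key   string
	Value string
}

// dedupLastWins returns kvs with duplicated keys dropped.
// Assumption: the last value for a key wins and is stored at the
// position where the key first appeared.
func dedupLastWins(kvs []KeyValue) []KeyValue {
	index := make(map[string]int, len(kvs)) // key -> position in out
	out := make([]KeyValue, 0, len(kvs))
	for _, kv := range kvs {
		if i, ok := index[kv.Key]; ok {
			out[i] = kv // overwrite the earlier value
			continue
		}
		index[kv.Key] = len(out)
		out = append(out, kv)
	}
	return out
}

func main() {
	kvs := []KeyValue{{"user", "alice"}, {"env", "dev"}, {"user", "bob"}}
	fmt.Println(dedupLastWins(kvs)) // prints [{user bob} {env dev}]
}
```

The map lookup keeps this linear in the number of attributes, at the cost of one map allocation per record; that overhead is the kind of performance cost the discussion weighs when deciding whether deduplication should be on by default.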
More: