fix(new_relic sink): Do not quote paths containing periods for the event API #21323

Merged
merged 3 commits into from
Sep 19, 2024

@bruceg bruceg commented Sep 19, 2024

This could not be accomplished the same way as #21305 since the event API cannot handle nested JSON data. Instead, a new `LogEvent::convert_to_fields_unquoted` method was added to produce the flattened data without quoting.
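To illustrate what "flattened data without quoting" means, here is a minimal sketch. It is not the code from this PR: the `Value` enum, `flatten_unquoted`, and `sample_event` are hypothetical stand-ins, and the sketch ignores details such as empty objects and non-string leaves. It only shows nested data being reduced to one level of `path -> value` pairs, with a key like `a.b` left unquoted in the resulting path.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an event value; not Vector's real type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    Object(BTreeMap<String, Value>),
    Array(Vec<Value>),
}

/// Recursively walks `value`, pushing one `(path, leaf)` pair per leaf.
/// Keys are joined with periods and are never quoted.
fn flatten_into(value: &Value, prefix: String, out: &mut Vec<(String, String)>) {
    match value {
        Value::Text(text) => out.push((prefix, text.clone())),
        Value::Object(map) => {
            for (key, child) in map {
                let path = if prefix.is_empty() {
                    key.clone()
                } else {
                    format!("{prefix}.{key}")
                };
                flatten_into(child, path, out);
            }
        }
        Value::Array(items) => {
            for (index, child) in items.iter().enumerate() {
                flatten_into(child, format!("{prefix}[{index}]"), out);
            }
        }
    }
}

/// Flattens a nested value into a single level of unquoted paths.
fn flatten_unquoted(event: &Value) -> Vec<(String, String)> {
    let mut out = Vec::new();
    flatten_into(event, String::new(), &mut out);
    out
}

/// Builds `{"a.b": {"c": "x"}, "tags": ["p", "q"]}` for the example.
fn sample_event() -> Value {
    let mut inner = BTreeMap::new();
    inner.insert("c".to_string(), Value::Text("x".to_string()));
    let mut root = BTreeMap::new();
    root.insert("a.b".to_string(), Value::Object(inner));
    root.insert(
        "tags".to_string(),
        Value::Array(vec![Value::Text("p".to_string()), Value::Text("q".to_string())]),
    );
    Value::Object(root)
}

fn main() {
    for (path, value) in flatten_unquoted(&sample_event()) {
        println!("{path} = {value}");
    }
}
```

In this sketch the key `a.b` followed by `c` becomes the path `a.b.c`, with no quotes around `a.b`.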

@bruceg bruceg added type: bug A code related bug. domain: logs Anything related to Vector's log events sink: new_relic Anything `new_relic` sink related labels Sep 19, 2024
@bruceg bruceg requested a review from a team as a code owner September 19, 2024 15:00
@github-actions github-actions bot added domain: sinks Anything related to the Vector's sinks domain: core Anything related to core crates i.e. vector-core, core-common, etc labels Sep 19, 2024
@@ -10,14 +10,14 @@ static IS_VALID_PATH_SEGMENT: Lazy<Regex> = Lazy::new(|| Regex::new(r"^[a-zA-Z0-

/// Iterates over all paths in form `a.b[0].c[1]` in alphabetical order
/// and their corresponding values.
-pub fn all_fields(fields: &ObjectMap) -> FieldsIter {
-    FieldsIter::new(fields)
+pub fn all_fields(fields: &ObjectMap, quote_periods: bool) -> FieldsIter {
Contributor

@pront pront Sep 19, 2024

I don't like this change. This iterator now returns parsable paths, which is the desirable behavior.

Are you looking for the old behavior of this iterator? Then I would add `all_fields_unquoted`, similar to `all_fields_non_object_root`.

Member Author

With `quote_periods = true`, the current behavior of the iterator is preserved, as demonstrated by the tests. It only changes the behavior when that parameter is `false`. I don't want the old behavior where periods are escaped either, as that too caused problems in the `new_relic` sink.

Member Author

FWIW I am not opposed to adding a second `all_fields_unquoted` function as described, as that is what I originally had coded. I used a parameter instead since the two functions differ only in the value of that parameter, and `all_fields` is actually crate-local, so the change is entirely contained within `vector-core`.

Contributor

Plain `bool` parameters in utils can be an anti-pattern. Also, eventually all these iterators should return `OwnedTargetPath`s. Are the paths returned by this new iterator parsable with `parse_target_path`?
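The trade-off under discussion can be sketched as follows. This is illustrative only: the names `all_fields` and `all_fields_unquoted` come from this thread, but the signatures, the `render_keys` helper, and the rule of quoting only keys that contain a period are simplifying assumptions, not the `vector-core` implementation.

```rust
/// Hypothetical shared helper: renders each key, optionally wrapping keys
/// that contain a period in double quotes.
fn render_keys(keys: &[&str], quote_periods: bool) -> Vec<String> {
    keys.iter()
        .map(|key| {
            if quote_periods && key.contains('.') {
                format!("\"{key}\"")
            } else {
                key.to_string()
            }
        })
        .collect()
}

/// Named entry point for the quoted behavior.
pub fn all_fields(keys: &[&str]) -> Vec<String> {
    render_keys(keys, true)
}

/// Named entry point for the unquoted behavior.
pub fn all_fields_unquoted(keys: &[&str]) -> Vec<String> {
    render_keys(keys, false)
}

fn main() {
    let keys = ["a.b", "c"];
    // The call site states its intent without a bare `true` or `false`.
    println!("{:?}", all_fields(&keys));
    println!("{:?}", all_fields_unquoted(&keys));
}
```

With two named wrappers the flag stays private to the helper, so a reader of the call site does not have to look up what a bare `true` or `false` argument means.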

Member Author

I have split up the `all_fields` function. No, the unquoted paths are no longer parsable. Technically, I don't think the current paths are all parsable either if they contain quotes, since those are not escaped.
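A small sketch of why these paths may not round-trip. The `split_segments` function below is a hypothetical, deliberately naive splitter written for this illustration; it is not Vector's `parse_target_path` and makes no claim about how that parser behaves.

```rust
/// Hypothetical, simplified splitter: breaks a path on periods that fall
/// outside double quotes and drops the quote characters themselves.
fn split_segments(path: &str) -> Vec<String> {
    let mut segments = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in path.chars() {
        match ch {
            '"' => in_quotes = !in_quotes,
            '.' if !in_quotes => segments.push(std::mem::take(&mut current)),
            _ => current.push(ch),
        }
    }
    segments.push(current);
    segments
}

fn main() {
    // Key `a.b` then key `c`, rendered with quoting: two segments come back.
    println!("{:?}", split_segments("\"a.b\".c"));
    // The same keys rendered unquoted: the original boundary is lost.
    println!("{:?}", split_segments("a.b.c"));
    // A single key `a"b.c` wrapped in quotes without escaping the inner quote.
    println!("{:?}", split_segments("\"a\"b.c\""));
}
```

Under these assumptions the quoted form recovers the two original keys, the unquoted form yields three segments, and the key with an unescaped inner quote is split in the wrong place.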

@datadog-vectordotdev

datadog-vectordotdev bot commented Sep 19, 2024

Datadog Report

Branch report: bruceg/OPA-2327-fix-new-relic-event-quoting
Commit report: a03b8bc
Test service: vector

✅ 0 Failed, 7 Passed, 0 Skipped, 25.47s Total Time

Contributor

@pront pront left a comment

👍

@bruceg bruceg added this pull request to the merge queue Sep 19, 2024

Regression Detector Results

Run ID: 489592f2-7b93-45ec-8332-51d7e37c3e99 Metrics dashboard

Baseline: 8238e5a
Comparison: 141ea8c

Performance changes are noted in the perf column of each table:

  • ✅ = significantly better comparison variant performance
  • ❌ = significantly worse comparison variant performance
  • ➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Experiments ignored for regressions

Regressions in experiments with settings containing erratic: true are ignored.

| perf | experiment | goal | Δ mean % | Δ mean % CI | links |
|------|------------|------|----------|-------------|-------|
|  | file_to_blackhole | egress throughput | +14.45 | [+6.87, +22.03] |  |

Fine details of change detection per experiment

| perf | experiment | goal | Δ mean % | Δ mean % CI | links |
|------|------------|------|----------|-------------|-------|
|  | file_to_blackhole | egress throughput | +14.45 | [+6.87, +22.03] |  |
|  | http_text_to_http_json | ingress throughput | +3.89 | [+3.78, +4.01] |  |
|  | datadog_agent_remap_blackhole_acks | ingress throughput | +2.83 | [+2.70, +2.95] |  |
|  | socket_to_socket_blackhole | ingress throughput | +2.82 | [+2.74, +2.89] |  |
|  | otlp_http_to_blackhole | ingress throughput | +1.57 | [+1.42, +1.72] |  |
|  | syslog_humio_logs | ingress throughput | +1.51 | [+1.39, +1.63] |  |
|  | otlp_grpc_to_blackhole | ingress throughput | +1.38 | [+1.26, +1.49] |  |
|  | fluent_elasticsearch | ingress throughput | +1.18 | [+0.69, +1.68] |  |
|  | datadog_agent_remap_blackhole | ingress throughput | +0.96 | [+0.84, +1.07] |  |
|  | syslog_loki | ingress throughput | +0.91 | [+0.82, +0.99] |  |
|  | http_to_s3 | ingress throughput | +0.49 | [+0.21, +0.77] |  |
|  | http_to_http_noack | ingress throughput | +0.10 | [+0.04, +0.17] |  |
|  | http_to_http_json | ingress throughput | +0.05 | [+0.00, +0.10] |  |
|  | splunk_hec_to_splunk_hec_logs_noack | ingress throughput | +0.01 | [-0.08, +0.10] |  |
|  | splunk_hec_to_splunk_hec_logs_acks | ingress throughput | -0.00 | [-0.10, +0.10] |  |
|  | splunk_hec_indexer_ack_blackhole | ingress throughput | -0.01 | [-0.09, +0.07] |  |
|  | datadog_agent_remap_datadog_logs | ingress throughput | -0.18 | [-0.37, +0.00] |  |
|  | syslog_log2metric_splunk_hec_metrics | ingress throughput | -0.24 | [-0.35, -0.13] |  |
|  | datadog_agent_remap_datadog_logs_acks | ingress throughput | -0.38 | [-0.54, -0.21] |  |
|  | syslog_log2metric_tag_cardinality_limit_blackhole | ingress throughput | -0.40 | [-0.49, -0.31] |  |
|  | http_to_http_acks | ingress throughput | -0.55 | [-1.77, +0.68] |  |
|  | syslog_splunk_hec_logs | ingress throughput | -0.60 | [-0.69, -0.50] |  |
|  | http_elasticsearch | ingress throughput | -2.20 | [-2.39, -2.01] |  |
|  | splunk_hec_route_s3 | ingress throughput | -2.63 | [-2.96, -2.30] |  |
|  | syslog_log2metric_humio_metrics | ingress throughput | -2.65 | [-2.76, -2.55] |  |
|  | syslog_regex_logs2metric_ddmetrics | ingress throughput | -2.89 | [-3.03, -2.75] |  |

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

  1. Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.

  2. Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.

  3. Its configuration does not mark it "erratic".
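The three criteria above can be read as a single predicate. The sketch below is only an illustration of that decision rule: the `Experiment` struct, its field names, and `is_regression` are hypothetical and are not taken from the regression detector's code.

```rust
/// Hypothetical record for one row of the report.
struct Experiment {
    name: &'static str,
    delta_mean_pct: f64,
    ci_low: f64,
    ci_high: f64,
    erratic: bool,
}

/// Applies the three stated criteria with a configurable effect-size tolerance.
fn is_regression(experiment: &Experiment, tolerance_pct: f64) -> bool {
    // 1. The estimated change is large enough to merit a closer look.
    let big_enough = experiment.delta_mean_pct.abs() >= tolerance_pct;
    // 2. The confidence interval does not contain zero.
    let ci_excludes_zero = experiment.ci_low > 0.0 || experiment.ci_high < 0.0;
    // 3. The experiment is not marked erratic.
    big_enough && ci_excludes_zero && !experiment.erratic
}

fn main() {
    let rows = [
        // Listed in the report under experiments with `erratic: true`.
        Experiment { name: "file_to_blackhole", delta_mean_pct: 14.45, ci_low: 6.87, ci_high: 22.03, erratic: true },
        // Assumed non-erratic, since it is not in the ignored list.
        Experiment { name: "http_elasticsearch", delta_mean_pct: -2.20, ci_low: -2.39, ci_high: -2.01, erratic: false },
    ];
    for row in &rows {
        println!("{}: flagged = {}", row.name, is_regression(row, 5.0));
    }
}
```

With the 5.00% tolerance from the report, neither row is flagged: `file_to_blackhole` is excluded as erratic, and the `http_elasticsearch` change is below the tolerance even though its interval excludes zero.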

Merged via the queue into master with commit 141ea8c Sep 19, 2024
80 checks passed
@bruceg bruceg deleted the bruceg/OPA-2327-fix-new-relic-event-quoting branch September 19, 2024 18:42
AndrooTheChen pushed a commit to discord/vector that referenced this pull request Sep 23, 2024
…ent API (vectordotdev#21323)

* fix(new_relic sink): Do not quote paths containing periods for the event API

This could not be accomplished the same way as vectordotdev#21305 since the event API cannot
handle nested JSON data. Instead, a new `LogEvent::convert_to_fields_unquoted`
method was added to produce the flattened data without quoting.

* Fix and add tests

* Drop the extra parameter from `all_fields`