Issue
RIPE Atlas probes occasionally emit hop replies containing an "error" field (it appeared in 0.019% of the traceroute measurements I checked). This behavior is not documented for any firmware version on https://atlas.ripe.net/docs/apis/result-format/. After some investigation, I found a few places in the measurement code where this field can be produced. According to git blame, this has been part of the probe behavior for at least 10 years and still is.
At the moment I am somewhat unsure if this is an issue with the documentation or probe code. I can think of a few arguments for each side.
[For bug in code] This field performs a very similar function to the error field on traceroute hops. Errors described by this field appear to always be fatal and will usually (but not always) be the only reply in a given hop. Given that both fields have the same name and value type, there is a decent chance that this field was simply added to the wrong object by accident. If so, the two should be consolidated into a single field to simplify parsing of the data. Additionally, leaving this error inside hop replies may give the false impression that the error is recoverable, which does not appear to be the case.
[For documentation] This field has likely been appearing in data for the past 10 years without issue. It is easily identifiable in the output JSON and has a consistent value type and behavior. Since replies containing an error do not contain any other fields, this closely mirrors the naming and fields of the ping result variants Timeout/Error/Reply (as compared to the Timeout/Reply variants currently specified for hop replies). Additionally, marking an entire hop as having errored may be misleading, since valid replies could have been recorded for that hop prior to the error.
Affected Measurement Examples
I dumped every measurement that contained this field from one of the RIPE Atlas hourly traceroute dump files (traceroute-2022-10-14T0400.bz2). This data is stored as newline delimited JSON and can be found in this gist. The field is extremely rare: it occurred in only 1690 of the 8,912,306 traceroute measurements in that data file (0.019%).
Examples
Here are a couple of measurements I arbitrarily chose to show what the field looks like in the context of the data.
Values
As far as I have seen, this field is always a string when it appears. Here are the numbers of occurrences of each distinct value I found when dumping the affected measurements.
count value
9 "bind failed: Address already in use"
11 "bind failed: Address not available"
47 "bind failed: Cannot assign requested address"
334 "bind failed: Invalid argument"
804 "sendto failed: Network is unreachable"
104 "sendto failed: Network unreachable"
364 "sendto failed: Operation not permitted"
17 "sendto failed: Permission denied"
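As a quick sanity check on the table above, the counts can be totaled and grouped by the failing call. The following is a minimal sketch using only the Rust standard library; the names `ERROR_COUNTS`, `TOTAL_MEASUREMENTS`, `count_with_prefix`, `total_errors`, and `affected_percent` are hypothetical and exist only for this illustration, while the numbers are copied from the table and the dump statistics quoted earlier.

```rust
/// (count, error string) pairs copied from the table above.
const ERROR_COUNTS: [(u32, &str); 8] = [
    (9, "bind failed: Address already in use"),
    (11, "bind failed: Address not available"),
    (47, "bind failed: Cannot assign requested address"),
    (334, "bind failed: Invalid argument"),
    (804, "sendto failed: Network is unreachable"),
    (104, "sendto failed: Network unreachable"),
    (364, "sendto failed: Operation not permitted"),
    (17, "sendto failed: Permission denied"),
];

/// Number of traceroute measurements in the dump file, as stated in the issue.
const TOTAL_MEASUREMENTS: u32 = 8_912_306;

/// Sum the counts of all values that start with `prefix`, e.g. "bind failed".
fn count_with_prefix(prefix: &str) -> u32 {
    ERROR_COUNTS
        .iter()
        .filter(|(_, msg)| msg.starts_with(prefix))
        .map(|(count, _)| *count)
        .sum()
}

/// Sum of all counts in the table.
fn total_errors() -> u32 {
    ERROR_COUNTS.iter().map(|(count, _)| *count).sum()
}

/// Share of measurements in the dump that contained the field, in percent.
fn affected_percent() -> f64 {
    f64::from(total_errors()) * 100.0 / f64::from(TOTAL_MEASUREMENTS)
}
```

With these numbers, `total_errors()` comes to 1690, which matches the number of affected measurements, and the split is 401 `bind failed` values against 1289 `sendto failed` values.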
Tests
I found this field while writing some code to parse measurements in Rust with serde. To verify that I had written my tests correctly, I had it attempt to parse every measurement in one of the hourly data dumps of traceroute measurements. The error field was the only* notable inconsistency it found with the API documentation. Thanks to Rust and serde, this code enforces the following checks when parsing the input data:
All fields are required to be present in the input data unless explicitly marked optional.
No unexpected fields may be present in the input data. (Thanks to #[serde(deny_unknown_fields)]).
Field types must be enforced. Implicit conversions between types such as strings, integers, booleans, and floats would result in an error.
Mutually exclusive fields**. For example, traceroute replies must contain either rtt or late, but not both.
Objects that could be parsed as one of multiple variants must satisfy exactly one variant. (e.g. the timeout case cannot overlap with the reply case)
[Partial] Fields such as af and proto that only have a couple possible values must be one of the specified values. I only enforced this check on some fields. Fields such as type and flags were left to read any string since they were not necessary for my use case.
* Excluding the bug where empty objects could be emitted alongside traceroute hop replies.
** When making error and result mutually exclusive in traceroute hops, I decided to allow hop to be missing when an error is present.
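The variant rules listed above can be illustrated without serde. The sketch below is not the parser described in this issue; it is a standard-library-only stand-in that classifies a hop reply from the names of the keys present in its JSON object. The names `ReplyKind`, `REPLY_KEYS`, and `classify_reply` are hypothetical, the key list is illustrative rather than exhaustive, and treating a lone "x" key as the timeout marker is an assumption about the result format.

```rust
/// The three hop reply variants discussed in this issue.
#[derive(Debug, PartialEq)]
enum ReplyKind {
    Reply,
    Timeout,
    Error,
}

/// Keys accepted on an ordinary reply (illustrative subset, an assumption).
const REPLY_KEYS: [&str; 11] = [
    "from", "rtt", "late", "size", "ttl", "err", "itos", "ittl", "mtu", "flags", "icmpext",
];

/// Classify a hop reply by its key names, rejecting anything that does not
/// satisfy exactly one variant.
fn classify_reply(keys: &[&str]) -> Result<ReplyKind, String> {
    let has = |name: &str| keys.iter().any(|key| *key == name);

    // Error variant: "error" must be the only field in the object.
    if has("error") {
        return if keys.len() == 1 {
            Ok(ReplyKind::Error)
        } else {
            Err("`error` must be the only field".to_string())
        };
    }
    // Timeout variant (assumed to be a lone "x" key).
    if has("x") {
        return if keys.len() == 1 {
            Ok(ReplyKind::Timeout)
        } else {
            Err("`x` must be the only field".to_string())
        };
    }
    // Reply variant: no unknown fields, mirroring `deny_unknown_fields`.
    if let Some(unknown) = keys
        .iter()
        .find(|key| !REPLY_KEYS.iter().any(|known| *known == **key))
    {
        return Err(format!("unknown field `{}`", unknown));
    }
    // A reply must contain either `rtt` or `late`, but not both.
    match (has("rtt"), has("late")) {
        (true, false) | (false, true) => Ok(ReplyKind::Reply),
        (true, true) => Err("`rtt` and `late` are mutually exclusive".to_string()),
        (false, false) => Err("reply needs `rtt` or `late`".to_string()),
    }
}
```

For example, `classify_reply(&["error"])` yields `Ok(ReplyKind::Error)`, while an empty object or a reply carrying both `rtt` and `late` is rejected. A serde-based parser would express the same rules declaratively through enum variants and `#[serde(deny_unknown_fields)]`.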