RUMM-1744 Add E2E tests for Kronos (NTP) #703

ncreated

What and why?

⚙️🧪 This PR adds E2E tests for Kronos and its NTP sync logic. It removes the tests that perform real UDP calls from the unit test target and moves them to the E2E target, which runs nightly.

This is to:

  • isolate tests that depend on real networking from the unit test bundle (to mitigate flakiness);
  • keep unit tests fast and stable (both locally and in CI);
  • make room for adding more telemetry to Kronos execution (we can now additionally instrument these E2E tests to check resolved IPs and ensure that UDP is never sent to a local IP).
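
The check on resolved IPs mentioned in the last point is not part of this PR. As a rough sketch of what such a check could look like, assuming the resolved address is available as an IPv4 string, the hypothetical helper `isLocalIPv4` below flags loopback, link-local, and RFC 1918 private ranges (IPv6 is out of scope here):

```swift
import Foundation

// Hypothetical helper, not part of Kronos or the SDK.
// Returns true when `address` is an IPv4 literal in a loopback (127/8),
// link-local (169.254/16), or RFC 1918 private range.
func isLocalIPv4(_ address: String) -> Bool {
    let fields = address.split(separator: ".", omittingEmptySubsequences: false)
    let octets = fields.compactMap { UInt8($0) }
    // Require exactly four fields, each parsing as a value in 0...255.
    guard fields.count == 4, octets.count == 4 else { return false }

    switch (octets[0], octets[1]) {
    case (10, _), (127, _):
        return true
    case (172, 16...31):
        return true
    case (192, 168), (169, 254):
        return true
    default:
        return false
    }
}
```

An E2E test could then log a WARN (or fail) whenever a resolved NTP server address satisfies this predicate.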

How?

I added tests for the high-level Kronos APIs:

  • KronosClock.sync() - to test it against all 4 Datadog NTP pools;
  • KronosNTPClient.query(pool:) - to test it against all IPv4 and IPv6 addresses from 2.datadog.pool.ntp.org;

Each test sends either an INFO or a WARN log upon completion. These logs will later (RUMM-1859) be used to create E2E monitors in the Mobile Integration org:

Screenshot 2021-12-30 at 19 21 26
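
To illustrate the INFO/WARN split, here is a minimal sketch of how a test might pick the log level from a sync result. The `SyncResult` type, the `LogLevel` enum, and the tolerance-based consistency rule are all assumptions made for illustration; the actual tests may decide consistency differently:

```swift
import Foundation

// Hypothetical stand-in for a sync result. The field names mirror the
// attributes logged by the tests, but the type itself is an assumption.
struct SyncResult {
    var firstReceivedOffset: TimeInterval?
    var lastReceivedOffset: TimeInterval?
}

enum LogLevel: String {
    case info = "INFO"
    case warn = "WARN"
}

// Assumed rule: a result is consistent when both offsets were received
// and they differ by at most `tolerance` seconds.
func logLevel(for result: SyncResult, tolerance: TimeInterval = 0.5) -> LogLevel {
    guard let first = result.firstReceivedOffset,
          let last = result.lastReceivedOffset else {
        // Nothing received, e.g. network unreachable or all NTP calls timed out.
        return .warn
    }
    return abs(first - last) <= tolerance ? .info : .warn
}
```

An E2E monitor can then alert on the WARN logs only, which keeps occasional network flakiness visible without failing the nightly run.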

Similarly, each test is measured with a performance span, so we can understand how fast Kronos does its job, e.g.:

Screenshot 2021-12-30 at 19 11 13
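
The spans themselves are reported through the E2E tracing setup, which is not shown here. As a generic sketch of the underlying idea, assuming only Foundation, the hypothetical `measure` helper below times a callback-based operation and gives up after a timeout:

```swift
import Foundation

// Hypothetical helper, not part of the SDK: runs `operation`, which must call
// the provided `done` closure once it finishes, and returns the elapsed time
// in seconds, or nil if `done` was not called within `timeout`.
func measure(timeout: TimeInterval, operation: (@escaping () -> Void) -> Void) -> TimeInterval? {
    let semaphore = DispatchSemaphore(value: 0)
    let start = DispatchTime.now()
    operation { semaphore.signal() }
    guard semaphore.wait(timeout: .now() + timeout) == .success else {
        return nil
    }
    let nanoseconds = DispatchTime.now().uptimeNanoseconds - start.uptimeNanoseconds
    return TimeInterval(nanoseconds) / 1_000_000_000
}

// Usage: simulate an asynchronous sync that completes after about 50 ms.
let elapsed = measure(timeout: 5) { done in
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.05) { done() }
}
```

Because the helper blocks the calling thread, the measured operation must complete on a different queue, as the background queue does in the usage example.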

In follow-up PRs, we will use this system to send more logs with internal telemetry from Kronos, to better understand its performance across more data points from nightly runs. Hopefully this will lead to more conclusions for #647. In parallel, we will send similar telemetry from our dogfood project to collect samples from the Datadog iOS app in production.

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)
  • Make sure each commit and the PR mention the Issue number or JIRA reference

so it can be monitored in nightly runs.
@ncreated ncreated requested a review from a team as a code owner December 30, 2021 18:27
@ncreated ncreated self-assigned this Dec 30, 2021
Comment on lines +99 to +106
// Inconsistent result may correspond to flaky execution, e.g. if network was unreachable or if **all** NTP calls received timeout.
// We track inconsistent result as WARN log that will be watched by E2E monitor.
logger.warn("KronosClock.sync() completed with inconsistent result for \(ddNTPPool)", attributes: [
    "serverDate_firstReceived": result.firstReceivedDate.flatMap { iso8601DateFormatter.string(from: $0) },
    "serverDate_lastReceived": result.lastReceivedDate.flatMap { iso8601DateFormatter.string(from: $0) },
    "serverOffset_firstReceived": result.firstReceivedOffset,
    "serverOffset_lastReceived": result.lastReceivedOffset,
])
Member Author

Example WARN log received from here:

Screenshot 2021-12-30 at 19 09 00

Comment on lines +181 to +186
logger.info(
    "KronosNTPClient.query(pool:) completed with consistent result receiving \(result.numberOfCompletedSamples)/\(result.expectedNumberOfSamples) NTP packets",
    attributes: [
        "offsets_received": receivedOffsets
    ]
)
Member Author

Example INFO log received from here:

Screenshot 2021-12-30 at 19 09 25

Member

@maxep maxep left a comment

👌

@ncreated ncreated merged commit 3f48e78 into ncreated/RUMM-1744-embed-Kronos-directly-into-SDK Dec 31, 2021
@ncreated ncreated deleted the ncreated/RUMM-1744-stabilize-Kronos-tests branch December 31, 2021 10:46