feat_: batch all telemetry data and send request every 10 seconds #5251

Merged
adklempner merged 4 commits into develop from feat/telemetry-channels
Jun 13, 2024

Conversation

@adklempner adklempner commented May 28, 2024

Instead of making an HTTP request for every individual telemetry item, push each one to a channel and periodically send the accumulated batch in a single request.

Closes #5240
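
A minimal sketch of the approach described above, not the code from this PR: producers push items onto a buffered channel, and a single goroutine collects them and hands the whole batch to one send call per tick. The names `Batcher`, `NewBatcher`, `Push`, `Run` and `demo`, and the fields of `TelemetryRequest`, are hypothetical. The `send` callback stands in for the single HTTP request, and the demo uses a one-hour interval instead of the 10 seconds from the title so that the only flush is the one triggered by cancellation.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// TelemetryRequest is a hypothetical stand-in for one queued telemetry item.
type TelemetryRequest struct {
	ID   int
	Type string
}

// Batcher collects items from a channel and flushes them together.
type Batcher struct {
	in       chan TelemetryRequest
	interval time.Duration
	send     func([]TelemetryRequest) // assumed to wrap one HTTP POST
}

func NewBatcher(interval time.Duration, send func([]TelemetryRequest)) *Batcher {
	return &Batcher{in: make(chan TelemetryRequest, 128), interval: interval, send: send}
}

// Push enqueues one item; it does no network I/O.
func (b *Batcher) Push(r TelemetryRequest) { b.in <- r }

// Run accumulates items and flushes them on every tick and on cancellation.
func (b *Batcher) Run(ctx context.Context) {
	ticker := time.NewTicker(b.interval)
	defer ticker.Stop()

	var pending []TelemetryRequest
	flush := func() {
		if len(pending) == 0 {
			return
		}
		b.send(pending)
		pending = nil
	}

	for {
		select {
		case r := <-b.in:
			pending = append(pending, r)
		case <-ticker.C:
			flush()
		case <-ctx.Done():
			// Drain whatever is still buffered, then send a final batch.
			for {
				select {
				case r := <-b.in:
					pending = append(pending, r)
				default:
					flush()
					return
				}
			}
		}
	}
}

// demo pushes five items, cancels, and reports how many send calls were made.
func demo() (calls, items int) {
	b := NewBatcher(time.Hour, func(batch []TelemetryRequest) {
		calls++
		items += len(batch)
	})
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() {
		defer close(done)
		b.Run(ctx)
	}()
	for i := 0; i < 5; i++ {
		b.Push(TelemetryRequest{ID: i, Type: "ReceivedMessages"})
	}
	cancel()
	<-done // wait for Run to exit before reading the counters
	return calls, items
}

func main() {
	calls, items := demo()
	fmt.Printf("calls=%d items=%d\n", calls, items) // prints "calls=1 items=5"
}
```

Five pushes result in one send call, which is the point of the change: the number of requests depends on the flush interval rather than on the number of telemetry items.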

status-im-auto commented May 28, 2024

Jenkins Builds

Older builds (46):
Commit #️⃣ Finished (UTC) Duration Platform Result
✖️ 28b4307 #1 2024-05-28 23:17:19 ~2 min tests 📄log
✔️ 28b4307 #1 2024-05-28 23:19:11 ~4 min linux 📦zip
✔️ 28b4307 #1 2024-05-28 23:20:09 ~5 min ios 📦zip
✔️ 28b4307 #1 2024-05-28 23:20:42 ~6 min android 📦aar
✖️ 32ea02b #2 2024-06-05 04:47:56 ~2 min tests 📄log
✔️ 32ea02b #2 2024-06-05 04:49:12 ~4 min linux 📦zip
✔️ 32ea02b #2 2024-06-05 04:50:48 ~5 min ios 📦zip
✔️ 32ea02b #2 2024-06-05 04:51:15 ~6 min android 📦aar
✔️ b9f45de #3 2024-06-06 20:09:35 ~2 min linux 📦zip
✔️ b9f45de #3 2024-06-06 20:10:10 ~3 min ios 📦zip
✔️ b9f45de #3 2024-06-06 20:12:33 ~5 min android 📦aar
✖️ b9f45de #3 2024-06-06 20:09:49 ~2 min tests 📄log
✖️ 60a4e7f #4 2024-06-06 20:21:43 ~1 min tests 📄log
✔️ 60a4e7f #4 2024-06-06 20:22:48 ~2 min linux 📦zip
✔️ 60a4e7f #4 2024-06-06 20:23:04 ~2 min android 📦aar
✔️ 60a4e7f #4 2024-06-06 20:23:45 ~3 min ios 📦zip
✖️ fda8b07 #5 2024-06-06 20:25:14 ~1 min tests 📄log
✔️ fda8b07 #5 2024-06-06 20:26:44 ~3 min android 📦aar
✔️ fda8b07 #5 2024-06-06 20:27:13 ~3 min linux 📦zip
✔️ fda8b07 #5 2024-06-06 20:27:45 ~3 min ios 📦zip
✖️ 017f006 #6 2024-06-06 20:37:57 ~1 min tests 📄log
✔️ 017f006 #6 2024-06-06 20:39:14 ~2 min linux 📦zip
✔️ 017f006 #6 2024-06-06 20:39:41 ~2 min android 📦aar
✔️ 017f006 #6 2024-06-06 20:39:50 ~3 min ios 📦zip
✖️ 2982f05 #7 2024-06-07 18:27:37 ~1 min tests 📄log
✔️ 2982f05 #7 2024-06-07 18:28:09 ~1 min android 📦aar
✔️ 2982f05 #7 2024-06-07 18:28:19 ~2 min linux 📦zip
✔️ 2982f05 #7 2024-06-07 18:29:20 ~3 min ios 📦zip
✔️ 5f3eac3 #8 2024-06-12 21:12:47 ~3 min linux 📦zip
✔️ 5f3eac3 #8 2024-06-12 21:13:16 ~4 min ios 📦zip
✔️ 5f3eac3 #8 2024-06-12 21:14:04 ~5 min android 📦aar
✔️ 605ab69 #9 2024-06-12 21:44:51 ~2 min android 📦aar
✔️ 605ab69 #9 2024-06-12 21:46:11 ~3 min linux 📦zip
✔️ 605ab69 #11 2024-06-13 02:23:11 ~3 min ios 📦zip
✔️ 5f3eac3 #8 2024-06-12 21:50:31 ~41 min tests 📄log
✔️ 605ab69 #9 2024-06-12 22:31:58 ~41 min tests 📄log
✖️ be1081a #10 2024-06-13 18:21:37 ~1 min tests 📄log
✖️ be1081a #11 2024-06-13 18:53:10 ~1 min tests 📄log
✖️ be1081a #12 2024-06-13 21:14:39 ~17 sec tests 📄log
✔️ be1081a #10 2024-06-13 18:22:37 ~2 min linux 📦zip
✔️ be1081a #12 2024-06-13 18:23:21 ~2 min ios 📦zip
✔️ be1081a #10 2024-06-13 18:24:59 ~4 min android 📦aar
✔️ 60d38d2 #11 2024-06-13 21:20:28 ~2 min android 📦aar
✔️ 60d38d2 #13 2024-06-13 21:20:40 ~2 min ios 📦zip
✔️ 60d38d2 #11 2024-06-13 21:20:50 ~2 min linux 📦zip
✖️ 60d38d2 #13 2024-06-13 21:20:35 ~2 min tests 📄log
Commit #️⃣ Finished (UTC) Duration Platform Result
✔️ 8582881 #12 2024-06-13 21:30:00 ~1 min android 📦aar
✔️ 8582881 #12 2024-06-13 21:30:34 ~2 min linux 📦zip
✔️ 8582881 #14 2024-06-13 21:31:16 ~2 min ios 📦zip
✔️ 8582881 #14 2024-06-13 22:08:58 ~40 min tests 📄log

@richard-ramos richard-ramos left a comment

Nice PR!
Just left a few comments and questions for your consideration.

@@ -50,5 +50,7 @@ func (c *BandwidthTelemetryClient) PushProtocolStats(relayStats metrics.Stats, s
	_, err := c.httpClient.Post(url, "application/json", bytes.NewBuffer(body))
	if err != nil {
		c.logger.Error("Error sending message to telemetry server", zap.Error(err))
	} else {
		c.logger.Debug("Successfully pushed protocol stats to telemetry server")
Member

Might be too noisy since this is the expected result. Consider not logging this

Contributor Author

removed in latest commit

@@ -96,5 +233,7 @@ func (c *Client) UpdateEnvelopeProcessingError(shhMessage *types.Message, proces
	_, err := c.httpClient.Post(url, "application/json", bytes.NewBuffer(body))
	if err != nil {
		c.logger.Error("Error sending envelope update to telemetry server", zap.Error(err))
	} else {
		c.logger.Debug("Successfully pushed envelope processing error to telemetry server", zap.String("hash", types.EncodeHex(shhMessage.Hash)))
Member

Might be too noisy due to it being the expected result. Consider not logging this

Contributor Author

removed in latest commit

}()
}

func (c *Client) Start() {
Member

Should we pass a context to the Start() function?
We could use it to stop the goroutine once logout happens, e.g.:

select {
...
case <-ctx.Done():
    return
...
}
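
Spelled out, the suggestion could look roughly like the following. This is a sketch under assumptions, not the merged code: `Client` here is a stand-in type, the `stopped` channel and the `exitsOnCancel` helper are hypothetical and exist only to make the goroutine's exit observable, and the 10-second tick mirrors the PR title.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Client is a hypothetical stand-in for the telemetry client in the diff.
type Client struct {
	stopped chan struct{} // closed when the background goroutine exits
}

// Start launches the periodic loop and ties its lifetime to ctx.
func (c *Client) Start(ctx context.Context) {
	c.stopped = make(chan struct{})
	go func() {
		defer close(c.stopped)
		ticker := time.NewTicker(10 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				// Periodic work, such as sending the batch, would go here.
			case <-ctx.Done():
				return // the caller cancelled, e.g. on logout
			}
		}
	}()
}

// exitsOnCancel reports whether the goroutine stops shortly after cancel.
func exitsOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	c := &Client{}
	c.Start(ctx)
	cancel() // simulates logout
	select {
	case <-c.stopped:
		return true
	case <-time.After(2 * time.Second):
		return false
	}
}

func main() {
	fmt.Println("goroutine exited after cancel:", exitsOnCancel())
}
```

Without the `ctx.Done()` case the loop has no exit path, so the goroutine and its ticker would outlive the session that started them.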

Contributor Author

Added in latest commit

}
}

func (c *Client) CollectAndProcessTelemetry() {
Member

Perhaps we should accept a context here too to exit the goroutine on logout and avoid having a dangling goroutine? (The messenger should have a m.ctx context you could use)

Contributor Author

Added in latest commit

}

func (c *Client) PushReceivedMessages(filter transport.Filter, sshMessage *types.Message, messages []*v1protocol.StatusMessage) {
func (c *Client) pushTelemetryRequest(request []TelemetryRequest) {
c.logger.Debug("Pushing telemetry data", zap.Any("request", request))
Member

Consider not logging this unless it's strictly necessary

Contributor Author

removed in latest commit

@adklempner adklempner force-pushed the feat/telemetry-channels branch 6 times, most recently from 2982f05 to 5f3eac3 Compare June 12, 2024 21:08
@adklempner adklempner force-pushed the feat/telemetry-channels branch 3 times, most recently from be1081a to 60d38d2 Compare June 13, 2024 21:17
@adklempner adklempner merged commit 1bbb253 into develop Jun 13, 2024
8 of 11 checks passed
@adklempner adklempner deleted the feat/telemetry-channels branch June 13, 2024 22:31
Development

Successfully merging this pull request may close these issues.

bug: had noticed in multiple instances that enabling telemetry is causing message loss
4 participants