Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enabling tracing by usage of opentelemetry #99

Open
wants to merge 4 commits into
base: new-index
Choose a base branch
from

Conversation

rem1-dev
Copy link

I've added tracing to the project by introducing mainly telemetry, opentelemetry and Tokyo crates. Having tracing allows analysis of bottleneck functions (functions that take majority of execution time).

@RCasatta RCasatta changed the title Enabling tracing by usage of opentelemtry Enabling tracing by usage of opentelemetry Aug 15, 2024
@RCasatta
Copy link
Collaborator

Isn't prometheus trying to achieve the same thing? Should we remove that if we decide to use opentelemetry?

@philippem
Copy link
Collaborator

Isn't prometheus trying to achieve the same thing? Should we remove that if we decide to use opentelemetry?

  • prometheus gives us metrics ( example there is a histogram of times spent making RPC calls to bitcoind). Many alerts can be directly built on prometheus metrics.
  • opentelemtry (OTLP) gives us tracing within and between processes (for example we can look at a mempool update operation and see where the time is being spent)

They both enhance observability but are complementary.

@shesek
Copy link
Collaborator

shesek commented Aug 24, 2024

Isn't prometheus trying to achieve the same thing? Should we remove that if we decide to use opentelemetry?

...
They both enhance observability but are complementary.

I'm still not quite sure I understand the differences. Prometheus can and is being used to track function execution times, for example in these functions (via _timer and its Drop) where opentelemetry was added too:

+    #[instrument(skip_all, name="Mempool::utxo")]
     pub fn utxo(&self, scripthash: &[u8]) -> Vec<Utxo> {
         let _timer = self.latency.with_label_values(&["utxo"]).start_timer();
         let entries = match self.history.get(scripthash) {
@@ -209,6 +216,7 @@ impl Mempool {
             .collect()
     }
 
+    #[instrument(skip_all, name="Mempool::stats")]
     // @XXX avoid code duplication with ChainQuery::stats()?
     pub fn stats(&self, scripthash: &[u8]) -> ScriptStats {
         let _timer = self.latency.with_label_values(&["stats"]).start_timer();
@@ -258,12 +266,14 @@ impl Mempool {
         stats
     }
 
+    #[instrument(skip_all, name="Mempool::txids")]
     // Get all txids in the mempool
     pub fn txids(&self) -> Vec<&Txid> {
         let _timer = self.latency.with_label_values(&["txids"]).start_timer();
         self.txstore.keys().collect()
     }

@@ -16,7 +16,7 @@ use electrs::{
errors::*,
metrics::Metrics,
new_index::{precache, ChainQuery, FetchFrom, Indexer, Mempool, Query, Store},
rest,
otlp_trace, rest,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing newline (should probably cargo fmt the other changes too)

@@ -623,7 +643,7 @@ impl Connection {

pub fn run(mut self, receiver: Receiver<Message>) {
self.stats.clients.inc();
conditionally_log_rpc_event!(self, json!({ "event": "connection established" }));
//conditionally_log_rpc_event!(self, json!({ "event": "connection established" }));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this intentionally disabled?

@@ -378,7 +378,7 @@ impl Config {
} else {
stderrlog::Timestamp::Off
});
log.init().expect("logging initialization failed");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Cargo.toml Outdated
@@ -51,7 +51,13 @@ url = "2.2.0"
hyper = "0.14"
hyperlocal = "0.8"
# close to same tokio version as dependent by hyper v0.14 and hyperlocal 0.8 -- things can go awry if they mismatch
tokio = { version = "1", features = ["sync", "macros"] }
tokio = { version = "1", features = ["sync", "macros", "rt-multi-thread", "rt"] }
opentelemetry = { version = "0.20.0", features = ["rt-tokio"] }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 new (direct) dependencies, oh my 😅

Should we perhaps put this behind a feature flag so that the extra dependencies could be avoided by users that don't require tracing?

@shesek
Copy link
Collaborator

shesek commented Aug 24, 2024

It's not critical, but it could aid review if the addition of #[instrument()] attributes was separated to its own commit

junderw pushed a commit to junderw/electrs that referenced this pull request Sep 4, 2024
…-type

Add support for anchor output type
@@ -16,6 +16,8 @@ use serde_json::{from_str, from_value, Value};
use bitcoin::consensus::encode::{deserialize, serialize_hex};
#[cfg(feature = "liquid")]
use elements::encode::{deserialize, serialize_hex};
#[cfg(feature = "tracing-enabled")]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the feature name should maybe be "otlp-tracing" or "tracing". "enabled" is what we do with it.

@@ -14,6 +14,14 @@ edition = "2018"
[features]
liquid = [ "elements" ]
electrum-discovery = [ "electrum-client"]
tracing-enabled = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's call this "tracing"

@shesek
Copy link
Collaborator

shesek commented Sep 20, 2024

How do you feel about defining a no-op #[instrument] macro when the tracing feature is disabled, so that we don't have to wrap it in #[cfg_attr(feature = "tracing",... )) every time?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants