[DRAFT - DO NOT MERGE] Basic OpenTelemetry integration #1201

Draft
wants to merge 8 commits into
base: main
2,469 changes: 1,832 additions & 637 deletions Cargo.lock

Large diffs are not rendered by default.

13 changes: 13 additions & 0 deletions dropshot/Cargo.toml
@@ -48,6 +48,11 @@ tokio-rustls = "0.25.0"
toml = "0.8.19"
waitgroup = "0.1.2"

opentelemetry = { version = "0.27", optional = true }
opentelemetry-http = { version = "0.27", features = ["hyper"], optional = true }
opentelemetry-semantic-conventions = { version = "0.27", optional = true }
tracing = { version = "0.1", optional = true }

[dependencies.chrono]
version = "0.4.38"
features = [ "serde", "std", "clock" ]
@@ -108,6 +113,8 @@ pem = "3.0"
rcgen = "0.13.1"
# Used in a doc-test demonstrating the WebsocketUpgrade extractor.
tokio-tungstenite = "0.24.0"
# Used in otel-tracing examples
equinix-otel-tools = { git = "https://github.com/nshalman/rust-otel-tools", rev = "cae47d6a0a891bc095a2d920e2bf2ccbda2be178" }

[dev-dependencies.reqwest]
version = "0.12.9"
Expand Down Expand Up @@ -136,6 +143,12 @@ version_check = "0.9.5"
[features]
usdt-probes = ["usdt/asm"]
internal-docs = ["simple-mermaid"]
otel-tracing = ["opentelemetry", "opentelemetry-http", "opentelemetry-semantic-conventions"]
tokio-tracing = ["tracing"]

[package.metadata.docs.rs]
features = ["internal-docs"]

[[example]]
name = "otel"
required-features = ["otel-tracing"]
4 changes: 4 additions & 0 deletions dropshot/examples/multiple-servers.rs
@@ -100,6 +100,10 @@ use tokio::sync::Mutex;

#[tokio::main]
async fn main() -> Result<(), String> {
    // XXX Is there interest in adding the optional integration to some of
    // the existing examples?
    #[cfg(feature = "otel-tracing")]
    let _otel_guard = equinix_otel_tools::init(env!("CARGO_CRATE_NAME"));

    // Initial set of servers to start. Once they're running, we may add or
    // remove servers based on client requests.
    let initial_servers = [("A", "127.0.0.1:12345"), ("B", "127.0.0.1:12346")];
282 changes: 282 additions & 0 deletions dropshot/examples/otel.rs
@@ -0,0 +1,282 @@
// Copyright 2024 Oxide Computer Company
//! Example use of Dropshot with OpenTelemetry integration.
//!
//! equinix-otel-tools parses the standard OpenTelemetry exporter
//! environment variables (such as `OTEL_EXPORTER_OTLP_ENDPOINT`).
//! If you launch an otel-collector or an otel-enabled jaeger-all-in-one
//! listening for OTLP over gRPC, then you can do:
//!
//! ```bash
//! export OTEL_EXPORTER_OTLP_ENDPOINT=grpc://localhost:4317
//! export OTEL_EXPORTER_OTLP_INSECURE=true
//! export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
//! cargo run --features=otel-tracing --example otel&
//! curl http://localhost:4000/get
//! ```
//!
//! And you should see an example trace.

use dropshot::endpoint;
use dropshot::ApiDescription;
use dropshot::ConfigDropshot;
use dropshot::ConfigLogging;
use dropshot::ConfigLoggingLevel;
use dropshot::HttpError;
use dropshot::HttpResponseOk;
use dropshot::HttpResponseUpdatedNoContent;
use dropshot::HttpServerStarter;
use dropshot::RequestContext;
use dropshot::TypedBody;
use schemars::JsonSchema;
use serde::Deserialize;
use serde::Serialize;
use std::sync::atomic::AtomicU64;
use std::sync::atomic::Ordering;

use http_body_util::Full;
use hyper_util::{client::legacy::Client, rt::TokioExecutor};
use opentelemetry::{
    global,
    trace::{SpanKind, TraceContextExt, Tracer},
    Context,
};
use opentelemetry_http::{Bytes, HeaderInjector};

#[tokio::main]
async fn main() -> Result<(), String> {
    let _otel_guard = equinix_otel_tools::init("otel-demo");

    let config_dropshot = ConfigDropshot {
        bind_address: "127.0.0.1:4000".parse().unwrap(),
        ..Default::default()
    };

    // For simplicity, we'll configure an "info"-level logger that writes to
    // stderr assuming that it's a terminal.
    let config_logging =
        ConfigLogging::StderrTerminal { level: ConfigLoggingLevel::Info };
    let log = config_logging
        .to_logger("example-basic")
        .map_err(|error| format!("failed to create logger: {}", error))?;

    // Build a description of the API.
    let mut api = ApiDescription::new();
    api.register(example_api_get_counter).unwrap();
    api.register(example_api_put_counter).unwrap();
    api.register(example_api_get).unwrap();
    api.register(example_api_error).unwrap();
    api.register(example_api_panic).unwrap();

    // The functions that implement our API endpoints will share this context.
    let api_context = ExampleContext::new();

    // Set up the server.
    let server =
        HttpServerStarter::new(&config_dropshot, api, api_context, &log)
            .map_err(|error| format!("failed to create server: {}", error))?
            .start();

    // Wait for the server to stop. Note that there's not any code to shut down
    // this server, so we should never get past this point.
    let _ = server.await;
    Ok(())
}

/// Application-specific example context (state shared by handler functions)
#[derive(Debug)]
struct ExampleContext {
    /// counter that can be manipulated by requests to the HTTP API
    counter: AtomicU64,
}

impl ExampleContext {
    /// Return a new ExampleContext.
    pub fn new() -> ExampleContext {
        ExampleContext { counter: AtomicU64::new(0) }
    }
}

// HTTP API interface

/// `CounterValue` represents the value of the API's counter, either as the
/// response to a GET request to fetch the counter or as the body of a PUT
/// request to update the counter.
#[derive(Debug, Deserialize, Serialize, JsonSchema)]
struct CounterValue {
    counter: u64,
}

/// Build a GET request for `uri` whose headers carry the trace context from
/// `cx` (for example a `traceparent` header), ready to send with a hyper
/// client.
async fn traced_request(
    uri: &str,
    cx: &Context,
) -> hyper::Request<Full<Bytes>> {
    let mut req = hyper::Request::builder()
        .uri(uri)
        .method(hyper::Method::GET)
        .header("accept", "application/json")
        .header("content-type", "application/json");
    // Ask the globally configured propagator to write its headers into the
    // request under construction.
    global::get_text_map_propagator(|propagator| {
        propagator.inject_context(
            cx,
            &mut HeaderInjector(req.headers_mut().unwrap()),
        )
    });
    // GET requests carry no payload, so send an empty body.
    req.body(Full::new(Bytes::new())).unwrap()
}
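
For readers unfamiliar with what gets injected: with the W3C Trace Context propagator, the header in question is `traceparent`, formatted as `version-traceid-parentid-flags` in lowercase hex. The following is a minimal std-only sketch of that format, not the `opentelemetry` implementation. `TraceParent`, `is_lower_hex`, and `inject` are hypothetical names invented for illustration, and the sketch only accepts version `00`.

```rust
use std::collections::HashMap;

/// Hypothetical type: the identifiers carried by a W3C `traceparent` header.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TraceParent {
    pub trace_id: u128,
    pub span_id: u64,
    pub sampled: bool,
}

/// True if `s` consists only of lowercase hexadecimal digits.
fn is_lower_hex(s: &str) -> bool {
    s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

impl TraceParent {
    /// Render as `00-<32 hex trace id>-<16 hex span id>-<2 hex flags>`.
    pub fn to_header(&self) -> String {
        format!(
            "00-{:032x}-{:016x}-{:02x}",
            self.trace_id,
            self.span_id,
            u8::from(self.sampled)
        )
    }

    /// Parse a version-00 header; returns `None` for malformed input.
    pub fn parse(value: &str) -> Option<TraceParent> {
        let parts: Vec<&str> = value.trim().split('-').collect();
        if parts.len() != 4 || parts[0] != "00" {
            return None;
        }
        let lengths_ok = parts[1].len() == 32
            && parts[2].len() == 16
            && parts[3].len() == 2;
        if !lengths_ok || !parts[1..].iter().all(|p| is_lower_hex(p)) {
            return None;
        }
        let trace_id = u128::from_str_radix(parts[1], 16).ok()?;
        let span_id = u64::from_str_radix(parts[2], 16).ok()?;
        let flags = u8::from_str_radix(parts[3], 16).ok()?;
        // All-zero trace and span IDs are invalid.
        if trace_id == 0 || span_id == 0 {
            return None;
        }
        Some(TraceParent {
            trace_id,
            span_id,
            sampled: (flags & 0x01) == 0x01,
        })
    }
}

/// Hypothetical stand-in for a header injector: write the propagation
/// header into an outgoing request's header map.
pub fn inject(headers: &mut HashMap<String, String>, parent: &TraceParent) {
    headers.insert("traceparent".to_string(), parent.to_header());
}
```

A downstream service that parses this header and starts its spans under the same trace ID is what makes the nested requests in `example_api_get` show up as a single trace.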

/// Do a bunch of work to show off otel tracing
#[endpoint {
    method = GET,
    path = "/get",
}]
#[cfg_attr(feature = "tokio-tracing", tracing::instrument(skip_all))]
async fn example_api_get(
    rqctx: RequestContext<ExampleContext>,
) -> Result<HttpResponseOk<CounterValue>, HttpError> {
    let trace_context = opentelemetry::Context::new();
    let parent_context =
        opentelemetry::trace::TraceContextExt::with_remote_span_context(
            &trace_context,
            rqctx.span_context.clone(),
        );

    let client = Client::builder(TokioExecutor::new()).build_http();
    let tracer = global::tracer("");
    let span = tracer
        .span_builder(String::from("example_api_get"))
        .with_kind(SpanKind::Internal)
        .start_with_context(&tracer, &parent_context);
    let cx = Context::current_with_span(span);
    //assert!(cx.has_active_span());

    let mut req = hyper::Request::builder()
        .uri("http://localhost:4000/counter")
        .method(hyper::Method::GET)
        .header("accept", "application/json")
        .header("content-type", "application/json");
    global::get_text_map_propagator(|propagator| {
        propagator.inject_context(
            &cx,
            &mut HeaderInjector(req.headers_mut().unwrap()),
        )
    });
    let _res = client
        .request(req.body(Full::new(Bytes::from("".to_string()))).unwrap())
        .await;

    let mut req = hyper::Request::builder()
        .uri("http://localhost:4000/counter")
        .method(hyper::Method::GET)
        .header("accept", "application/json")
        .header("content-type", "application/json");
    global::get_text_map_propagator(|propagator| {
        propagator.inject_context(
            &cx,
            &mut HeaderInjector(req.headers_mut().unwrap()),
        )
    });
    let _res = client
        .request(req.body(Full::new(Bytes::from("".to_string()))).unwrap())
        .await;

    let mut req = hyper::Request::builder()
        .uri("http://localhost:4000/does-not-exist")
        .method(hyper::Method::GET)
        .header("accept", "application/json")
        .header("content-type", "application/json")
        .header("user-agent", "dropshot-otel-example");
    global::get_text_map_propagator(|propagator| {
        propagator.inject_context(
            &cx,
            &mut HeaderInjector(req.headers_mut().unwrap()),
        )
    });
    let _res = client
        .request(req.body(Full::new(Bytes::from("".to_string()))).unwrap())
        .await;

    let req = traced_request("http://localhost:4000/error", &cx).await;
    let _res = client.request(req).await;

    let req = traced_request("http://localhost:4000/panic", &cx).await;
    let _res = client.request(req).await;

    let api_context = rqctx.context();
    Ok(HttpResponseOk(CounterValue {
        counter: api_context.counter.load(Ordering::SeqCst),
    }))
}

/// Fetch the current value of the counter.
#[endpoint {
    method = GET,
    path = "/counter",
}]
async fn example_api_get_counter(
    rqctx: RequestContext<ExampleContext>,
) -> Result<HttpResponseOk<CounterValue>, HttpError> {
    let api_context = rqctx.context();

    Ok(HttpResponseOk(CounterValue {
        counter: api_context.counter.load(Ordering::SeqCst),
    }))
}

/// Cause an error!
#[endpoint {
    method = GET,
    path = "/error",
}]
async fn example_api_error(
    _rqctx: RequestContext<ExampleContext>,
) -> Result<HttpResponseOk<CounterValue>, HttpError> {
    // XXX Why does a panic here produce a 499 rather than a 500 error?
    // This feels like a bug in Dropshot. As a downstream consumer, I just
    // want anything that blows up in my handler to become a somewhat useful
    // 500 error. It does help that the compiler is strict about what a
    // handler can otherwise return.
    //panic!("This handler is totally broken!");

    Err(HttpError::for_internal_error("This endpoint is broken".to_string()))
}

/// This endpoint panics!
#[endpoint {
    method = GET,
    path = "/panic",
}]
#[cfg_attr(feature = "tokio-tracing", tracing::instrument(skip_all, err))]
async fn example_api_panic(
    _rqctx: RequestContext<ExampleContext>,
) -> Result<HttpResponseOk<CounterValue>, HttpError> {
    // XXX Why does this produce a 499 rather than a 500 error?
    // This feels like a bug in Dropshot. As a downstream consumer, I just
    // want anything that blows up in my handler to become a somewhat useful
    // 500 error. It does help that the compiler is strict about what a
    // handler can otherwise return.
    panic!("This handler is totally broken!");
}
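
The behavior the XXX comments ask for (a panic in a handler surfacing as a 500) can be sketched independently of Dropshot. The following is a minimal std-only illustration and makes no claim about how Dropshot handles panics. `SimpleResponse` and `run_guarded` are hypothetical names, and the sketch covers only a synchronous handler; an async handler would need an async equivalent.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical, simplified response type: a status code and a message.
#[derive(Debug, PartialEq, Eq)]
pub struct SimpleResponse {
    pub status: u16,
    pub message: String,
}

/// Run `handler`, converting a panic into a 500 response instead of
/// letting it unwind into the caller.
pub fn run_guarded<F>(handler: F) -> SimpleResponse
where
    F: FnOnce() -> SimpleResponse,
{
    // `AssertUnwindSafe` is an assumption that the handler does not leave
    // shared state half-updated when it panics.
    match panic::catch_unwind(AssertUnwindSafe(handler)) {
        Ok(response) => response,
        Err(payload) => {
            // Panic payloads are usually `&'static str` or `String`.
            let detail = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "unknown panic".to_string());
            SimpleResponse {
                status: 500,
                message: format!("handler panicked: {detail}"),
            }
        }
    }
}
```

Calling `run_guarded(|| panic!("This handler is totally broken!"))` yields a response with status 500 whose message includes the panic text. The default panic hook still prints to stderr, and nothing is caught when the binary is built with `panic = "abort"`.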

/// Update the current value of the counter. Note that the special value of 10
/// is not allowed (just to demonstrate how to generate an error).
#[endpoint {
    method = PUT,
    path = "/counter",
}]
#[cfg_attr(feature = "tokio-tracing", tracing::instrument(skip_all, err))]
async fn example_api_put_counter(
    rqctx: RequestContext<ExampleContext>,
    update: TypedBody<CounterValue>,
) -> Result<HttpResponseUpdatedNoContent, HttpError> {
    let api_context = rqctx.context();
    let updated_value = update.into_inner();

    if updated_value.counter == 10 {
        Err(HttpError::for_bad_request(
            Some(String::from("BadInput")),
            format!("do not like the number {}", updated_value.counter),
        ))
    } else {
        api_context.counter.store(updated_value.counter, Ordering::SeqCst);
        Ok(HttpResponseUpdatedNoContent())
    }
}
15 changes: 15 additions & 0 deletions dropshot/src/handler.rs
@@ -86,6 +86,8 @@ pub struct RequestContext<Context: ServerContext> {
    pub log: Logger,
    /// basic request information (method, URI, etc.)
    pub request: RequestInfo,
    #[cfg(feature = "otel-tracing")]
    pub span_context: opentelemetry::trace::SpanContext,
Comment on lines +89 to +90
Contributor Author

This is gross and I need something better.
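
One possible direction, offered as a sketch and not as Dropshot's design: keep the struct layout independent of cargo features by storing optional per-request values in a type-keyed map, similar in spirit to the extensions map in the `http` crate. Everything below (`Extensions`, `SketchSpanContext`, `SketchRequestContext`) is hypothetical and std-only.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Hypothetical type-keyed bag of optional per-request values.
#[derive(Default)]
pub struct Extensions {
    map: HashMap<TypeId, Box<dyn Any + Send + Sync>>,
}

impl Extensions {
    /// Store `value`, replacing any previous value of the same type.
    pub fn insert<T: Any + Send + Sync>(&mut self, value: T) {
        self.map.insert(TypeId::of::<T>(), Box::new(value));
    }

    /// Borrow the stored value of type `T`, if one was inserted.
    pub fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.map
            .get(&TypeId::of::<T>())
            .and_then(|boxed| (**boxed).downcast_ref::<T>())
    }
}

/// Hypothetical stand-in for an OpenTelemetry span context.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SketchSpanContext {
    pub trace_id: u128,
    pub span_id: u64,
}

/// Hypothetical request context whose fields do not vary with features.
#[derive(Default)]
pub struct SketchRequestContext {
    pub request_id: String,
    pub extensions: Extensions,
}
```

With this shape, tracing code would call `rqctx.extensions.insert(span_context)` when the feature is enabled, and handlers would read it back with `rqctx.extensions.get::<SketchSpanContext>()`, getting `None` when tracing is off. The trade-off is a small lookup cost and a lookup that can fail at runtime where the feature-gated field fails at compile time.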

}

// This is deliberately as close to compatible with `hyper::Request` as
@@ -377,6 +379,19 @@ where
}
}

impl std::fmt::Display for HandlerError {
    fn fmt(&self, f: &mut Formatter) -> FmtResult {
        write!(
            f,
            "{}",
            match self {
                Self::Handler { ref message, .. } => message,
                Self::Dropshot(ref e) => &e.external_message,
            }
        )
    }
}

Comment on lines +382 to +394
Contributor Author

I think this is for tokio-tracing and could potentially be dropped or separated into a dedicated commit. Really all the tokio-tracing bits should be a separate commit now that I think about it.
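
For context on what the `Display` impl above does, here is a self-contained sketch of the same pattern: each variant is formatted as its human-readable message. `SketchHandlerError` and `SketchHttpError` are hypothetical stand-ins that only mirror the two variants matched in the diff; the real types carry more fields.

```rust
use std::fmt;

/// Hypothetical stand-in for an HTTP error with a client-facing message.
#[derive(Debug)]
pub struct SketchHttpError {
    pub external_message: String,
}

/// Hypothetical two-variant error mirroring the shape matched in the diff.
#[derive(Debug)]
pub enum SketchHandlerError {
    Handler { message: String, status: u16 },
    Dropshot(SketchHttpError),
}

impl fmt::Display for SketchHandlerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Pick the message that belongs to whichever variant we hold.
        let message = match self {
            Self::Handler { message, .. } => message,
            Self::Dropshot(e) => &e.external_message,
        };
        f.write_str(message)
    }
}
```

With `Display` in place, `to_string()` and `{}` formatting work on the error. That is presumably what an `err`-style tracing attribute needs in order to record the error as text, though this is an assumption about the motivation and the diff does not confirm it.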

/// An error type which can be converted into an HTTP response.
///
/// The error types returned by handlers must implement this trait, so that a
2 changes: 2 additions & 0 deletions dropshot/src/lib.rs
@@ -834,6 +834,8 @@ mod from_map;
mod handler;
mod http_util;
mod logging;
#[cfg(feature = "otel-tracing")]
mod otel;
mod pagination;
mod router;
mod schema_util;