feat: add API to write OpenTelemetry logs to GreptimeDB #4755
base: main
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4755      +/-   ##
==========================================
- Coverage   84.54%   84.17%   -0.38%
==========================================
  Files        1117     1118       +1
  Lines      202738   203341     +603
==========================================
- Hits       171408   171159     -249
- Misses      31330    32182     +852
    PipelineValue::Map(Map { values: map })
}

fn build_identity_schema() -> Vec<ColumnSchema> {
By "identity pipeline", I mean an auto-generated pipeline and schema derived from the input data. Here we have already manually defined a schema for OTel logs, so we'd better not name it, or treat it as, the identity pipeline.
I've considered ways to automatically generate a pipeline, but it would still have to convert the ExportLogsServiceRequest into a PipelineValue before it could be processed. For performance reasons I skipped that step and converted the ExportLogsServiceRequest directly into Rows.
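The direct conversion described above can be sketched roughly as follows. All type names here are simplified stand-ins for the real opentelemetry-proto and GreptimeDB types, not the actual PR code:

```rust
// Simplified stand-ins for the OTLP log record and GreptimeDB row types.
struct LogRecord {
    time_unix_nano: u64,
    body: String,
}

struct Row {
    values: Vec<String>,
}

/// Convert log records straight into rows, skipping the intermediate
/// PipelineValue representation for performance.
fn logs_to_rows(logs: &[LogRecord]) -> Vec<Row> {
    logs.iter()
        .map(|log| Row {
            values: vec![log.time_unix_nano.to_string(), log.body.clone()],
        })
        .collect()
}
```

The trade-off is that the schema is fixed in code rather than derived per request, which is what the review comment above is pointing at.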
fn coerce_nested_value(v: &Value, transform: &Transform) -> Result<Option<ValueData>, String> {
    match &transform.type_ {
        Value::Array(_) | Value::Map(_) => (),
        t => {
            return Err(format!(
                "nested value type not supported {}",
                t.to_str_type()
            ))
        }
    }
    match v {
        Value::Map(_) => {
            let data: jsonb::Value = v.into();
            Ok(Some(ValueData::BinaryValue(data.to_vec())))
        }
        Value::Array(_) => {
            let data: jsonb::Value = v.into();
            Ok(Some(ValueData::BinaryValue(data.to_vec())))
        }
        _ => Err(format!("nested type not supported {}", v.to_str_type())),
    }
}
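The encoding step above can be illustrated with a self-contained sketch. The PR uses the jsonb crate to serialize nested values; here a placeholder Debug-based encoding stands in for jsonb so the example runs on its own, and the Value enum is a simplified stand-in for the pipeline's value type:

```rust
// Simplified stand-in for the pipeline value type.
#[derive(Debug)]
enum Value {
    Map(Vec<(String, Value)>),
    Array(Vec<Value>),
    Int(i64),
}

/// Encode a nested value (map or array) into bytes for a binary column;
/// scalar values are rejected, mirroring coerce_nested_value above.
/// The Debug formatting is a placeholder for the real jsonb encoding.
fn encode_nested(v: &Value) -> Result<Vec<u8>, String> {
    match v {
        Value::Map(_) | Value::Array(_) => Ok(format!("{v:?}").into_bytes()),
        other => Err(format!("nested value type not supported: {other:?}")),
    }
}
```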
The type from the transform is ignored if it's not a map or array. It feels strange that a type field is written in the configuration but never used. Should we add something like a binary type to be more specific?
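One way the suggestion could look, sketched with a hypothetical transform-type enum (these are illustrative definitions, not the PR's actual types):

```rust
/// Hypothetical transform type with an explicit Binary variant, so nested
/// map/array values are declared in the config rather than implied.
#[derive(Debug, PartialEq)]
enum TransformType {
    String,
    Int64,
    Binary, // nested maps/arrays serialized to jsonb bytes
}

/// With an explicit variant, the nested-value path can check the declared
/// type instead of silently ignoring it.
fn accepts_nested(t: &TransformType) -> bool {
    matches!(t, TransformType::Binary)
}
```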
    pipeline_info: PipelineInfo,
    table_info: TableInfo,
I'm not sure we should do this custom extractor. We could just take headers: HeaderMap in the handler and use a plain function to extract the header information.
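The alternative suggested here could look like the sketch below. A std HashMap stands in for axum's HeaderMap so the example is self-contained, and the header name is illustrative, not necessarily the one the PR uses:

```rust
use std::collections::HashMap;

/// Pull the pipeline name out of the request headers with a plain
/// function, instead of a custom axum extractor. The header name
/// "x-greptime-pipeline-name" is an assumption for illustration.
fn extract_pipeline_name(headers: &HashMap<String, String>) -> Option<&str> {
    headers.get("x-greptime-pipeline-name").map(|s| s.as_str())
}
```

The handler would then take the whole header map and call helpers like this one, which keeps the extraction logic testable without implementing FromRequestParts.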
    )),
},
GreptimeValue {
    value_data: Some(ValueData::StringValue(bytes_to_hex_string(&log.trace_id))),
/// A unique identifier for a trace. All logs from the same trace share
/// the same `trace_id`. The ID is a 16-byte array. An ID with all zeroes OR
/// of length other than 16 bytes is considered invalid (empty string in OTLP/JSON
/// is zero-length and thus is also invalid).
///
/// This field is optional.
///
/// The receivers SHOULD assume that the log record is not associated with a
/// trace if any of the following is true:
/// - the field is not present,
/// - the field contains an invalid value.
#[prost(bytes = "vec", tag = "9")]
pub trace_id: ::prost::alloc::vec::Vec<u8>,
According to the comment, we might want to:
- validate the data
- convert it into a human-readable string if the data is valid

Same for span_id.
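A sketch of the validation plus hex conversion being suggested, following the validity rules quoted from the proto comment (the function names are illustrative, not the PR's actual helpers):

```rust
/// Per the OTLP spec, a trace_id is valid only if it is exactly 16 bytes
/// and not all zeroes (span_id follows the same rule with 8 bytes).
fn is_valid_trace_id(id: &[u8]) -> bool {
    id.len() == 16 && id.iter().any(|&b| b != 0)
}

/// Convert a valid trace_id into a lowercase hex string; invalid ids
/// yield None so the caller can store a NULL (or the raw bytes) instead.
fn trace_id_to_hex(id: &[u8]) -> Option<String> {
    if !is_valid_trace_id(id) {
        return None;
    }
    Some(id.iter().map(|b| format!("{b:02x}")).collect())
}
```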
I think we should store anomalies as-is. We just need to store the data; I don't think we need to verify whether it's legitimate.
… type from hashmap to btreemap to keep key order
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
What's changed and what's your intention?
Checklist