Add subcommand stats to support data stream analysis. #134
src/bin/ion/commands/stats.rs
Outdated
    .expect("There is an error occurred while writing the symtab_count.");
writeln!(
    writer,
    "The maximum depth of the input data stream is {}",
- "The maximum depth of the input data stream is {}",
+ "The maximum container depth is {}",
src/bin/ion/commands/stats.rs
Outdated
let histogram = plot::Histogram::new(&out.size_vec, options);
writeln!(
    writer,
    "The 'samples' field represents the total number of top-level value of input data stream. The unit of min, max ,avg size is bytes.\n\
- "The 'samples' field represents the total number of top-level value of input data stream. The unit of min, max ,avg size is bytes.\n\
+ "The 'samples' field represents the total number of top-level value of input data stream. The unit of min, max, avg size is bytes.\n\
Rather than explain what 'samples' means here, let's just rename 'samples' to "top-level values".
I see. We could always format the Histogram ourselves, but this is probably fine for now.
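For reference, the figures behind that summary (the number of top-level values plus the minimum, maximum, and average size in bytes) could be derived roughly as in the sketch below. This is a standalone illustration rather than the code in this PR: `SizeSummary` and `summarize_sizes` are hypothetical names, and the sizes are assumed to be byte counts stored as `usize`, which may not match the element type of `out.size_vec`.

```rust
/// Hypothetical summary of the serialized sizes (in bytes) of the
/// top-level values in a stream. Not part of the PR.
#[derive(Debug, PartialEq)]
struct SizeSummary {
    top_level_values: usize,
    min: usize,
    max: usize,
    avg: f64,
}

/// Summarizes a slice of per-value sizes.
/// Returns `None` for an empty stream so the average never divides by zero.
fn summarize_sizes(sizes: &[usize]) -> Option<SizeSummary> {
    let min = *sizes.iter().min()?;
    let max = *sizes.iter().max()?;
    let total: usize = sizes.iter().sum();
    Some(SizeSummary {
        top_level_values: sizes.len(),
        min,
        max,
        avg: total as f64 / sizes.len() as f64,
    })
}
```

Formatting the output ourselves, as suggested above, would then only need these four fields.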
src/bin/ion/commands/stats.rs
Outdated
    .expect("There is an error occurred while writing the symbols_count.");
writeln!(
    writer,
    "The number of local symbol tables is {} ",
Local symbol tables
- "The number of local symbol tables is {} ",
+ "Local symbol tables: {} ",
src/bin/ion/commands/stats.rs
Outdated
    histogram
)
.expect("There is an error occurred while plotting the size distribution of input data stream.");
writeln!(writer, "The number of symbols is {} ", out.symbols_count)
- writeln!(writer, "The number of symbols is {} ", out.symbols_count)
+ writeln!(writer, "Symbols: {} ", out.symbols_count)
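A minimal sketch of the shorter "label: value" style suggested in these threads, written against any `std::io::Write` so it can be checked with an in-memory buffer. The `Stats` struct and `write_report` function are hypothetical and only illustrate the idea. The sketch also returns the I/O error with `?` instead of calling `.expect(...)` and drops the trailing space after `{}`; both are choices made here, not something the PR is confirmed to do.

```rust
use std::io::{self, Write};

/// Hypothetical holder for two of the counters discussed in this review.
struct Stats {
    symbols_count: usize,
    symtab_count: usize,
}

/// Writes one short "label: value" line per counter.
/// I/O errors are returned to the caller rather than causing a panic.
fn write_report<W: Write>(writer: &mut W, stats: &Stats) -> io::Result<()> {
    writeln!(writer, "Symbols: {}", stats.symbols_count)?;
    writeln!(writer, "Local symbol tables: {}", stats.symtab_count)?;
    Ok(())
}
```

Because the function is generic over the writer, a unit test can pass a `Vec<u8>` and compare the captured text, while the command itself can pass standard output or a file.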
src/bin/ion/commands/stats.rs
Outdated
    .expect("There is an error occurred while writing the symtab_count.");
writeln!(
    writer,
    "The maximum depth of the input data stream is {}",
- "The maximum depth of the input data stream is {}",
+ "Maximum container depth: {}",
src/bin/ion/commands/stats.rs
Outdated
.expect_list()
.unwrap()
.iter()
.map(|v| top_level_max_depth(v.unwrap(), depth + 1))
Given the number of times we've seen the unbounded recursion thing pop up, let's do this without recursion.
I agree, an iterative approach would be a better option here.
* Use an iterative method to calculate the maximum container depth.
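To make the recursion-free approach concrete, here is a minimal sketch that replaces the call stack with an explicit `Vec` of `(value, depth)` pairs, so deeply nested input cannot overflow the thread's stack. The `Value` enum is a hypothetical stand-in for an Ion value and is not the ion-rs API. The depth convention (a top-level value is at depth 0 and each level of nesting adds 1) is also an assumption of this sketch.

```rust
/// Hypothetical stand-in for an Ion value; not the ion-rs API.
enum Value {
    Scalar,
    List(Vec<Value>),
    SExp(Vec<Value>),
    Struct(Vec<(String, Value)>),
}

/// Returns the deepest nesting level reached below `top_level`,
/// using an explicit work stack instead of recursion.
fn max_container_depth(top_level: &Value) -> usize {
    let mut max_depth = 0;
    // Each entry pairs a value with the depth at which it was found.
    let mut stack = vec![(top_level, 0usize)];
    while let Some((value, depth)) = stack.pop() {
        max_depth = max_depth.max(depth);
        match value {
            Value::Scalar => {}
            // In this toy model both sequence kinds hold a Vec,
            // so one arm can serve both.
            Value::List(children) | Value::SExp(children) => {
                stack.extend(children.iter().map(|child| (child, depth + 1)));
            }
            Value::Struct(fields) => {
                stack.extend(fields.iter().map(|(_, child)| (child, depth + 1)));
            }
        }
    }
    max_depth
}
```

The work stack grows on the heap, so its size is bounded by available memory rather than by the thread's stack limit.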
    for field in struct_value {
        stack.push((field.unwrap().value(), depth + 1));
    }
} else if current_value.ion_type() == IonType::List {
This branch should include s-expressions too, right?
Thanks for calling it out, s-expressions should be processed too. In my opinion, there should be another branch to process s-expressions specifically. Please correct me if I'm wrong, but as I understand it, the current way we read the value out of LazyValue doesn't allow us to merge both lists and s-expressions into one branch. We can only call current_value.read().unwrap().expect_list() and current_value.read().unwrap().expect_sexp() separately.
Sure, that's fine.
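As an illustration of the two-branch shape agreed on here, the sketch below models lists and s-expressions as distinct types, so they cannot share one match arm, and routes both through a small generic helper to limit duplication. Everything in it is hypothetical: `Value`, `List`, `SExp`, `push_children`, and `max_sequence_depth` are stand-ins, and the real ion-rs lazy types and their methods may differ.

```rust
/// Hypothetical stand-ins for distinct list and s-expression types.
struct List(Vec<Value>);
struct SExp(Vec<Value>);

enum Value {
    Scalar,
    List(List),
    SExp(SExp),
}

impl List {
    fn iter(&self) -> std::slice::Iter<'_, Value> {
        self.0.iter()
    }
}

impl SExp {
    fn iter(&self) -> std::slice::Iter<'_, Value> {
        self.0.iter()
    }
}

/// Pushes every child onto the work stack, one level deeper than its parent.
/// Generic over the iterator, so either sequence type can use it.
fn push_children<'a>(
    stack: &mut Vec<(&'a Value, usize)>,
    children: impl Iterator<Item = &'a Value>,
    depth: usize,
) {
    stack.extend(children.map(|child| (child, depth + 1)));
}

fn max_sequence_depth(top_level: &Value) -> usize {
    let mut max_depth = 0;
    let mut stack = vec![(top_level, 0usize)];
    while let Some((value, depth)) = stack.pop() {
        max_depth = max_depth.max(depth);
        match value {
            Value::Scalar => {}
            // Two separate branches, because the payload types differ.
            Value::List(list) => push_children(&mut stack, list.iter(), depth),
            Value::SExp(sexp) => push_children(&mut stack, sexp.iter(), depth),
        }
    }
    max_depth
}
```

The branches stay separate, but the traversal logic lives in one place, so a fix to the depth bookkeeping applies to both container kinds.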
Issue #, if available:
#42
This PR is an updated version of PR #104. Because a lot of refactoring has been done, a new PR should be easier to review.
Description of changes:
This PR adds the stats subcommand to analyze the input binary Ion data stream, allowing users to:
This PR also removes count, since stats already includes this information.
Test:
A unit test has been added at the end of stats.rs.
Input:
ion --unstable stats test.10n
Output:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.