Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds incompleteness detection tests #819

Merged
merged 3 commits into from
Aug 22, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 0 additions & 13 deletions src/lazy/encoder/text/v1_0/value_writer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -166,19 +166,6 @@ pub(crate) struct TextContainerWriter_1_0<'a, W: Write> {
trailing_delimiter: &'static str,
}

impl<'a, W: Write> Drop for TextContainerWriter_1_0<'a, W> {
fn drop(&mut self) {
// If the user didn't call `end`, the closing delimiter was not written to output.
// It's too late to call it here because we can't return a `Result`.
if !self.has_been_closed {
panic!(
"Container writer ({:?}) was dropped without calling `end()`.",
self.container_type
);
}
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ I've disabled this check until we address #812.

impl<'a, W: Write> TextContainerWriter_1_0<'a, W> {
pub fn new(
writer: &'a mut LazyRawTextWriter_1_0<W>,
Expand Down
2 changes: 1 addition & 1 deletion src/lazy/expanded/macro_evaluator.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1781,7 +1781,7 @@ mod tests {
(values
(make_string "foo" '''bar''' "\x62\u0061\U0000007A")
(make_string
'''Hello'''
'''Hello'''
''', '''
"world!"))
)
Expand Down
5 changes: 4 additions & 1 deletion src/lazy/expanded/struct.rs
Original file line number Diff line number Diff line change
Expand Up @@ -534,7 +534,10 @@ impl<'top, D: Decoder> ExpandedStructIterator<'top, D> {
evaluator.push(expansion);
let expanded_value = match evaluator.next()? {
Some(item) => item,
None => return IonResult::decoding_error(format!("macros in field name position must produce a single struct; '{:?}' produced nothing", invocation)),
None => {
// The macro produced an empty stream; return to reading from input.
return Ok(());
}
Comment on lines -537 to +540
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ The ion-tests have a few examples where (:void) in struct field name position is elided. My implementation had required output of exactly-one struct. I've loosened it to zero-or-one. I imagine I'll loosen it further to zero-or-more later on.

};
let struct_ = match expanded_value.read()? {
ExpandedValueRef::Struct(s) => s,
Expand Down
404 changes: 312 additions & 92 deletions src/lazy/text/buffer.rs

Large diffs are not rendered by default.

6 changes: 4 additions & 2 deletions src/lazy/text/matched.rs
Original file line number Diff line number Diff line change
Expand Up @@ -309,8 +309,10 @@ impl MatchedDecimal {

let digits_text = sanitized.as_utf8(digits.offset())?;
let magnitude: Int = i128::from_str(digits_text)
.map_err(|_| {
IonError::decoding_error("decimal magnitude was larger than supported size")
.map_err(|e| {
IonError::decoding_error(format!(
"decimal magnitude '{digits_text}' was larger than supported size ({e:?}"
))
})?
.into();

Expand Down
24 changes: 17 additions & 7 deletions src/lazy/text/parse_result.rs
Original file line number Diff line number Diff line change
Expand Up @@ -141,13 +141,23 @@ impl<'data> From<InvalidInputError<'data>> for IonError {
let (buffer_head, buffer_tail) = match input.as_text() {
// The buffer contains UTF-8 bytes, so we'll display it as text
Ok(text) => {
let head = text.chars().take(NUM_CHARS_TO_SHOW).collect::<String>();
let tail_backwards = text
.chars()
.rev()
let mut head_chars = text.chars();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This part of the PR is making sure that the error's Debug formatting only includes a ... in the head or tail output when the buffer has more data than what's shown.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the little things.

let mut head = (&mut head_chars)
.take(NUM_CHARS_TO_SHOW)
.collect::<String>();
if head_chars.next().is_some() {
head.push_str("...");
}
let mut tail_chars = text.chars().rev();
let tail_backwards = (&mut tail_chars)
.take(NUM_CHARS_TO_SHOW)
.collect::<Vec<char>>();
let tail = tail_backwards.iter().rev().collect::<String>();
let mut tail = String::new();
if tail_chars.next().is_some() {
tail.push_str("...");
}
tail.push_str(tail_backwards.iter().rev().collect::<String>().as_str());

(head, tail)
}
// The buffer contains non-text bytes, so we'll show its contents as formatted hex
Expand All @@ -170,8 +180,8 @@ impl<'data> From<InvalidInputError<'data>> for IonError {
message,
r#"
offset={}
buffer head=<{}...>
buffer tail=<...{}>
buffer head=<{}>
buffer tail=<{}>
buffer len={}
"#,
invalid_input_error.input.offset(),
Expand Down
2 changes: 1 addition & 1 deletion src/lazy/text/raw/v1_1/reader.rs
Original file line number Diff line number Diff line change
Expand Up @@ -228,7 +228,7 @@ impl<'a> Debug for LazyRawTextList_1_1<'a> {
#[derive(Debug, Copy, Clone)]
pub struct RawTextListIterator_1_1<'top> {
input: TextBufferView<'top>,
// If this iterator has returned an error, it should return `None` forever afterwards
// If this iterator has returned an error, it should return `None` forever afterward
has_returned_error: bool,
}

Expand Down
48 changes: 48 additions & 0 deletions tests/ion_tests/detect_incomplete_text.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
#![cfg(feature = "experimental-reader-writer")]

use crate::ion_tests::{DataStraw, ELEMENT_GLOBAL_SKIP_LIST};
use ion_rs::{AnyEncoding, ElementReader, IonError, IonStream, Reader};
use std::fs;
use std::io::BufReader;
use test_generator::test_resources;

#[test_resources("ion-tests/iontestdata_1_1/good/**/*.ion")]
fn detect_incomplete_input(file_name: &str) {
// Canonicalize the file name so it can be compared to skip list file names without worrying
// about path separators.
let file_name: String = fs::canonicalize(file_name)
.unwrap()
.to_string_lossy()
.into();
// Map each 1.0 skip list file name to the 1.1 equivalent
let skip_list_1_1: Vec<String> = ELEMENT_GLOBAL_SKIP_LIST
.iter()
.map(|file_1_0| file_1_0.replace("_1_0", "_1_1"))
.filter_map(|filename| {
// Canonicalize the skip list file names so they're in the host OS' preferred format.
// This involves looking up the actual file; if canonicalization fails, the file could
// not be found/read which could mean the skip list is outdated.
fs::canonicalize(filename).ok()
})
.map(|filename| filename.to_string_lossy().into())
.collect();
if skip_list_1_1.contains(&file_name.to_owned()) {
return;
}
println!("testing {file_name}");
let file = fs::File::open(file_name).unwrap();
let buf_reader = BufReader::new(file);
let input = DataStraw::new(buf_reader);
let ion_stream = IonStream::new(input);
let mut reader = Reader::new(AnyEncoding, ion_stream).unwrap();
// Manually unwrap to allow for pretty-printing of errors
match reader.read_all_elements() {
Ok(_) => {}
Err(IonError::Decoding(e)) => {
panic!("{:?}: {}", e.position(), e);
}
Err(other) => {
panic!("{other:#?}");
}
}
}
29 changes: 29 additions & 0 deletions tests/ion_tests/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@
#![allow(dead_code)]

use std::fs::read;
use std::io;
use std::io::Read;
use std::path::MAIN_SEPARATOR_STR as PATH_SEPARATOR;

use ion_rs::v1_0;
Expand All @@ -12,6 +14,7 @@ use ion_rs::{
Symbol, Value,
};

mod detect_incomplete_text;
pub mod lazy_element_ion_tests;

/// Concatenates two slices of string slices together.
Expand Down Expand Up @@ -435,3 +438,29 @@ pub const ELEMENT_EQUIVS_SKIP_LIST: SkipList = &[
"ion-tests/iontestdata_1_0/good/equivs/localSymbolTableNullSlots.ion",
"ion-tests/iontestdata_1_0/good/equivs/nonIVMNoOps.ion",
];

/// An implementation of `io::Read` that only yields a single byte on each
/// call to `read()`. This is used in tests to confirm that the reader's
/// data-incompleteness and retry logic will properly handle all corner
/// cases.
pub(crate) struct DataStraw<R> {
input: R,
}

impl<R> DataStraw<R> {
pub fn new(input: R) -> Self {
Self { input }
}
}

impl<R: Read> Read for DataStraw<R> {
fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
let single_byte_buffer = &mut [0u8; 1];
let bytes_read = self.input.read(single_byte_buffer)?;
if bytes_read == 0 {
return Ok(0);
}
buf[0] = single_byte_buffer[0];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is byte-by-byte enough for full coverage, or do you also need to test what happens when the stream returns 0 bytes at any point (i.e. having the DataStraw alternate between returning one byte and zero bytes)? Or is the reader not continuable, but just needs to be able to handle arbitrary-sized chunks being provided from the input?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading one byte at a time should be enough for full coverage. The reader is always operating on a fixed input buffer slice that holds (at least) the current top level value. If the reader hit an Incomplete and the input yielded zero bytes, the next call to Reader::next() would be trying again on the same buffer slice.

Ok(1)
}
}
Loading