Skip to content

Commit

Permalink
Fix fragment corruption in the GlobalOrderWriter
Browse files Browse the repository at this point in the history
Previously there was an issue with the GlobalOrderWriter when using a
maximum fragment size. If we resumed a write where the first tile would
cause the partially written fragment to overflow the set maximum size,
we failed to check that a single tile could fit. This resulted in a
convoluted path where `dim_tiles[-1].cell_num()` return a reasonable
result (instead of segfaulting or throwing an exception).

This lead us to writing the wrong `last_tile_cell_num` value in the
FragmentMetadata. When reading the last tile from one of these corrupted
fragments, the logic in `Tile::load_chunk_data` would throw an exception
when the expected size was calculated off an invalid
`last_tile_cell_num` which caused an exception to be thrown and rendered
the last tile of the fragment unreadable.

The fix is simply to make sure we check if a single tile can fit and if
not, to create a new fragment. For a belt and suspenders approach to
safety, we've also changed the code in `WriterBase::set_coords_metadata`
to use `std::vector::at` instead of `std::vector::operator[]` so that
bounds checking is performed and assertion is thrown rather than writing
a bad FragmentMetadata to disk.
  • Loading branch information
davisp committed Sep 26, 2023
1 parent e534ae6 commit 0d33951
Show file tree
Hide file tree
Showing 2 changed files with 9 additions and 1 deletion.
2 changes: 1 addition & 1 deletion test/regression/targets/sc-34072.cc
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ TEST_CASE("Global Order Writer Resume Writes Bug", "[global-order-writer][bug][s
// `std::vector::at` which does bounds checking. This causes us to throw
// an exception in `WriterBase::set_coords_metadata` instead of silently
// setting the last tile cell number to some garbage value.
REQUIRE_THROWS(query.submit());
REQUIRE(query.submit() == Query::Status::COMPLETE);

array.close();
}
Expand Down
8 changes: 8 additions & 0 deletions tiledb/sm/query/writers/global_order_writer.cc
Original file line number Diff line number Diff line change
Expand Up @@ -801,6 +801,14 @@ Status GlobalOrderWriter::global_write() {
// Compute the number of tiles that will fit in this fragment.
auto num = num_tiles_to_write(idx, tile_num, tiles);

// If we're resuming a fragment write and the first tile doesn't fit into
// the previous fragment, we need to start a new fragment and recalculate
// the number of tiles to write.
if (current_fragment_size_ > 0 && num == 0) {
RETURN_CANCEL_OR_ERROR(start_new_fragment());
num = num_tiles_to_write(idx, tile_num, tiles);
}

// Set new number of tiles in the fragment metadata
auto new_num_tiles = frag_meta->tile_index_base() + num;
throw_if_not_ok(frag_meta->set_num_tiles(new_num_tiles));
Expand Down

0 comments on commit 0d33951

Please sign in to comment.