Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce new Chunk iteration APIs #8553

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Introduce new Chunk iteration APIs #8553

wants to merge 7 commits into from

Conversation

teh-cmc
Copy link
Member

@teh-cmc teh-cmc commented Dec 20, 2024

  • Way more ergonomic
  • Same performance
  • Less code
  • Support for structs!
  • Basis for new upcoming codegen'd deserialization APIs (...at some point)

@teh-cmc teh-cmc added enhancement New feature or request 🏹 arrow concerning arrow 🧑‍💻 dev experience developer experience (excluding CI) 🔍 re_query affects re_query itself include in changelog labels Dec 20, 2024
Copy link

github-actions bot commented Dec 20, 2024

Web viewer built successfully. If applicable, you should also test it:

  • I have tested the web viewer
Result Commit Link Manifest
c77bd2a https://rerun.io/viewer/pr/8553 +nightly +main

Note: This comment is updated whenever you push a commit.

@teh-cmc teh-cmc marked this pull request as ready for review December 20, 2024 11:00
@Wumpf Wumpf self-requested a review January 3, 2025 12:15
Copy link
Member

@Wumpf Wumpf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noice

Comment on lines +871 to +921
let Some(inner_list_array) = array.as_any().downcast_ref::<Arrow2ListArray<i32>>() else {
if cfg!(debug_assertions) {
panic!("downcast failed for {component_name}, data discarded");
} else {
re_log::error_once!("downcast failed for {component_name}, data discarded");
}
return Either::Left(std::iter::empty());
};

let inner_offsets = inner_list_array.offsets();
let inner_lengths = inner_list_array.offsets().lengths().collect_vec();

let Some(fixed_size_list_array) = inner_list_array
.values()
.as_any()
.downcast_ref::<Arrow2FixedSizeListArray>()
else {
if cfg!(debug_assertions) {
panic!("downcast failed for {component_name}, data discarded");
} else {
re_log::error_once!("downcast failed for {component_name}, data discarded");
}
return Either::Left(std::iter::empty());
};

let Some(values) = fixed_size_list_array
.values()
.as_any()
.downcast_ref::<Arrow2PrimitiveArray<T>>()
else {
if cfg!(debug_assertions) {
panic!("downcast failed for {component_name}, data discarded");
} else {
re_log::error_once!("downcast failed for {component_name}, data discarded");
}
return Either::Left(std::iter::empty());
};

let size = fixed_size_list_array.size();
let values = values.values();

// NOTE: No need for validity checks here, `iter_offsets` already takes care of that.
Either::Right(component_offsets.map(move |(idx, len)| {
let inner_offsets = &inner_offsets.as_slice()[idx..idx + len];
let inner_lengths = &inner_lengths.as_slice()[idx..idx + len];
izip!(inner_offsets, inner_lengths)
.map(|(&idx, &len)| {
let idx = idx as usize;
bytemuck::cast_slice(&values[idx * size..idx * size + len * size])
})
.collect_vec()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

everything except collect_vec is not dependent on N -> we likely can save a lot of template instantiations by putting this into a second level utility method that only depends on T but not on N

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏹 arrow concerning arrow 🧑‍💻 dev experience developer experience (excluding CI) enhancement New feature or request include in changelog 🔍 re_query affects re_query itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants