[FEA] Support `COUNT_ALL` and `COUNT_VALID` as reduce aggregations #13756
Comments
I disagree with this change. I think I'd like to see libcudf split up the reduce algorithms into functional names like Regardless, I don't think we should provide a convenience wrapper on the existing
Thank you for your perspective, @davidwendt.
I see what you mean. What's your view on code such as the following?

```cpp
auto const reduce_results = [&] {
  auto const return_dtype = cudf::detail::target_type(input.type(), aggr.kind);
  if (aggr.kind == aggregation::COUNT_ALL) {
    return cudf::make_fixed_width_scalar(input.size(), stream);
  } else if (aggr.kind == aggregation::COUNT_VALID) {
    return cudf::make_fixed_width_scalar(input.size() - input.null_count(), stream);
  } else {
    return cudf::reduction::detail::reduce(input,
                                           *convert_to<cudf::reduce_aggregation>(aggr),
                                           return_dtype,
                                           std::nullopt,
                                           stream,
                                           rmm::mr::get_current_device_resource());
  }
}();
```

This is the special-casing that I would like to avoid. If

CC @harrism (who had advice on this in the past).
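To make the request concrete, here is a minimal, self-contained sketch of what "no special-casing at the call site" could look like. It deliberately does not use libcudf: `agg_kind`, `toy_column`, and `toy_reduce` are hypothetical stand-ins invented for illustration, and the real `cudf::reduce()` signature and return type differ. The point is only that once the reduce entry point accepts the count kinds itself, the caller's three-way branch collapses into one call.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT libcudf types.
enum class agg_kind { SUM, COUNT_ALL, COUNT_VALID };

struct toy_column {
  // std::nullopt models a null row.
  std::vector<std::optional<std::int64_t>> rows;

  std::int64_t size() const { return static_cast<std::int64_t>(rows.size()); }

  std::int64_t null_count() const
  {
    std::int64_t nulls = 0;
    for (auto const& row : rows) {
      if (!row.has_value()) { ++nulls; }
    }
    return nulls;
  }
};

// A toy reduce that handles the count kinds internally, so generic callers
// can forward any aggregation kind without branching on it first.
std::int64_t toy_reduce(toy_column const& input, agg_kind kind)
{
  switch (kind) {
    case agg_kind::COUNT_ALL: return input.size();
    case agg_kind::COUNT_VALID: return input.size() - input.null_count();
    case agg_kind::SUM: {
      std::int64_t total = 0;
      for (auto const& row : input.rows) {
        if (row.has_value()) { total += *row; }  // nulls are skipped
      }
      return total;
    }
  }
  throw std::invalid_argument("unsupported aggregation kind");
}
```

With something of this shape, the lambda above would reduce to a single `toy_reduce(input, aggr.kind)`-style call for every aggregation kind.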
From my perspective, putting a commonly used feature at the lower level (libcudf) is better than putting it at the application level. Why? Because we can just call the libcudf API and avoid reimplementing the same feature in multiple applications, even if the implementation is very simple.
If we were able to eliminate the enum However, as it stands with Given the tension between these two potential next steps, I would encourage spending more effort towards the functional API that @davidwendt proposed, but I wouldn't oppose expanding the existing reduction API's
This is a follow-up item that arose from #13727.
It turns out that `COUNT_ALL` and `COUNT_VALID` are not supported in `cudf::reduce()` as `reduce_aggregation`s. (This is likely because their values can be trivially computed from the results of `column::size()` and `column::null_count()`.)

However, this causes obfuscation in any code that attempts to dispatch reduce aggregations generically. E.g. Here.

It would be good to have `cudf::reduce()` support `COUNT_ALL` and `COUNT_VALID` natively, so as not to require special handling in the calling code.
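
For reference, the "trivially computed" relationship mentioned above can be written down as two constant-time functions over column metadata. This is a sketch under stated assumptions: `column_meta`, `count_all`, and `count_valid` are hypothetical names made up for this example and are not part of libcudf. It only illustrates that both counts follow from a column's size and null count without touching the row data.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical metadata holder for illustration; NOT a libcudf type.
struct column_meta {
  std::int32_t size;        // total number of rows, nulls included
  std::int32_t null_count;  // number of null rows
};

// COUNT_ALL: every row counts, whether null or not.
constexpr std::int32_t count_all(column_meta c) { return c.size; }

// COUNT_VALID: only non-null rows count.
constexpr std::int32_t count_valid(column_meta c) { return c.size - c.null_count; }

// Compile-time checks of the relationship on a small example.
static_assert(count_all(column_meta{10, 4}) == 10, "all rows are counted");
static_assert(count_valid(column_meta{10, 4}) == 6, "null rows are excluded");
```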