Implement HISTOGRAM and MERGE_HISTOGRAM aggregations #14045
Changes from 108 commits
@@ -0,0 +1,57 @@
/*
 * Copyright (c) 2023, NVIDIA CORPORATION.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#pragma once

#include <cudf/column/column_view.hpp>
#include <cudf/scalar/scalar.hpp>
#include <cudf/table/table_view.hpp>

#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_uvector.hpp>

#include <memory>
#include <optional>

namespace cudf::reduction::detail {

/**
 * @brief Compute the frequency of each distinct row in the input table.
 *
 * @param input The input table for which to compute the histogram
 * @param partial_counts An optional column containing the count for each row
 * @param stream CUDA stream used for device memory operations and kernel launches
 * @param mr Device memory resource used to allocate memory of the returned objects
 * @return A pair containing the (stable-order) indices of the distinct rows in the input table
 *         and their corresponding counts
 */
[[nodiscard]] std::pair<std::unique_ptr<rmm::device_uvector<size_type>>, std::unique_ptr<column>>
compute_row_frequencies(table_view const& input,
                        std::optional<column_view> const& partial_counts,
                        rmm::cuda_stream_view stream,
                        rmm::mr::device_memory_resource* mr);

/**
 * @brief Create an empty histogram column.
 *
 * A histogram column is a structs column `STRUCT<T, int64_t>` where T is the type of the input
 * values.
 *
 * @return An empty histogram column
 */
[[nodiscard]] std::unique_ptr<column> make_empty_histogram_like(column_view const& values);

}  // namespace cudf::reduction::detail
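For readers unfamiliar with the contract documented on `compute_row_frequencies`, here is a minimal host-side sketch of the same idea. It is not the cudf implementation: the function name `row_frequencies_sketch` is hypothetical, it handles a single `int` column instead of a `table_view`, and it runs on the CPU with a hash map. It only illustrates the documented behavior, which is to return the indices of the distinct rows in first-occurrence (stable) order together with their counts, and to sum the supplied per-row counts when `partial_counts` is present (the presumed MERGE_HISTOGRAM path).

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical host-side analogue of compute_row_frequencies for one column of
// ints. Illustration only; this is NOT how cudf implements it on the GPU.
std::pair<std::vector<std::size_t>, std::vector<std::int64_t>> row_frequencies_sketch(
  std::vector<int> const& input,
  std::optional<std::vector<std::int64_t>> const& partial_counts = std::nullopt)
{
  std::vector<std::size_t> distinct_indices;     // index of first occurrence of each value
  std::vector<std::int64_t> counts;              // frequency of each distinct value
  std::unordered_map<int, std::size_t> slot_of;  // value -> position in the two outputs

  for (std::size_t i = 0; i < input.size(); ++i) {
    // Without partial counts every row contributes 1; with them, each row
    // contributes its previously computed count.
    std::int64_t const weight = partial_counts ? (*partial_counts)[i] : 1;

    auto const [it, inserted] = slot_of.try_emplace(input[i], distinct_indices.size());
    if (inserted) {
      // First time this value is seen: record its row index to keep stable order.
      distinct_indices.push_back(i);
      counts.push_back(0);
    }
    counts[it->second] += weight;
  }
  return {std::move(distinct_indices), std::move(counts)};
}
```

Calling `row_frequencies_sketch({3, 1, 3, 2, 1, 3})` yields indices `{0, 1, 3}` and counts `{3, 2, 1}`. Passing the values `{3, 1, 3}` with partial counts `{3, 2, 4}` yields indices `{0, 1}` and counts `{7, 2}`.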
Is there a particular order we are following for those aggregation kinds?
I have no idea. I've never seen any defined order in this list.