[REVIEW] Replace function #106
base: master
Can one of the admins verify this patch?
ok to test
API seems reasonable to me
@@ -0,0 +1,110 @@
/*
This should be moved to a separate `bench` folder as we discussed in the binary ops PR review?
And separated into benchmarks and unit tests?
done
@@ -0,0 +1,143 @@
/*
Should add more documentation/comments to the tests
done
@nsakharnykh Can you approve your review since this is resolved?
…s to replace-test
…s to replace-test
Consider changing the function name or at least improving the documentation of what it does. One other minor suggestion.
include/gdf/cffi/functions.h
Outdated
/// \param[in] values contains the replacement values
///
/// Note that `to_replace` and `values` are related by the index
gdf_error gdf_replace(gdf_column * column,
The interface is confusing to me. Why does it have `column` and `to_replace` parameters? What happens to `column`? What happens to `to_replace`? Why not just make a `gdf_copy(out_column, in_column)`?
Now I see. The meaning of "replace" is not clear. It's not a copy. The semantics are actually those of "find_and_replace_all": for each value in `to_replace`, find all instances of that value in `column` and replace each one with the corresponding value in `values`. This should be made clear in the header documentation for the function. Consider changing the name to `gdf_find_and_replace_all()` or something like that...
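To make the "find and replace all" semantics concrete, here is a minimal host-side sketch in C++. It is not the libgdf implementation: the function name `find_and_replace_all`, the use of `std::vector` instead of `gdf_column`, and the choices to modify `column` in place and to replace each element at most once are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, NOT the libgdf implementation.
// For every element of `column`, look for the first equal entry in
// `to_replace` and overwrite the element with the entry at the same
// index in `values`. `to_replace` and `values` are paired by index.
template <typename T>
void find_and_replace_all(std::vector<T>& column,
                          const std::vector<T>& to_replace,
                          const std::vector<T>& values)
{
  // Assumption: both lookup arrays must have the same length.
  assert(to_replace.size() == values.size());

  for (T& element : column) {
    for (std::size_t i = 0; i < to_replace.size(); ++i) {
      if (element == to_replace[i]) {
        element = values[i];
        break;  // assumption: each element is replaced at most once
      }
    }
  }
}
```

Under these assumptions, calling it on a column `{1, 2, 3, 2, 5}` with `to_replace = {2, 5}` and `values = {20, 50}` rewrites both occurrences of 2 and the single 5, leaving the other elements untouched.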
@@ -0,0 +1,43 @@
We're adding google benchmark in this PR? Shouldn't that be orthogonal to `gdf_replace`, and therefore a separate PR?
Since this is the first benchmark I've seen added to libgdf...
Since no other PR depends on this, I put google benchmark here. But I could make another PR if it's relevant.
src/replace.cu
Outdated
if (NotSameDType(column, to_replace, values)) { return GDF_CUDA_ERROR; }

switch (column->dtype) {
#define WHEN(DTYPE) \
I don't really like calling this macro "WHEN" -- it's not descriptive. Something like `REPLACE_CASE` would be clearer. Then you would have
    switch (column->dtype) {
      REPLACE_CASE(INT8);
      REPLACE_CASE(INT16);
      // etc.
    }
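The macro suggestion above could look roughly like the following self-contained C++ sketch. Everything prefixed with `sketch_` or `SKETCH_`, the `type_*` aliases, and the `replace_typed` helper are hypothetical stand-ins declared here so the example compiles on its own; they are not libgdf's `gdf_column`, `gdf_dtype`, or `gdf_error`, and the loop runs on the host rather than in a CUDA kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for libgdf's enums and column struct.
enum sketch_dtype { SKETCH_INT8, SKETCH_INT16, SKETCH_INT32, SKETCH_FLOAT32 };
enum sketch_error { SKETCH_SUCCESS, SKETCH_UNSUPPORTED_DTYPE };

struct sketch_column {
  void*        data;
  std::size_t  size;
  sketch_dtype dtype;
};

// Aliases so REPLACE_CASE(INT8) can paste together both the enum
// value (SKETCH_INT8) and the element type (type_INT8).
using type_INT8  = std::int8_t;
using type_INT16 = std::int16_t;
using type_INT32 = std::int32_t;

// Typed worker: replaces each element equal to to_replace[i] with values[i].
template <typename T>
sketch_error replace_typed(sketch_column* column, const void* to_replace,
                           const void* values, std::size_t count)
{
  T*       data = static_cast<T*>(column->data);
  const T* from = static_cast<const T*>(to_replace);
  const T* to   = static_cast<const T*>(values);
  for (std::size_t row = 0; row < column->size; ++row) {
    for (std::size_t i = 0; i < count; ++i) {
      if (data[row] == from[i]) { data[row] = to[i]; break; }
    }
  }
  return SKETCH_SUCCESS;
}

// Dispatch on the runtime dtype using a descriptively named case macro.
sketch_error sketch_replace(sketch_column* column, const void* to_replace,
                            const void* values, std::size_t count)
{
  assert(column != nullptr);
  switch (column->dtype) {
#define REPLACE_CASE(DTYPE) \
  case SKETCH_##DTYPE:      \
    return replace_typed<type_##DTYPE>(column, to_replace, values, count)
    REPLACE_CASE(INT8);
    REPLACE_CASE(INT16);
    REPLACE_CASE(INT32);
#undef REPLACE_CASE
    default: return SKETCH_UNSUPPORTED_DTYPE;  // dtype with no case above
  }
}
```

Defining the macro immediately before its uses and removing it with `#undef` right after keeps the short name from leaking into the rest of the translation unit, which is one reason a descriptive name matters less for collisions and more for readability.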
API proposal for the replace function.