[WIP] Binary Operators #94

wmalpica · 2018-08-13T19:13:07Z

So far we have implemented the first design concept, which uses nested templating. It supports literals and 14 different binary operators.

We would like feedback on the API implemented, which is:

```
gdf_error gdf_scalar_operation(gdf_column* out, gdf_column* vax, gdf_scalar* vay, gdf_scalar* def, gdf_binary_operator ope);

gdf_error gdf_vector_operation(gdf_column* out, gdf_column* vax, gdf_column* vay, gdf_scalar* def, gdf_binary_operator ope);

enum gdf_binary_operator {
    GDF_ADD,
    GDF_SUB,
    GDF_MUL,
    GDF_DIV,
    GDF_TRUE_DIV,
    GDF_FLOOR_DIV,
    GDF_MOD,
    GDF_POW,
    //GDF_COMBINE,
    //GDF_COMBINE_FIRST,
    //GDF_ROUND,
    GDF_EQUAL,
    GDF_NOT_EQUAL,
    GDF_LESS,
    GDF_GREATER,
    GDF_LESS_EQUAL,
    GDF_GREATER_EQUAL,
    //GDF_PRODUCT,
    //GDF_DOT
};

struct gdf_scalar {
    union gdf_data {
        std::int8_t  si08;
        std::int16_t si16;
        std::int32_t si32;
        std::int64_t si64;
        float        fp32;
        double       fp64;
    };

    gdf_data  data;
    gdf_dtype type;
};
```
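To make the intent of the tagged union easier to discuss, here is a minimal host-side sketch of how a `gdf_scalar` could be read. The `gdf_dtype` enum below is a reduced stand-in (the real one has more members), and `scalar_as_double` is a hypothetical helper that is not part of the proposed API.

```cpp
#include <cstdint>

// Reduced stand-in for the library's gdf_dtype (assumption: the real enum
// has additional members and may order them differently).
enum gdf_dtype { GDF_INT8, GDF_INT16, GDF_INT32, GDF_INT64, GDF_FLOAT32, GDF_FLOAT64 };

// Mirrors the gdf_scalar layout proposed in the description.
struct gdf_scalar {
    union gdf_data {
        std::int8_t  si08;
        std::int16_t si16;
        std::int32_t si32;
        std::int64_t si64;
        float        fp32;
        double       fp64;
    };

    gdf_data  data;
    gdf_dtype type;
};

// Hypothetical helper: read the union member selected by the `type` tag.
inline double scalar_as_double(const gdf_scalar& s) {
    switch (s.type) {
        case GDF_INT8:    return static_cast<double>(s.data.si08);
        case GDF_INT16:   return static_cast<double>(s.data.si16);
        case GDF_INT32:   return static_cast<double>(s.data.si32);
        case GDF_INT64:   return static_cast<double>(s.data.si64);
        case GDF_FLOAT32: return static_cast<double>(s.data.fp32);
        case GDF_FLOAT64: return s.data.fp64;
    }
    return 0.0;  // only reached if the tag holds an unknown value
}
```

The point of the sketch is that `type` must always be set together with the matching `data` member; reading any other member is not meaningful.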

We have already started on a second implementation using NVRTC which uses the same API.

Unit tests have not been written yet; they will be added once an implementation approach has been chosen.

The current implementation only supports a scalar as the second operand; this can easily be extended later.

To compile the binary operation functionality, set the CMake variable "BINARY_OPERATION_VERSION" to the desired version.

Example:
cmake -DCMAKE_BUILD_TYPE=Release -DBINARY_OPERATION_VERSION:STRING=V1 ../../code/libgdf

This CMake switch was added because compilation takes a very long time due to all the functions generated by the templated code. For example, on a laptop with an 8-core i7 processor and 16 GB of RAM:
Time to compile: 77m10.996s
RAM used: more than 13.5 GB
Release binary size: 69,931,016 bytes
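For readers unfamiliar with why nested templating is so expensive to compile, here is a rough sketch of the pattern, not the code in this PR. `DType`, `Add`, `run` and the `dispatch_*` functions are hypothetical names, and only two element types are listed to keep it short. Each runtime type tag is turned into a template parameter by one level of `switch`, so every combination of types becomes its own instantiation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical runtime type tag.
enum class DType { Int32, Float64 };

// Hypothetical operator functor.
struct Add {
    template <typename O, typename X, typename Y>
    static O apply(X x, Y y) { return static_cast<O>(x + y); }
};

// Innermost level: all three element types are known at compile time.
template <typename Out, typename Vax, typename Vay, typename Op>
void run(void* out, const void* vax, const void* vay, std::size_t n) {
    auto* o = static_cast<Out*>(out);
    auto* x = static_cast<const Vax*>(vax);
    auto* y = static_cast<const Vay*>(vay);
    for (std::size_t i = 0; i < n; ++i) o[i] = Op::template apply<Out>(x[i], y[i]);
}

// One switch per operand; each level multiplies the number of instantiations.
template <typename Out, typename Vax, typename Op>
void dispatch_vay(DType vay_t, void* out, const void* vax, const void* vay, std::size_t n) {
    switch (vay_t) {
        case DType::Int32:   run<Out, Vax, std::int32_t, Op>(out, vax, vay, n); break;
        case DType::Float64: run<Out, Vax, double, Op>(out, vax, vay, n); break;
    }
}

template <typename Out, typename Op>
void dispatch_vax(DType vax_t, DType vay_t, void* out, const void* vax, const void* vay, std::size_t n) {
    switch (vax_t) {
        case DType::Int32:   dispatch_vay<Out, std::int32_t, Op>(vay_t, out, vax, vay, n); break;
        case DType::Float64: dispatch_vay<Out, double, Op>(vay_t, out, vax, vay, n); break;
    }
}

template <typename Op>
void dispatch_out(DType out_t, DType vax_t, DType vay_t, void* out, const void* vax, const void* vay, std::size_t n) {
    switch (out_t) {
        case DType::Int32:   dispatch_vax<std::int32_t, Op>(vax_t, vay_t, out, vax, vay, n); break;
        case DType::Float64: dispatch_vax<double, Op>(vax_t, vay_t, out, vax, vay, n); break;
    }
}

// Instantiation count for `types` element types, `args` type-dispatched
// arguments and `ops` operators: ops * types^args.
constexpr std::size_t instantiations(std::size_t types, std::size_t args, std::size_t ops) {
    return args == 0 ? ops : types * instantiations(types, args - 1, ops);
}
```

Under the assumption that the six types in `gdf_data` and the 14 operators are dispatched over three operands, `instantiations(6, 3, 14)` gives 3024 kernels; dispatching the default value's type as a fourth argument raises that to 18144. The actual count in the PR may differ.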

Implemented first approach, multiple dispatch and generic programming.

scopatz · 2018-08-14T15:08:02Z

This has a conflict with master now

The 2nd approach using NVRTC ("Jitify" library). Base version for review.

The 2nd approach using NVRTC ("Jitify" library). Base and stable version of the 2nd approach.

The 2nd approach using NVRTC ("Jitify" library). Added kernels for binary operations with default values.

The 2nd approach using NVRTC ("Jitify" library). Improved kernel implementation. Scalar type will be converted at kernel execution (type defined).

The 2nd approach using NVRTC ("Jitify" library). Added all base types to gdf_dtypes.

The 2nd approach using NVRTC ("Jitify" library). Changed gdf_scalar interface.

The 2nd approach using NVRTC ("Jitify" library). Erased 'valid' field input for kernels without a default value.

The 2nd approach using NVRTC ("Jitify" library). Updated valid field in kernels and added integration tests for all kernels.

Using Jitify utility to choose the optimal block size at kernel calls.

Fixed gtest compilation issues.

Fixed gtest issues

Added shared libraries for tests.

The 2nd approach using NVRTC ("Jitify" library). Erased unused functionality.

The 2nd approach using NVRTC ("Jitify" library). Added binary operations.

harrism

My biggest concern with this PR is that there is a lot of code and basically zero documentation. First, I think it needs high-level documentation of the functionality and organization, and it needs operational documentation for the main APIs being introduced (both external and internal APIs).

src/binary/binary2/kernel.cpp

harrism · 2018-08-28T01:54:56Z

include/gdf/cffi/types.h

@@ -16,6 +20,23 @@ typedef enum {
    N_GDF_TYPES, /* additional types should go BEFORE N_GDF_TYPES */
 } gdf_dtype;

+union gdf_data {


Are the names in this union required to only be four characters for some reason? It would be nice to spell them out better, especially "invd", "tmst", "dt32" and "dt64".

There isn't a particular reason. Is there a naming convention or code style for this project?

Human-readable variable names are good practice for every project.

harrism · 2018-08-28T01:57:02Z

src/binary/binary2/kernel.cpp

+        }
+    }
+
+    template <typename TypeOut, typename TypeVax, typename TypeVay, typename TypeDef, typename TypeOpe>


Wow, TypeDef is a very risky typename, considering its proximity to typedef. You could just use TDef, TVax, etc., or reorder, putting Type last in the names.

No problem, it will be changed.

harrism · 2018-08-28T01:57:47Z

src/binary/binary2/kernel.cpp

+
+    template <typename TypeOut, typename TypeVax, typename TypeVay, typename TypeDef, typename TypeOpe>
+    __global__
+    void kernel_v_s_d(int size, gdf_data def_data,


Should gdf_data on this line be TypeDef?

harrism · 2018-08-28T02:01:55Z

src/binary/binary2/kernel.cpp

+            AbstractOperation<TypeOpe> operation;
+            out_data[i] = operation.template operate<TypeOut, TypeVax, TypeVay>(vax_data_aux, (TypeVay)vay_data);
+
+            __syncwarp();


What is the reason for __syncwarp() here? I don't see any sharing of data between threads. Is it necessary? Same question for all the other __syncwarp() calls.

It's not correct. It'll be removed.

harrism · 2018-08-28T02:10:04Z

src/binary/binary2/launcher.cpp

+        return *this;
+    }
+
+    Launcher& Launcher::instantiate(gdf_column* out, gdf_column* vax, gdf_column* vay, gdf_binary_operator ope) {


Since each of these instantiate functions varies only in the type of one argument, and their bodies are identical, couldn't you template them to reduce redundancy? You might even use variadic templates to collapse all into one instantiate and one launch function.

That's a good point
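A possible shape for that suggestion, as a sketch only: `Column`, `Scalar`, `DType`, `type_name` and this `Launcher` are simplified stand-ins, not the classes in this PR, and no Jitify call is made. Overloads of `type_name` absorb the difference between column and scalar operands, so a single variadic `instantiate` covers every operand combination.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for gdf_dtype, gdf_column and gdf_scalar (assumptions).
enum class DType { Int32, Float32, Float64 };
struct Column { DType dtype; };
struct Scalar { DType type; };

inline std::string name_of(DType t) {
    switch (t) {
        case DType::Int32:   return "int32_t";
        case DType::Float32: return "float";
        case DType::Float64: return "double";
    }
    return "unknown";
}

// One overload per operand kind; this is the only part that differs.
inline std::string type_name(const Column& c) { return name_of(c.dtype); }
inline std::string type_name(const Scalar& s) { return name_of(s.type); }

class Launcher {
public:
    // A single variadic member replaces one overload per operand combination.
    template <typename... Operands>
    Launcher& instantiate(const std::string& op, const Operands&... operands) {
        // Template arguments for the JIT kernel: operand types, then the operator.
        arguments_ = { type_name(operands)..., op };
        return *this;
    }

    const std::vector<std::string>& arguments() const { return arguments_; }

private:
    std::vector<std::string> arguments_;
};
```

With this shape, `launcher.instantiate("Add", out, vax, vay)` works whether `vay` is a `Column` or a `Scalar`, and adding the default-value operand is just one more argument.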

src/binary/binary2/operation.cpp

harrism · 2018-08-28T02:14:23Z

src/binary/binary2/traits.cpp

+R"***(
+#pragma once
+
+    struct IntegralSigned {};


Hmmmm, I believe Jitify already has some support for std::type_traits, e.g. https://github.com/NVIDIA/jitify/blob/master/jitify.hpp#L1124
Do you need to roll your own?

I had some issues with that, which I raised in my daily scrum. It is supposed to work, but it doesn't. I included the headers; however, "common_type" and others fail with error messages from NVRTC.

Let's work with Kate Clark and Ben Barsdell on making Jitify better. Are you in touch with them?

No, I'm not. I remember having issues with the "type_traits" header and with the compiler option "--device-as-default-execution-space", which wasn't working.

I will contact them.
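For reference, this is roughly what the hand-rolled tags would be replaced by if the standard traits were usable. It is plain host-side C++; whether the same headers compile under NVRTC through Jitify is exactly the open question in this thread, so treat it as a sketch. `CommonType` and `add` are illustrative names.

```cpp
#include <cstdint>
#include <type_traits>

// Type a mixed-type operation would be computed in.
template <typename X, typename Y>
using CommonType = typename std::common_type<X, Y>::type;

static_assert(std::is_same<CommonType<std::int32_t, double>, double>::value,
              "int32 combined with double is computed in double");
static_assert(std::is_same<CommonType<std::int8_t, std::int64_t>, std::int64_t>::value,
              "int8 combined with int64 is computed in int64");

// Standard predicates in place of a custom IntegralSigned tag.
static_assert(std::is_integral<std::int16_t>::value && std::is_signed<std::int16_t>::value,
              "int16 is a signed integral type");

// Illustrative use: add two values of different types in their common type.
template <typename X, typename Y>
CommonType<X, Y> add(X x, Y y) {
    return static_cast<CommonType<X, Y>>(x) + static_cast<CommonType<X, Y>>(y);
}
```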

harrism · 2018-08-28T02:17:46Z

src/tests/binary-operation/library/operation.h

+            return (TypeOut)pow((double)vax, (double)vay);
+        }
+    };
+}


It's nice in this situation (closing brackets far from their openings) to comment what is being closed, e.g.:

} // namespace operation
} // namespace test
} // namespace gdf

Documentation is important. I was focused on implementation.

The 2nd approach using NVRTC ("Jitify" library). Erased syncwarp primitive.


Changed submodule "Jitify" to https protocol.

Added CUDA configuration for NVRTC library.

Added 'stubs' directory in cmake for CUDA library.

Changed default string variable for selection of binary operations.

Added google benchmark for binary operations - NVRTC implementation.

Changed interface for cffi python.

Solved issues related to other tests due to a fix in the storage duration of Jitify cache.

Added 'stubs' directory in the library path (python) for NVRTC and CUDA libraries.

Improved vector library code.

Fixed issue related to inputs verification.

nsakharnykh

We did a live PR review with @jrhemstad @williamBlazing and @ironbit - requesting more changes. Also, we need the python bindings updated to use the new binary ops API: @mt-jones, can you post an update on where we are with this?

nsakharnykh · 2018-09-24T20:44:21Z

cmake/Templates/GoogleBenchmark.CMakeLists.txt.cmake

@@ -0,0 +1,36 @@
+#=============================================================================


Can we use this for CI? @mike-wendt

nsakharnykh · 2018-09-24T20:46:06Z

include/gdf/cffi/types.h

+    GDF_GREATER,
+    GDF_LESS_EQUAL,
+    GDF_GREATER_EQUAL,
+    //GDF_COMBINE,


Add a description/comment explaining why those are not implemented.

Or, preferably, please remove.

nsakharnykh · 2018-09-24T20:48:48Z

src/binary/CMakeLists.txt

+message(STATUS "Binary Operation Version: V2 - NVRTC")
+
+set(source_files
+    src/binary/binary2/binary.cpp


Rename binary2 folder to binary_nvrtc or binary_jitify

The source code has been reorganized, so I think it's convenient to reply to every observation.
https://github.com/BlazingDB/libgdf/tree/binary-operators-draft/src/binary-operation/jit

nsakharnykh · 2018-09-24T20:51:09Z

src/binary/binary2/kernel.cpp

+
+    template <typename TypeOut, typename TypeVax, typename TypeVay, typename TypeOpe>
+    __global__
+    void kernel_v_s(int size, TypeOut* out_data, TypeVax* vax_data, gdf_data vay_data) {


We need kernel_s_v as well. There should be 6 combinations overall, one kernel per top-level GDF function.

Implemented the remaining scalar-vector operations.
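A small CPU reference shows why a separate scalar-vector variant is needed rather than reusing vector-scalar with swapped arguments: for non-commutative operators the operand order changes the result. `Sub`, `op_v_s` and `op_s_v` are hypothetical names for illustration, not the kernels in this PR.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical non-commutative operator.
struct Sub {
    template <typename T>
    static T apply(T x, T y) { return x - y; }
};

// vector (op) scalar: the scalar is the right-hand operand.
template <typename Op, typename T>
std::vector<T> op_v_s(const std::vector<T>& vax, T vay) {
    std::vector<T> out(vax.size());
    for (std::size_t i = 0; i < vax.size(); ++i) out[i] = Op::apply(vax[i], vay);
    return out;
}

// scalar (op) vector: the scalar is the left-hand operand.
template <typename Op, typename T>
std::vector<T> op_s_v(T vax, const std::vector<T>& vay) {
    std::vector<T> out(vay.size());
    for (std::size_t i = 0; i < vay.size(); ++i) out[i] = Op::apply(vax, vay[i]);
    return out;
}
```

With the vector `{5, 7}` and the scalar `2`, `op_v_s<Sub>` yields `{3, 5}` while `op_s_v<Sub>` yields `{-3, -5}`.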

nsakharnykh · 2018-09-24T20:53:50Z

src/binary/binary2/launcher.cpp

+#include "binary/binary2/cuda.h"
+
+namespace gdf {
+    static thread_local jitify::JitCache JitCache;


We should be able to write the JitCache to a file when the process exits, then use an environment variable to point to that file on startup, so that the cache is shared between different processes.
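The environment-variable part of that idea could look like the sketch below. The variable name `LIBGDF_JIT_CACHE_PATH` and the function `jit_cache_path` are made up for illustration; nothing here touches the Jitify cache itself, and how the cache would be serialized is left open.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical variable name; the PR does not define one.
constexpr const char* kJitCachePathEnv = "LIBGDF_JIT_CACHE_PATH";

// Returns the configured cache file path, or an empty string when the
// variable is unset, meaning the cache stays in-process only.
inline std::string jit_cache_path() {
    const char* value = std::getenv(kJitCachePathEnv);
    return (value != nullptr) ? std::string(value) : std::string();
}
```

An empty result keeps today's behavior, so persistence would be strictly opt-in.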

nsakharnykh · 2018-09-24T20:57:46Z

src/binary/binary2/operation.cpp

+        }
+    };
+/*
+    struct Add : public AbstractOperation<Add> {


We should add a comment describing why this could be useful, e.g. for overflows, in case we ever want to enable this "feature" in the future.

The source code has been optimized for JIT.
The commented-out code has been moved outside the JIT code and a description has been added.
https://github.com/BlazingDB/libgdf/blob/96012d91aff817e6105a8fa8983e1587d237493f/src/binary-operation/jit/code/operation.cpp#L159

nsakharnykh · 2018-09-24T21:01:21Z

src/library/field.h

@@ -0,0 +1,109 @@
+/*


Move library under bench/binary or tests/binary since it's only used in tests, probably also rename to something like binary_helper

https://github.com/BlazingDB/libgdf/tree/binary-operators-draft/src/tests/binary-operation/util

nsakharnykh · 2018-09-24T21:05:01Z

src/bench/binary-operation-benchmark.cpp

@@ -0,0 +1,203 @@
+/*


Needs to be moved to src/bench/binary to have a folder per libgdf operator (one folder for binary, one for parquet, one for joins, etc.)

https://github.com/BlazingDB/libgdf/tree/binary-operators-draft/src/bench/binary-operation

nsakharnykh · 2018-09-24T21:12:37Z

src/binary/binary2/kernel.cpp

+
+        for (int i=start; i<size; i+=step) {
+            AbstractOperation<TypeOpe> operation;
+            out_data[i] = operation.template operate<TypeOut, TypeVax, TypeVay>(vax_data[i], (TypeVay)vay_data);


We need to set the output valid bit mask - I don't see it being handled in the code. It should be an OR between the two bit masks of the two input operands.

The output valid bitmask is now processed in all kernels.
The kernels have also been optimized using 'Bit Hacks'. In the benchmarks, the kernel (v_v_v_d) is approximately 5 µs faster in all cases, while the kernel (v_v_v) is approximately 7 µs slower due to the bitmask processing.
https://github.com/BlazingDB/libgdf/blob/729cbfac6ae2281894b99644c58643184bd18d85/src/binary-operation/jit/code/kernel.cpp#L82
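As a host-side illustration of the bitmask handling discussed here, the sketch below combines two validity masks one word at a time. It assumes a set bit means the row is valid and that bits are stored least-significant first within 32-bit words. Under that assumption, propagating nulls is a bitwise AND of the valid masks (equivalently, an OR of the null masks); which convention the kernels in this PR follow is not confirmed here. `ValidWord`, `combine_valid` and `is_valid` are illustrative names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using ValidWord = std::uint32_t;

// Output row is valid only when both input rows are valid.
// Assumes both masks cover the same number of rows.
inline std::vector<ValidWord> combine_valid(const std::vector<ValidWord>& a,
                                            const std::vector<ValidWord>& b) {
    std::vector<ValidWord> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] & b[i];
    return out;
}

// Reads bit `row` of the mask, least-significant bit first within each word.
inline bool is_valid(const std::vector<ValidWord>& mask, std::size_t row) {
    return ((mask[row / 32] >> (row % 32)) & 1u) != 0;
}
```

Working on whole words rather than single bits means one operation covers 32 rows, which is the kind of saving the bitmask optimizations aim for.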

Reorganized the source code.

Optimized the source code in Jit.

Added namespace documentation.

scopatz · 2018-09-26T16:54:48Z

src/binary-operation/jit/util/type.cpp

+                return "Pow";
+            //case GDF_COMBINE:
+            //case GDF_COMBINE_FIRST:
+            //case GDF_ROUND:


Are these simply not implemented? Probably should remove or make a comment

Moved 'util' library used in tests and benchmark.

Optimized Jit kernel code (bit hacks) and added the output valid bitmask in some kernels.

Added documentation related to the new interface. Changed the interface ('valid' field in gdf_scalar).

Added the changes for scalar vector operations.

Added the standard header for math functions.

Added a 'libcuda' soft link.


Binary Operators

b92f977

Implemented first approach, multiple dispatch and generic programming.

kkraus14 added the "2 - In Progress" (Currently a work in progress) label Aug 14, 2018

Christian Noboa Mardini and others added 14 commits August 17, 2018 15:17

Binary Operations

40577cb

The 2nd approach using NVRTC ("Jitify" library). Base version for review.

Binary Operations

916b695

The 2nd approach using NVRTC ("Jitify" library). Base and stable version of the 2nd approach.

Binary Operations

cf20eef

The 2nd approach using NVRTC ("Jitify" library). Added kernels for binary operations with default values.

Binary Operations

144aa49

The 2nd approach using NVRTC ("Jitify" library). Improved kernel implementation. Scalar type will be converted at kernel execution (type defined).

Binary Operations

8b228bb

The 2nd approach using NVRTC ("Jitify" library). Added all base types to gdf_dtypes.

Binary Operations

3717b4f

The 2nd approach using NVRTC ("Jitify" library). Changed gdf_scalar interface.

Binary Operations

a24d15f

The 2nd approach using NVRTC ("Jitify" library). Erased 'valid' field input for kernels without a default value.

Binary Operations

2e384e6

The 2nd approach using NVRTC ("Jitify" library). Updated valid field in kernels and added integration tests for all kernels.

Binary Operations

a11dc77

Using Jitify utility to choose the optimal block size at kernel calls.

Binary Operations

0097521

Fixed gtest compilation issues.

Binary Operations

88d3c45

Fixed gtest issues

Binary Operations

280f52a

Added shared libraries for tests.

Binary Operations

fc9efab

The 2nd approach using NVRTC ("Jitify" library). Erased unused functionality.

Binary Operations

c397a7e

The 2nd approach using NVRTC ("Jitify" library). Added binary operations.

harrism suggested changes Aug 28, 2018


ironbit and others added 12 commits August 28, 2018 16:33

Merge branch 'master' into binary-operators-draft

1ac172d

Binary Operations

335485e

The 2nd approach using NVRTC ("Jitify" library). Erased syncwarp primitive.

Merge branch 'binary-operators-draft' of github.com:BlazingDB/libgdf …

d112885

…into binary-operators-draft

Binary Operations

6145569

Changed submodule "Jitify" to https protocol.

Binary Operations

d311a81

Added CUDA configuration for NVRTC library.

Binary Operations

74c0b68

Added 'stubs' directory in cmake for CUDA library.

Binary Operations

d0e4dfe

Changed default string variable for selection of binary operations.

Binary Operations

891e2d9

Added google benchmark for binary operations - NVRTC implementation.

Binary Operations

3a18c62

Changed interface for cffi python.

Binary Operations

ba15ada

Solved issues related to other tests due to a fix in the storage duration of Jitify cache.

Binary Operations

191e549

Added 'stubs' directory in the library path (python) for NVRTC and CUDA libraries.

Binary Operations

6061920

Improved vector library code.

Binary Operations

47b0905

Fixed issue related to inputs verification.

nsakharnykh suggested changes Sep 24, 2018


Christian Noboa Mardini added 3 commits September 25, 2018 12:33

Binary Operations

9ba4b73

Reorganized the source code.

Binary Operations

6f6d804

Optimized the source code in Jit.

Binary Operations

96012d9

Added namespace documentation.

scopatz reviewed Sep 26, 2018


kkraus14 mentioned this pull request Sep 26, 2018

Dask gdf fixes rapidsai/cudf#262

Merged

Christian Noboa Mardini and others added 11 commits September 27, 2018 12:28

Binary Operations

5773a58

Moved 'util' library used in tests and benchmark.

Merge branch 'master' into binary-operators-draft

55c5fd2

Binary Operations

10dff5d

Optimized Jit kernel code (bit hacks) and added the output valid bitmask in some kernels.

Merge branch 'master' into binary-operators-draft

729cbfa

Binary Operations

a3cbde8

Added documentation related to the new interface. Changed the interface ('valid' field in gdf_scalar).

Merge branch 'master' into binary-operators-draft

59b5ba6

Merge branch 'master' into binary-operators-draft

36241ec

Binary Operations

1c521a1

Added the changes for scalar vector operations.

Binary Operations

b6f5cce

Added the standard header for math functions.

Binary Operations

a166c7c

Added a 'libcuda' soft link.

new order by api

3a74ff8

mike-wendt added the "4 - Needs Rework" (Additional work is needed) and "migrate to cudf" labels and removed the "2 - In Progress" (Currently a work in progress) label Oct 24, 2018

aocsa and others added 7 commits October 24, 2018 17:45

[binary-operators-draft]: udpate get_column_byte_width to support uns…

f7c3d4f

…igned types

[binary-operator-draft] update gpu_apply_stencil to work with least-s…

cc75849

…ignificant bit format

fixed some issues with binary ops not being able to output if the spa…

2c356d0

…ce was too large

made a small fix to the sorting code so it gets the size properly, it…

d0a76c1

… ws working before but could have had weird errors at sizes above 1bn records

removed some couts

dfa27b8

removed some couts

e47aebe

TODO use a fixed compute capability

9722f0e

mike-wendt mentioned this pull request Dec 20, 2018

[libgdf-PR-94] Binary Operators rapidsai/cudf#593

Closed

devavret mentioned this pull request Feb 7, 2019

[REVIEW] Jitify versions of binaryops for non-homogeneous types rapidsai/cudf#892

Merged

5 tasks
