Reorganize /opencl and add missing matrix_cl overloads #1364

rok-cesnovar · 2019-09-21T11:26:58Z

Summary

This PR reorganizes the /opencl folder:

matrix_cl overloads of /prim/mat functions go to /opencl/prim
core OpenCL code and functions that are not overloads of /prim functions stay in /opencl
create a non-inplace cholesky decompose (the in-place one is moved to ::opencl)
minor doxygen cleanup

It also creates the following signatures for matrix_cl overloads by reusing already merged OpenCL code:

cholesky_decompose in /prim has the form A = cholesky_decompose(B); the one in /opencl is moved to the opencl:: namespace and still works in place
mdivide_left_tri_low(A,b)
mdivide_left_tri_low(A)
mdivide_right_tri_low(b,A)
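
For readers unfamiliar with these signatures, the following is a minimal CPU-only sketch of what the three overloads compute. It is not the Stan Math implementation (which works on matrix_cl and, per the diff further down, goes through tri_inverse); the `Mat` type and every `*_sketch` name are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix, for illustration only (not matrix_cl).
struct Mat {
  std::size_t rows, cols;
  std::vector<double> v;
  Mat(std::size_t r, std::size_t c) : rows(r), cols(c), v(r * c, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return v[i * cols + j]; }
  double operator()(std::size_t i, std::size_t j) const {
    return v[i * cols + j];
  }
};

// Two-argument form: solves A * X = B by forward substitution,
// reading only the lower triangle of A.
Mat mdivide_left_tri_low_sketch(const Mat& A, const Mat& B) {
  assert(A.rows == A.cols && A.cols == B.rows);
  Mat X(B.rows, B.cols);
  for (std::size_t c = 0; c < B.cols; ++c) {
    for (std::size_t i = 0; i < A.rows; ++i) {
      double s = B(i, c);
      for (std::size_t k = 0; k < i; ++k) {
        s -= A(i, k) * X(k, c);
      }
      X(i, c) = s / A(i, i);
    }
  }
  return X;
}

// One-argument form: the inverse of A, obtained by solving against identity.
Mat mdivide_left_tri_low_sketch(const Mat& A) {
  Mat I(A.rows, A.cols);
  for (std::size_t i = 0; i < A.rows; ++i) {
    I(i, i) = 1.0;
  }
  return mdivide_left_tri_low_sketch(A, I);
}

// Plain matrix product, used by the right-division sketch below.
Mat multiply_sketch(const Mat& L, const Mat& R) {
  assert(L.cols == R.rows);
  Mat C(L.rows, R.cols);
  for (std::size_t i = 0; i < L.rows; ++i)
    for (std::size_t j = 0; j < R.cols; ++j)
      for (std::size_t k = 0; k < L.cols; ++k) C(i, j) += L(i, k) * R(k, j);
  return C;
}

// Right division: b * inverse(A) for lower-triangular A.
Mat mdivide_right_tri_low_sketch(const Mat& b, const Mat& A) {
  return multiply_sketch(b, mdivide_left_tri_low_sketch(A));
}
```

In this sketch the argument order mirrors the list above: the triangular matrix comes first for left division and second for right division.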

After this PR gets merged, we will have the matrix_cl overloads listed here ready.

This PR also fixes a bug in the matrix_cl copy assignment that was introduced with the removal of const qualifiers: the buffer size was not updated on copy assignment.

Tests

The following tests are added in opencl/prim:

non-inplace cholesky (exceptions tests, zero sized matrix, ...)
mdivide_left_tri_low
mdivide_right_tri_low

Side Effects

/

Checklist

Math issue Reorganize the /opencl folder #1352
Copyright holder: Rok Češnovar (Univ. of Ljubljana)
the basic tests are passing
the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested

stan-buildbot · 2019-09-21T16:27:06Z

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.98)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.01)
(performance.compilation, 1.03)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.01)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 1.00396939091
Commit hash: 72f88a1

rok-cesnovar

@SteveBronder @t4c1 This is now ready for review. I would like both of you to take a look. The changes are mostly moves, plus a few added tests for the new function signatures.

rok-cesnovar · 2019-09-28T17:46:07Z

stan/math/opencl/matrix_cl.hpp

@@ -424,6 +424,8 @@ class matrix_cl<T, enable_if_arithmetic<T>> {
    rows_ = a.rows();
    cols_ = a.cols();
    this->wait_for_read_write_events();
+    cl::Context& ctx = opencl_context.context();
+    buffer_cl_ = cl::Buffer(ctx, CL_MEM_READ_WRITE, sizeof(T) * a.size());


This was to address the bug that was introduced with the removal of const from rows/cols.
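
A minimal sketch of why the reallocation matters, using hypothetical stand-ins (`FakeBuffer`, `MatrixSketch`) rather than the real cl::Buffer and matrix_cl. Once rows and cols can change on assignment, the device buffer has to be resized to match the source before any data is copied; otherwise the destination keeps its old allocation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cl::Buffer: it only records its allocated size.
struct FakeBuffer {
  std::size_t bytes = 0;
  FakeBuffer() = default;
  explicit FakeBuffer(std::size_t n) : bytes(n) {}
};

// Hypothetical stand-in for matrix_cl<double>.
class MatrixSketch {
 public:
  MatrixSketch(int rows, int cols)
      : rows_(rows),
        cols_(cols),
        buffer_(sizeof(double) * static_cast<std::size_t>(rows * cols)) {
    assert(rows >= 0 && cols >= 0);
  }

  MatrixSketch& operator=(const MatrixSketch& a) {
    rows_ = a.rows();
    cols_ = a.cols();
    // The fix: allocate a buffer that matches the new dimensions.
    // Without this line the old (possibly smaller) buffer would be kept.
    buffer_ = FakeBuffer(sizeof(double) * static_cast<std::size_t>(a.size()));
    // A real implementation would enqueue the device-side copy here.
    return *this;
  }

  int rows() const { return rows_; }
  int cols() const { return cols_; }
  int size() const { return rows_ * cols_; }
  std::size_t buffer_bytes() const { return buffer_.bytes; }

 private:
  int rows_;
  int cols_;
  FakeBuffer buffer_;
};
```

Assigning a 3x3 `MatrixSketch` to a 1x1 one leaves the destination with a buffer sized for nine doubles, which is the invariant the two added lines in the diff restore.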

rok-cesnovar · 2019-09-28T17:46:45Z

stan/math/opencl/multiply.hpp

@@ -102,92 +102,7 @@ inline matrix_cl<return_type_t<T1, T2>> multiply(const matrix_cl<T1>& A,
  return temp;
 }
 }  // namespace opencl
-


This section was moved to /prim. The actual implementation that was already in opencl:: was left in /opencl.

rok-cesnovar · 2019-09-28T18:03:29Z

stan/math/opencl/prim/mdivide_left_tri_low.hpp

+    const matrix_cl<T1>& A, const matrix_cl<T2>& b) {
+  check_square("mdivide_left_tri_low", "A", A);
+  check_multiplicable("mdivide_left_tri_low", "A", A, "b", b);
+  return tri_inverse<matrix_cl_view::Lower>(A) * b;


To force the Lower view, I had to extend tri_inverse with a template parameter.

The other option would be to force a change of A before calling tri_inverse but that changes the view for A globally, which is bad.

The third option would be to have the "forced" view as an argument to tri_inverse. I am also fine with that.

I like the option you chose.
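
The chosen option can be sketched roughly as follows. This is an illustration under assumptions, not the actual tri_inverse: `ViewSketch`, `MatSketch`, and `tri_inverse_sketch` are hypothetical names, the default template argument is assumed to mean "use the view stored in A", and only the lower-triangular case is implemented.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical counterpart of matrix_cl_view.
enum class ViewSketch { Diagonal, Lower, Upper, Entire };

// Hypothetical square matrix that carries a view tag, like matrix_cl does.
struct MatSketch {
  std::size_t n;
  std::vector<double> v;
  ViewSketch view;
  MatSketch(std::size_t n_, ViewSketch view_)
      : n(n_), v(n_ * n_, 0.0), view(view_) {}
  double& operator()(std::size_t i, std::size_t j) { return v[i * n + j]; }
  double operator()(std::size_t i, std::size_t j) const { return v[i * n + j]; }
};

// The template parameter overrides the view for this call only.
// A is taken by const reference, so its own view tag is never modified.
template <ViewSketch ForcedView = ViewSketch::Entire>
MatSketch tri_inverse_sketch(const MatSketch& A) {
  // Assumption: Entire as the template argument means "fall back to A.view".
  const ViewSketch view
      = (ForcedView == ViewSketch::Entire) ? A.view : ForcedView;
  assert(view == ViewSketch::Lower);  // this sketch handles only Lower

  MatSketch X(A.n, ViewSketch::Lower);
  // Forward substitution against each column of the identity,
  // reading only the lower triangle of A.
  for (std::size_t c = 0; c < A.n; ++c) {
    for (std::size_t i = c; i < A.n; ++i) {
      double s = (i == c) ? 1.0 : 0.0;
      for (std::size_t k = c; k < i; ++k) {
        s -= A(i, k) * X(k, c);
      }
      X(i, c) = s / A(i, i);
    }
  }
  return X;
}
```

Calling `tri_inverse_sketch<ViewSketch::Lower>(A)` on a matrix tagged Entire ignores whatever is stored above the diagonal and leaves `A.view` untouched, which is the property that made the second option (mutating A's view) unattractive.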

rok-cesnovar · 2019-09-28T18:04:04Z

stan/math/opencl/prim/multiply.hpp

@@ -0,0 +1,135 @@
+#ifndef STAN_MATH_OPENCL_PRIM_MULTIPLY_HPP
+#define STAN_MATH_OPENCL_PRIM_MULTIPLY_HPP


This code was just moved from /opencl/multiply.hpp

rok-cesnovar · 2019-09-28T18:05:55Z

test/unit/math/opencl/prim/cholesky_decompose_test.cpp

+  for (int i = 0; i < A.size(); i++)    \
+    EXPECT_NEAR(A(i), B(i), DELTA);
+
+TEST(MathMatrix, cholesky_decompose_cl_expections) {


These are the same tests as for the in-place cholesky, just on the non-inplace one.
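
The relationship between the two variants can be sketched on the CPU as a thin wrapper: the non-inplace version copies its input and hands the copy to the in-place routine. All names below (`SquareSketch`, `opencl_sketch`, `cholesky_decompose_sketch`) are hypothetical, and the factorization loop is a textbook Cholesky, not the OpenCL kernel code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical square row-major matrix, for illustration only.
struct SquareSketch {
  std::size_t n;
  std::vector<double> v;
  explicit SquareSketch(std::size_t n_) : n(n_), v(n_ * n_, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return v[i * n + j]; }
  double operator()(std::size_t i, std::size_t j) const { return v[i * n + j]; }
};

namespace opencl_sketch {
// In-place variant: overwrites A with its lower Cholesky factor L
// (A = L * L^T) and zeroes the strict upper triangle.
inline void cholesky_decompose(SquareSketch& A) {
  for (std::size_t j = 0; j < A.n; ++j) {
    double d = A(j, j);
    for (std::size_t k = 0; k < j; ++k) d -= A(j, k) * A(j, k);
    assert(d > 0.0);  // input must be positive definite
    A(j, j) = std::sqrt(d);
    for (std::size_t i = j + 1; i < A.n; ++i) {
      double s = A(i, j);
      for (std::size_t k = 0; k < j; ++k) s -= A(i, k) * A(j, k);
      A(i, j) = s / A(j, j);
    }
    for (std::size_t c = j + 1; c < A.n; ++c) A(j, c) = 0.0;
  }
}
}  // namespace opencl_sketch

// Non-inplace variant: the caller's matrix is left untouched.
inline SquareSketch cholesky_decompose_sketch(const SquareSketch& B) {
  SquareSketch A = B;                    // work on a copy
  opencl_sketch::cholesky_decompose(A);  // reuse the in-place routine
  return A;
}
```

Because the wrapper adds only a copy, the same exception and edge-case tests apply to both variants, which is consistent with the comment above.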

rok-cesnovar · 2019-09-28T18:06:55Z

test/unit/math/opencl/prim/mdivide_left_tri_low_test.cpp

+#include <gtest/gtest.h>
+#include <algorithm>
+
+#define EXPECT_MATRIX_NEAR(A, B, DELTA) \


These tests and the mdivide_right_tri_low tests follow the same idea as the OpenCL tests we have for math/prim/mdivide_left_tri.

stan-buildbot · 2019-09-29T11:33:30Z

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.96)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.01)
(performance.compilation, 1.04)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.02)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 1.00016837813
Commit hash: 197977f

t4c1

Looks good! Just one question.

stan/math/opencl/prim/cholesky_decompose.hpp

t4c1 · 2019-10-07T07:34:44Z

stan/math/opencl/prim/mdivide_left_tri_low.hpp

+    const matrix_cl<T1>& A, const matrix_cl<T2>& b) {
+  check_square("mdivide_left_tri_low", "A", A);
+  check_multiplicable("mdivide_left_tri_low", "A", A, "b", b);
+  return tri_inverse<matrix_cl_view::Lower>(A) * b;


I like the option you chose.

rok-cesnovar · 2019-10-10T11:33:37Z

This is ready for a re-review, but it is not critical for 2.21. @SteveBronder you are probably busy these days, so take a look whenever you have time. I don't want to merge this one without your input.

SteveBronder · 2019-10-10T18:53:27Z

I'll try to take a look tomorrow; if not, then Monday.

stan-buildbot · 2019-10-10T22:23:12Z

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.95)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 0.99)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.96)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 0.94)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.99)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.94)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 0.98330278973
Commit hash: fdd62f2

SteveBronder

Cool!

One thing I'm thinking about, maybe it's time we put the OpenCL prim functions inside of their respective prim files. Then we just keep in opencl the things that are not directly exposed to users.

We can do that another day though

rok-cesnovar · 2019-10-11T15:16:03Z

I would do that once we add compiler support to use them directly. I will work on that in the coming weeks. Then we can finally remove the ifdefed code from the CPU /prim files.

And then I would be fine moving /opencl/prim files to /prim/opencl.

rok-cesnovar added 14 commits September 20, 2019 20:03

moved add and subtract

f60a88d

moved multiply

077bec2

Merge branch 'develop' into refactor/match-opencl-function-signatures

29146a0

fix header path

798f661

fixed assignment bug

912c05c

moved divide_columns and gp_exp_quad_cov

88ee0c4

moved transpose and multiply, inplace cholesky is now in opencl:: namespace

a2464ac

cholesky moved

b1cd81d

doxy for multiply cleanup

0b4d1ea

Merge branch 'develop' into refactor/opencl-function-signatures-and-locations # Conflicts: # stan/math/opencl/normal_id_glm_lpdf.hpp

0e084a2

clang format

b36103f

doxygen fix

de53951

fix include path

72f88a1

simplify prim/mat/cholesky call

40be0f0

rok-cesnovar and others added 7 commits September 25, 2019 20:26

moving mdivide_left_tri

675df84

Merge branch 'develop' into refactor/opencl-function-signatures-and-locations

4e5927d

[Jenkins] auto-formatting by clang-format version 5.0.0-3~16.04.1 (tags/RELEASE_500/final)

522d83b

added mdivide_right_tri_low and mdivide_left_tri_low

7a8f326

removed the .hpp test file

b416f74

added comments and fixed typos

af84a9e

move GLMs

aca4aca

rok-cesnovar changed the title ~~[WIP] Reorganize /opencl and add missing matrix_cl overloads~~ Reorganize /opencl and add missing matrix_cl overloads Sep 28, 2019

cleaned up headers and added guards

3c57a14

rok-cesnovar commented Sep 28, 2019

rok-cesnovar added 4 commits September 28, 2019 20:11

added newlines to files

d097a9e

Merge branch 'develop' into refactor/opencl-function-signatures-and-locations

d4d52ff

added missing PRIM in guards, moved new GLMs, added missing sum include

75c83f6

fixed header

197977f

t4c1 reviewed Oct 7, 2019

rok-cesnovar added 2 commits October 10, 2019 13:08

Merge branch 'develop' into refactor/opencl-function-signatures-and-locations # Conflicts: # stan/math/opencl/matrix_cl.hpp # stan/math/opencl/multiply.hpp # stan/math/opencl/opencl.hpp # stan/math/opencl/tri_inverse.hpp

d69b112

added new requires, resolved review comments

9bc190e

returned back 2 headers

fdd62f2

SteveBronder approved these changes Oct 11, 2019

rok-cesnovar merged commit 262e205 into stan-dev:develop Oct 11, 2019

rok-cesnovar deleted the refactor/opencl-function-signatures-and-locations branch October 11, 2019 15:18

rok-cesnovar mentioned this pull request Oct 11, 2019

bugfix intercept only GLMs #1399

Merged

5 tasks

serban-nicusor-toptal added this to the 3.0.0 milestone Oct 18, 2019
