Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorganize /opencl and add missing matrix_cl overloads #1364

Conversation

rok-cesnovar
Copy link
Member

@rok-cesnovar rok-cesnovar commented Sep 21, 2019

Summary

Fixes #1352

This PR reorganizes the /opencl folder:

  • matrix_cl overloads of /prim/mat functions go to /opencl/prim
  • core OpenCL code and functions that are not overloads of /prim functions stay in /opencl
  • create a non-inplace cholesky decompose (the in-place one is moved to ::opencl)
  • minor doxygen cleanup

It also creates the following signatures for matrix_cl overloads by reusing already merged OpenCL code:

  • cholesky_decompose in /prim is in the form A = cholesky_decompose(B), the one in /opencl is moved to the opencl:: namespace and still works inplace
  • mdivide_left_tri_low(A,b)
  • mdivide_left_tri_low(A)
  • mdivide_right_tri_low(b,A)

After this PR gets merged the we will have the list of the matrix_cl overloads listed here ready.

This PR also fixes a bug in the matrix_cl copy_assignment that was introduced with the removal of const qualifiers. Currently the buffers size was not modified on copy assignment.

Tests

The following tests are added in opencl/prim:

  • non-inplace cholesky (exceptions tests, zero sized matrix, ...)
  • mdivide_left_tri_low
  • mdivide_right_tri_low

Side Effects

/

Checklist

  • Math issue Reorganize the /opencl folder #1352

  • Copyright holder: Rok Češnovar (Univ. of Ljubljana)

  • the basic tests are passing

  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.98)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.01)
(performance.compilation, 1.03)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.01)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 1.00396939091
Commit hash: 72f88a1

@rok-cesnovar rok-cesnovar changed the title [WIP] Reorganize /opencl and add missing matrix_cl overloads Reorganize /opencl and add missing matrix_cl overloads Sep 28, 2019
Copy link
Member Author

@rok-cesnovar rok-cesnovar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SteveBronder @t4c1 This is now ready to take a look. I would like both of you to take a look here. The changes are mostly moves and a few added tests due to new function signatures.

@@ -424,6 +424,8 @@ class matrix_cl<T, enable_if_arithmetic<T>> {
rows_ = a.rows();
cols_ = a.cols();
this->wait_for_read_write_events();
cl::Context& ctx = opencl_context.context();
buffer_cl_ = cl::Buffer(ctx, CL_MEM_READ_WRITE, sizeof(T) * a.size());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to address the bug that was introduced with the removal of const from rows/cols.

@@ -102,92 +102,7 @@ inline matrix_cl<return_type_t<T1, T2>> multiply(const matrix_cl<T1>& A,
return temp;
}
} // namespace opencl

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section was move to /prim. The actual implementation that was already in opencl:: was left in /opencl.

const matrix_cl<T1>& A, const matrix_cl<T2>& b) {
check_square("mdivide_left_tri_low", "A", A);
check_multiplicable("mdivide_left_tri_low", "A", A, "b", b);
return tri_inverse<matrix_cl_view::Lower>(A) * b;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to force the Lower I had to extend the tri_inverse with a template.

The other option would be to force a change of A before calling tri_inverse but that changes the view for A globally, which is bad.

The third option would be to have the "forced" view as an argument to tri_inverse. I am also fine with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the option you chose.

@@ -0,0 +1,135 @@
#ifndef STAN_MATH_OPENCL_PRIM_MULTIPLY_HPP
#define STAN_MATH_OPENCL_PRIM_MULTIPLY_HPP
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code was just moved from /opencl/multiply.hpp

for (int i = 0; i < A.size(); i++) \
EXPECT_NEAR(A(i), B(i), DELTA);

TEST(MathMatrix, cholesky_decompose_cl_expections) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the same tests as for the in-place cholesky, just on the non-inplace one.

#include <gtest/gtest.h>
#include <algorithm>

#define EXPECT_MATRIX_NEAR(A, B, DELTA) \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests and the mdivide_right_tri_low follow the same idea as the OpenCL tests we have for math/prim/mdivide_left_tri

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.96)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.01)
(performance.compilation, 1.04)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.02)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 1.00016837813
Commit hash: 197977f

Copy link
Contributor

@t4c1 t4c1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just one question.

stan/math/opencl/prim/cholesky_decompose.hpp Outdated Show resolved Hide resolved
const matrix_cl<T1>& A, const matrix_cl<T2>& b) {
check_square("mdivide_left_tri_low", "A", A);
check_multiplicable("mdivide_left_tri_low", "A", A, "b", b);
return tri_inverse<matrix_cl_view::Lower>(A) * b;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the option you chose.

…ocations

# Conflicts:
#	stan/math/opencl/matrix_cl.hpp
#	stan/math/opencl/multiply.hpp
#	stan/math/opencl/opencl.hpp
#	stan/math/opencl/tri_inverse.hpp
@rok-cesnovar
Copy link
Member Author

This is ready for a re-review. But not a critical thing for 2.21. @SteveBronder you are probably busy these days, take a look whenever you have time. Dont wont to merge this one without your input.

@SteveBronder
Copy link
Collaborator

I'll try to take a look tmrw if not then Monday

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.95)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 0.99)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 0.96)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 0.94)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.99)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.94)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.0)
Result: 0.98330278973
Commit hash: fdd62f2

Copy link
Collaborator

@SteveBronder SteveBronder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

One thing I'm thinking about, maybe it's time we put the OpenCL prim functions inside of their respective prim files. Then we just keep in opencl the things that are not directly exposed to users.

We can do that another day though

@rok-cesnovar
Copy link
Member Author

I would do that once we add compiler support to use them directly. I will work on that in the coming weeks. Then we can finnaly remove the ifdefed code from CPU /prim files.

And then I would be fine moving /opencl/prim files to /prim/opencl.

@rok-cesnovar rok-cesnovar merged commit 262e205 into stan-dev:develop Oct 11, 2019
@rok-cesnovar rok-cesnovar deleted the refactor/opencl-function-signatures-and-locations branch October 11, 2019 15:18
@rok-cesnovar rok-cesnovar mentioned this pull request Oct 11, 2019
5 tasks
@serban-nicusor-toptal serban-nicusor-toptal added this to the 3.0.0 milestone Oct 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reorganize the /opencl folder
5 participants