How to implement functions in OpenCL #16

Open
t4c1 opened this issue Oct 30, 2019 · 3 comments


@t4c1
Member

t4c1 commented Oct 30, 2019

Description

As requested by @rok-cesnovar, I prepared a list of all functions in Stan Math and proposed how to implement each of them in OpenCL. Most are probably possible using the kernel generator.

Legend:

kg Possible with kernel generator. Might require adding new operations.
no OpenCL implementation makes no sense.
? I am not sure after quickly looking through the code.

List

Phi kg
Phi_approx kg
abs kg
algebra_solver ?
append_array kg
append_col kg
append_row kg
atan2 kg
bernoulli_ccdf_log kg?
bernoulli_cdf kg?
bernoulli_cdf_log kg?
bernoulli_lccdf kg?
bernoulli_lcdf kg?
bernoulli_log kg
bernoulli_logit_glm_lpmf kg
bernoulli_logit_log kg
bernoulli_logit_lpmf kg
bernoulli_logit_rng kg+some work?
bernoulli_lpmf kg
bernoulli_rng kg+some work?
bessel_first_kind kg?
bessel_second_kind kg?
beta_binomial_ccdf_log ?
beta_binomial_cdf ?
beta_binomial_cdf_log ?
beta_binomial_lccdf ?
beta_binomial_lcdf ?
beta_binomial_log ?
beta_binomial_lpmf ?
beta_binomial_rng kg+some work?
beta_ccdf_log ?
beta_cdf ?
beta_cdf_log ?
beta_lccdf ?
beta_lcdf ?
beta_log ?
beta_lpdf kg?
beta_proportion_ccdf_log ?
beta_proportion_cdf_log kg
beta_proportion_lccdf ?
beta_proportion_lcdf kg
beta_proportion_log kg?
beta_proportion_lpdf kg?
beta_proportion_rng kg+some work?
beta_rng kg+some work?
binary_log_loss kg
binomial_ccdf_log kg?
binomial_cdf kg?
binomial_cdf_log kg?
binomial_coefficient_log kg
binomial_lccdf kg?
binomial_lcdf kg?
binomial_log kg?
binomial_logit_log kg?
binomial_logit_lpmf kg?
binomial_lpmf kg?
binomial_rng kg+some work?
block kg
categorical_log kg
categorical_logit_glm_lpmf kg
categorical_logit_log kg
categorical_logit_lpmf kg
categorical_logit_rng kg+some work?
categorical_lpmf kg
categorical_rng kg+some work?
cauchy_ccdf_log kg?
cauchy_cdf kg?
cauchy_cdf_log kg?
cauchy_lccdf kg?
cauchy_lcdf kg?
cauchy_log kg?
cauchy_lpdf kg?
cauchy_rng kg+some work?
chi_square_ccdf_log kg?
chi_square_cdf kg?
chi_square_cdf_log kg?
chi_square_lccdf kg?
chi_square_lcdf kg?
chi_square_log kg?
chi_square_lpdf kg?
chi_square_rng kg+some work?
choose kg?
col kg
cols no
columns_dot_product kg
columns_dot_self kg
cov_exp_quad kg
crossprod can
csr_extract_u ?
csr_extract_v ?
csr_extract_w ?
csr_matrix_times_vector ?
csr_to_dense_matrix ?
cumulative_sum separate kernel
determinant separate kernel
diag_matrix kg
diag_post_multiply kg
diag_pre_multiply kg
diagonal kg
digamma kg
dims no
dirichlet_log kg
dirichlet_lpdf kg
dirichlet_rng kg+some work?
distance kg
divide kg
dot_product kg
dot_self kg
double_exponential_ccdf_log kg?
double_exponential_cdf kg?
double_exponential_cdf_log kg?
double_exponential_lccdf kg?
double_exponential_lcdf kg?
double_exponential_log kg?
double_exponential_lpdf kg?
double_exponential_rng kg+some work?
e can't find
eigenvalues_sym separate kernel
eigenvectors_sym separate kernel
exp_mod_normal_ccdf_log kg?
exp_mod_normal_cdf kg?
exp_mod_normal_cdf_log kg?
exp_mod_normal_lccdf kg?
exp_mod_normal_lcdf kg?
exp_mod_normal_log kg?
exp_mod_normal_lpdf kg?
exp_mod_normal_rng kg+some work?
exponential_ccdf_log kg?
exponential_cdf kg?
exponential_cdf_log kg?
exponential_lccdf kg?
exponential_lcdf kg?
exponential_log kg?
exponential_lpdf kg?
exponential_rng kg+some work?
fabs kg
falling_factorial kg?
fdim kg
fma kg
fmax kg
fmin kg
fmod kg?
frechet_ccdf_log kg?
frechet_cdf kg?
frechet_cdf_log kg?
frechet_lccdf kg?
frechet_lcdf kg?
frechet_log kg?
frechet_lpdf kg?
frechet_rng kg+some work?
gamma_ccdf_log kg?
gamma_cdf kg?
gamma_cdf_log kg?
gamma_lccdf kg?
gamma_lcdf kg?
gamma_log kg?
gamma_lpdf kg?
gamma_p kg
gamma_q kg
gamma_rng kg+some work?
gaussian_dlm_obs_log separate kernel?
gaussian_dlm_obs_lpdf separate kernel?
gumbel_ccdf_log kg?
gumbel_cdf kg?
gumbel_cdf_log kg?
gumbel_lccdf kg?
gumbel_lcdf kg?
gumbel_log kg?
gumbel_lpdf kg?
gumbel_rng kg+some work?
head kg
hypergeometric_log kg?
hypergeometric_lpmf kg?
hypergeometric_rng kg+some work?
hypot kg
if_else kg
inc_beta kg
int_step kg
integrate_1d separate kernel?
integrate_ode separate kernel?
integrate_ode_adams separate kernel?
integrate_ode_bdf separate kernel?
integrate_ode_rk45 separate kernel?
inv kg
inv_Phi kg
inv_chi_square_ccdf_log kg?
inv_chi_square_cdf kg?
inv_chi_square_cdf_log kg?
inv_chi_square_lccdf kg?
inv_chi_square_lcdf kg?
inv_chi_square_log kg?
inv_chi_square_lpdf kg?
inv_chi_square_rng kg+some work?
inv_cloglog kg
inv_gamma_ccdf_log kg?
inv_gamma_cdf kg?
inv_gamma_cdf_log kg?
inv_gamma_lccdf kg?
inv_gamma_lcdf kg?
inv_gamma_log kg?
inv_gamma_lpdf kg?
inv_gamma_rng kg+some work?
inv_logit kg
inv_sqrt kg
inv_square kg
inv_wishart_log kg?
inv_wishart_lpdf kg?
inv_wishart_rng kg+some work?
inverse separate kernel
inverse_spd separate kernel
is_inf kg
is_nan kg
lbeta kg
lchoose can't find
lkj_corr_cholesky_log kg?
lkj_corr_cholesky_lpdf kg?
lkj_corr_cholesky_rng kg+some work?
lkj_corr_log kg?
lkj_corr_lpdf kg?
lkj_corr_rng kg+some work?
lkj_cov_log kg?
lmgamma kg
lmultiply can't find
log1m kg
log1m_exp kg
log1m_inv_logit kg
log1p_exp kg
log_determinant separate kernel?
log_diff_exp kg
log_falling_factorial kg
log_inv_logit kg
log_mix kg
log_rising_factorial kg
log_softmax kg?
log_sum_exp kg
logical_and kg
logical_eq kg
logical_gt kg
logical_gte kg
logical_lt kg
logical_lte kg
logical_negation kg
logical_neq kg
logical_or kg
logistic_ccdf_log kg?
logistic_cdf kg?
logistic_cdf_log kg?
logistic_lccdf kg?
logistic_lcdf kg?
logistic_log kg?
logistic_lpdf kg?
logistic_rng kg+some work?
logit kg
lognormal_ccdf_log kg?
lognormal_cdf kg?
lognormal_cdf_log kg?
lognormal_lccdf kg?
lognormal_lcdf kg?
lognormal_log kg?
lognormal_lpdf kg?
lognormal_rng kg+some work?
machine_precision no
map_rect no?
matrix_exp separate kernel?
matrix_exp_multiply separate kernel?
max kg
mdivide_left separate kernel
mdivide_left_spd separate kernel
mdivide_right separate kernel
mdivide_right_spd separate kernel
mean kg
min kg
minus kg
modified_bessel_first_kind kg?
modified_bessel_second_kind kg?
modulus kg
multi_gp_cholesky_log kg?
multi_gp_cholesky_lpdf kg?
multi_gp_log ?
multi_gp_lpdf ?
multi_normal_cholesky_log kg?
multi_normal_cholesky_lpdf kg?
multi_normal_cholesky_rng kg+some work?
multi_normal_log kg?
multi_normal_lpdf kg?
multi_normal_prec_log kg?
multi_normal_prec_lpdf kg?
multi_normal_rng kg+some work?
multi_student_t_log kg?
multi_student_t_lpdf kg?
multi_student_t_rng kg+some work?
multinomial_log kg?
multinomial_lpmf kg?
multinomial_rng kg+some work?
multiply_log kg
multiply_lower_tri_self_transpose existing kernel?
neg_binomial_2_ccdf_log kg?
neg_binomial_2_cdf kg?
neg_binomial_2_cdf_log kg?
neg_binomial_2_lccdf kg?
neg_binomial_2_lcdf kg?
neg_binomial_2_log kg
neg_binomial_2_log_glm_lpmf
neg_binomial_2_log_log kg
neg_binomial_2_log_lpmf kg
neg_binomial_2_log_rng kg+some work?
neg_binomial_2_lpmf kg
neg_binomial_2_rng kg+some work?
neg_binomial_ccdf_log kg?
neg_binomial_cdf kg?
neg_binomial_cdf_log kg?
neg_binomial_lccdf kg?
neg_binomial_lcdf kg?
neg_binomial_log kg
neg_binomial_lpmf kg
neg_binomial_rng kg+some work?
negative_infinity no
normal_ccdf_log kg?
normal_cdf kg?
normal_cdf_log kg?
normal_id_glm_lpdf kg
normal_lccdf kg?
normal_lcdf kg?
normal_log kg
normal_lpdf kg
normal_rng kg+some work?
not_a_number no
num_elements no
ordered_logistic_glm_lpmf kg
ordered_logistic_log kg
ordered_logistic_lpmf kg
ordered_logistic_rng kg+some work?
ordered_probit_log kg?
ordered_probit_lpmf kg?
ordered_probit_rng kg+some work?
owens_t kg?
pareto_ccdf_log kg?
pareto_cdf kg?
pareto_cdf_log kg?
pareto_lccdf kg?
pareto_lcdf kg?
pareto_log kg?
pareto_lpdf kg?
pareto_rng kg+some work?
pareto_type_2_ccdf_log kg?
pareto_type_2_cdf kg?
pareto_type_2_cdf_log kg?
pareto_type_2_lccdf kg?
pareto_type_2_lcdf kg?
pareto_type_2_log kg?
pareto_type_2_lpdf kg?
pareto_type_2_rng kg+some work?
pi no
plus can't find
poisson_ccdf_log kg?
poisson_cdf kg?
poisson_cdf_log kg?
poisson_lccdf kg?
poisson_lcdf kg?
poisson_log kg
poisson_log_glm_lpmf kg
poisson_log_log kg
poisson_log_lpmf kg
poisson_log_rng kg+some work?
poisson_lpmf kg
poisson_rng kg+some work?
positive_infinity no
pow kg
prod kg
qr_Q separate kernel
qr_R separate kernel
qr_thin_Q separate kernel
qr_thin_R separate kernel
quad_form existing separate kernel
quad_form_diag separate kernel?
quad_form_sym existing separate kernel
rank kg
rayleigh_ccdf_log kg?
rayleigh_cdf kg?
rayleigh_cdf_log kg?
rayleigh_lccdf kg?
rayleigh_lcdf kg?
rayleigh_log kg?
rayleigh_lpdf kg?
rayleigh_rng kg+some work?
rep_array kg
rep_matrix kg
rep_row_vector kg
rep_vector kg
rising_factorial ?
row kg
rows no
rows_dot_product kg
rows_dot_self kg
scale_matrix_exp_multiply ?
scaled_inv_chi_square_ccdf_log kg?
scaled_inv_chi_square_cdf kg?
scaled_inv_chi_square_cdf_log kg?
scaled_inv_chi_square_lccdf kg?
scaled_inv_chi_square_lcdf kg?
scaled_inv_chi_square_log kg?
scaled_inv_chi_square_lpdf kg?
scaled_inv_chi_square_rng kg+some work?
sd kg
segment kg
singular_values separate kernel
size no
skew_normal_ccdf_log kg?
skew_normal_cdf kg?
skew_normal_cdf_log kg?
skew_normal_lccdf kg?
skew_normal_lcdf kg?
skew_normal_log kg?
skew_normal_lpdf kg?
skew_normal_rng kg+some work?
softmax kg
sort_asc separate kernel
sort_desc separate kernel
sort_indices_asc separate kernel
sort_indices_desc separate kernel
sqrt2 kg
square kg
squared_distance kg
std_normal_log kg
std_normal_lpdf kg
step kg
student_t_ccdf_log kg?
student_t_cdf kg?
student_t_cdf_log kg?
student_t_lccdf kg?
student_t_lcdf kg?
student_t_log kg?
student_t_lpdf kg?
student_t_rng kg+some work?
sub_col kg
sub_row kg
sum kg
tail kg
tcrossprod separate kernel?
to_array_1d no
to_array_2d no
to_matrix no
to_row_vector no
to_vector no
trace kg
trace_gen_quad_form existing separate kernel
trace_quad_form existing separate kernel
transpose kg
trigamma kg
trunc kg
uniform_ccdf_log kg?
uniform_cdf kg?
uniform_cdf_log kg?
uniform_lccdf kg?
uniform_lcdf kg?
uniform_log kg?
uniform_lpdf kg?
uniform_rng kg+some work?
variance kg
von_mises_log kg?
von_mises_lpdf kg?
von_mises_rng kg+some work?
weibull_ccdf_log kg?
weibull_cdf kg?
weibull_cdf_log kg?
weibull_lccdf kg?
weibull_lcdf kg?
weibull_log kg?
weibull_lpdf kg?
weibull_rng kg+some work?
wiener_log kg?
wiener_lpdf kg?
wishart_log kg?
wishart_lpdf kg?
wishart_rng kg+some work?
@rok-cesnovar
Member

rok-cesnovar commented Oct 30, 2019

Fantastic stuff Tadej!

To provide a bit more context for anyone reading this: I asked Tadej to go over this list of supported Stan Math signatures and to evaluate which functions could be handled by the kernel generator he is working on and which would benefit from a custom OpenCL kernel.
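For readers unfamiliar with the idea, here is a minimal sketch of what a kernel generator does, assuming a toy expression tree that emits OpenCL C source for a single fused element-wise kernel. The types `Expr`, `Load`, `Unary`, `Binary` and the function `generate_kernel` are hypothetical names made up for this illustration; they are not the Stan Math kernel generator API.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical expression node; NOT the Stan Math kernel generator API.
struct Expr {
  virtual ~Expr() = default;
  // Returns OpenCL C source that evaluates this node for element index `i`.
  virtual std::string code() const = 0;
};

// Reads element i of a kernel argument.
struct Load : Expr {
  std::string name;
  explicit Load(std::string n) : name(std::move(n)) {}
  std::string code() const override { return name + "[i]"; }
};

// Applies an OpenCL built-in such as "exp" or "log1p" element-wise.
struct Unary : Expr {
  std::string fn;
  std::shared_ptr<Expr> arg;
  Unary(std::string f, std::shared_ptr<Expr> a)
      : fn(std::move(f)), arg(std::move(a)) {}
  std::string code() const override { return fn + "(" + arg->code() + ")"; }
};

// Combines two sub-expressions with an infix operator.
struct Binary : Expr {
  char op;
  std::shared_ptr<Expr> lhs, rhs;
  Binary(char o, std::shared_ptr<Expr> l, std::shared_ptr<Expr> r)
      : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
  std::string code() const override {
    return "(" + lhs->code() + " " + op + " " + rhs->code() + ")";
  }
};

// Emits one kernel for the whole expression, so no intermediate
// result is written to global memory between operations.
// Note: `double` may require the cl_khr_fp64 extension on some devices.
std::string generate_kernel(const std::string& kernel_name,
                            const std::vector<std::string>& inputs,
                            const Expr& expr) {
  std::string src = "__kernel void " + kernel_name + "(";
  for (const auto& in : inputs) {
    src += "__global const double* " + in + ", ";
  }
  src += "__global double* out) {\n";
  src += "  const size_t i = get_global_id(0);\n";
  src += "  out[i] = " + expr.code() + ";\n";
  src += "}\n";
  return src;
}
```

Building `Binary('*', Unary("log1p", Unary("exp", Load("x"))), Load("y"))` and passing it to `generate_kernel` yields a single kernel whose body is `out[i] = (log1p(exp(x[i])) * y[i]);`. Functions marked kg in the list are the ones expected to fit this kind of composition, possibly after adding new operations.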

So there are a total of 31 signatures that would still benefit from a custom implementation (+ whichever of those 25+ question marks will require custom kernels):

cumulative_sum
determinant
eigenvalues_sym
eigenvectors_sym
gaussian_dlm_obs_log
gaussian_dlm_obs_lpdf
integrate_1d
integrate_ode
integrate_ode_adams
integrate_ode_bdf
integrate_ode_rk45
inverse
inverse_spd
log_determinant
matrix_exp
matrix_exp_multiply
mdivide_left
mdivide_left_spd
mdivide_right
mdivide_right_spd
qr_Q
qr_R
qr_thin_Q
qr_thin_R
quad_form_diag
singular_values
sort_asc
sort_desc
sort_indices_asc
sort_indices_desc
tcrossprod

The others will be handled by the kernel generator at some point. There is also a lot of overlap among these functions, so they won't actually need completely separate kernels. The integrate functions take a function argument, so those will need some additional compiler support, as will the algebra solver.
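As one illustration of the overlap mentioned above, the four sort signatures could plausibly share a single routine. The sketch below is plain CPU C++ rather than an OpenCL kernel, and the helper names `sort_indices` and `sort_values` are hypothetical; it returns 0-based indices for simplicity, which is an assumption of this sketch and not a statement about Stan's conventions.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Shared routine: 0-based indices that order `v` ascending or descending.
// Covers the role of sort_indices_asc and sort_indices_desc in this sketch.
std::vector<std::size_t> sort_indices(const std::vector<double>& v,
                                      bool ascending) {
  std::vector<std::size_t> idx(v.size());
  std::iota(idx.begin(), idx.end(), 0);
  std::stable_sort(idx.begin(), idx.end(),
                   [&](std::size_t a, std::size_t b) {
                     return ascending ? v[a] < v[b] : v[a] > v[b];
                   });
  return idx;
}

// Sorted values are a gather through the same index routine,
// covering the role of sort_asc and sort_desc in this sketch.
std::vector<double> sort_values(const std::vector<double>& v, bool ascending) {
  std::vector<double> out;
  out.reserve(v.size());
  for (std::size_t i : sort_indices(v, ascending)) {
    out.push_back(v[i]);
  }
  return out;
}
```

With one comparator flag and one gather step, a single underlying implementation serves all four signatures, which is the sense in which the 31 entries would not map to 31 independent kernels.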

@SteveBronder
Collaborator

Thanks @t4c1! fyi the reason I'm so adamant about the kernel generator being well doc'd and tested is because of how much stuff it's going to touch in the future (which is good and cool!)

tcrossprod could use our current stuff, right? Though a custom kernel for the crossprod/transpose case could probably be more clever with the indices, so you don't need to do an actual transpose.
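The index trick can be sketched as follows, assuming tcrossprod means A * A^T. This is a CPU illustration in C++ with a hypothetical row-major `Matrix` helper, not an existing Stan Math kernel; an OpenCL version would apply the same indexing per work item.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row-major dense matrix used only for this sketch.
struct Matrix {
  std::size_t rows, cols;
  std::vector<double> data;
  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  double operator()(std::size_t i, std::size_t j) const {
    return data[i * cols + j];
  }
};

// Computes A * A^T without materializing A^T:
// (A * A^T)(i, j) = sum_k A(i, k) * A(j, k).
// The result is symmetric, so only the lower triangle is computed
// and then mirrored into the upper triangle.
Matrix tcrossprod_no_transpose(const Matrix& A) {
  Matrix C(A.rows, A.rows);
  for (std::size_t i = 0; i < A.rows; ++i) {
    for (std::size_t j = 0; j <= i; ++j) {
      double sum = 0.0;
      for (std::size_t k = 0; k < A.cols; ++k) {
        sum += A(i, k) * A(j, k);  // second factor reads row j, not column j
      }
      C(i, j) = sum;
      C(j, i) = sum;
    }
  }
  return C;
}
```

For A = [[1, 2], [3, 4]] this gives [[5, 11], [11, 25]]. Reading row j of A in place of column j of A^T is what removes the separate transpose pass.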

@t4c1
Member Author

t4c1 commented Oct 30, 2019

Stan has many functions, so there might be errors in the list. Each function should be reevaluated when its OpenCL implementation is being developed.
