Introduction

Sparse matrix-vector multiplication (SpMV) is a core operation in many scientific and engineering disciplines. However, because sparse matrices vary arbitrarily in size and sparsity pattern, parallelising SpMV still suffers from several performance problems, including poor memory coalescing, thread divergence and load imbalance. I implement five different GPU-based algorithms in CUDA and analyze their performance on data of different types and sizes. I try to draw out the strengths and weaknesses of these algorithms and the situations in which each is best used. I measure performance in terms of computational throughput, memory bandwidth utilization and other system-generated metrics.
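For readers new to SpMV, the operation y = A * x can be sketched as a sequential host-side reference over a matrix stored in compressed sparse row (CSR) form. This is only an illustration under stated assumptions: the CsrMatrix struct and the spmv_csr and max_row_length functions are hypothetical names, not code from this repository, and CSR is assumed here as a common storage format rather than the one the five GPU algorithms necessarily use. The longest-row helper hints at why load imbalance arises when rows are distributed across GPU threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container (illustrative names, not from the repository).
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row i spans [row_ptr[i], row_ptr[i + 1])
    std::vector<std::size_t> col_idx;  // column index of each stored value
    std::vector<double> values;        // the stored (nonzero) values
};

// Sequential reference SpMV: returns y = A * x.
std::vector<double> spmv_csr(const CsrMatrix& a, const std::vector<double>& x) {
    assert(a.row_ptr.size() == a.rows + 1);
    assert(a.col_idx.size() == a.values.size());
    assert(x.size() == a.cols);

    std::vector<double> y(a.rows, 0.0);
    for (std::size_t i = 0; i < a.rows; ++i) {
        double sum = 0.0;
        // The work per row equals its number of stored values, which can
        // differ widely between rows of an irregular matrix.
        for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[i] = sum;
    }
    return y;
}

// Number of stored values in the longest row: a rough indicator of how
// uneven the work would be if each row were assigned to one thread.
std::size_t max_row_length(const CsrMatrix& a) {
    std::size_t longest = 0;
    for (std::size_t i = 0; i < a.rows; ++i) {
        longest = std::max(longest, a.row_ptr[i + 1] - a.row_ptr[i]);
    }
    return longest;
}
```

For example, the 3x3 matrix with rows (1, 0, 2), (0, 0, 0) and (3, 4, 5) is stored as row_ptr = {0, 2, 2, 5}, col_idx = {0, 2, 0, 1, 2} and values = {1, 2, 3, 4, 5}. Multiplying it by x = (1, 2, 3) gives y = (7, 0, 26). A reference like this is mainly useful for checking the output of a GPU kernel on small inputs.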

Implementation

To compile, simply run make

To run the program, type ./spmv

To download data, refer to Matrices.pdf

Requirements

CUDA 9.1 and a GPU with compute capability >= 5.2

yuang-chen/spmv-gpu
