Implementation of the ApplicationClassification workflow.
Produces results identical to the PNNL code on the georgiy and rmat
example datasets.

TODO:
- Optimize performance
- Fuse kernels
- Get rid of transposes
- Segmented reduce instead of reduce by key?
- Could probably improve memory usage
- Profiling
- Optional: additional correctness checking
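
For the segmented-reduce item above, here is a minimal CPU-side sketch in Python (not the project's GPU code) of how the two primitives differ. The function names, the sample data, and the row layout are hypothetical and chosen only for illustration. Whether the swap actually helps in this workflow would need profiling.

```python
from itertools import groupby
from operator import itemgetter


def reduce_by_key(keys, values):
    """Sum values over each run of consecutive equal keys.

    Mirrors the semantics of a reduce-by-key primitive: a keys array is
    read alongside the values, and runs are found by comparing keys.
    """
    out_keys, out_sums = [], []
    for key, group in groupby(zip(keys, values), key=itemgetter(0)):
        out_keys.append(key)
        out_sums.append(sum(v for _, v in group))
    return out_keys, out_sums


def segmented_reduce(values, offsets):
    """Sum values over each segment [offsets[i], offsets[i + 1]).

    No per-element keys are needed; offsets has num_segments + 1 entries.
    """
    return [
        sum(values[offsets[i]:offsets[i + 1]])
        for i in range(len(offsets) - 1)
    ]


# Hypothetical data: three rows holding 2, 3, and 1 entries.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
keys = [0, 0, 1, 1, 1, 2]   # row index per element (reduce-by-key input)
offsets = [0, 2, 5, 6]      # row boundaries (segmented-reduce input)

_, by_key = reduce_by_key(keys, values)
by_segment = segmented_reduce(values, offsets)

# Both formulations give the same per-row sums.
assert by_key == by_segment
```

When segments are contiguous and their boundaries are already known (for example, fixed-width rows), the offsets form avoids building and reading a per-element keys array, which is the usual motivation for making this change.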