Slicing: compare number of processors correctly #76

alejandrogallo · 2019-10-09T15:06:49Z

Hello,

we updated our version of CTF and were having some issues with regard
to the performance of the slicing in our code.

After some sniffing around we found this line in the slice method.
In version 1.4.1 the if statement had the < operator in it.
Some time after that this was changed into <= which according to my
understanding renders the else codeblock useless, i.e.

...
} else {
      tsr_A->read_local_nnz(&sz_A, &all_data_A);
//      printf("sz_A+%ld\n",sz_A);
}
...

This means that when the number of processors where tsr_B is
distributed among is equal to tsr_A, there is also a checking of the
dimensions and padding for A, and this means that CTF has to read
the data from A,

...
      tsr_A->write(blk_sz_B, sr->mulid(), sr->addid(), blk_data_B, 'r');
...

which makes this block slower.

If this is true, this pull request would be a fix to the problem. We have
certainly tested it and it confirmed our suspicion. For slices of big tensors
the difference between the < and the <= version is up to 50% in time,
according to our benchmarks. However, for small tensors it appears to be
roughly equivalent, which makes sense.

Thank you very much for your great project!

solomonik · 2019-10-10T14:41:38Z

Thanks a lot Alejandro!

These changes seem to make the python tests fail though, likely due to some subtleties involving sparsity, so I will need to investigate/adjust a bit before merging.

Slicing: compare number of processors correctly

738301d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slicing: compare number of processors correctly #76

Slicing: compare number of processors correctly #76

alejandrogallo commented Oct 9, 2019

solomonik commented Oct 10, 2019

Slicing: compare number of processors correctly #76

Are you sure you want to change the base?

Slicing: compare number of processors correctly #76

Conversation

alejandrogallo commented Oct 9, 2019

solomonik commented Oct 10, 2019