Replicability in parallelized code #38

rgreminger · 2020-11-16T09:05:33Z

Hi,

Another potentially useful addition to the page would be resources showing how to ensure replicability when running code in parallel. Specifically, when a function that draws random numbers is run multiple times in parallel (e.g. independent MC simulations), the draws will differ depending on the number of cores the code runs on, unless the seed is set in a specific way (e.g. just calling set.seed(2) in R is not enough).
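To make the point concrete, here is a minimal sketch of one common remedy, written in Python rather than R and using only the standard library. The idea is to derive each replication's seed from a base seed plus the task index, so the draws depend on which task is running and not on which worker or how many workers there are. The names BASE_SEED, task_seed, simulate and run_all are hypothetical, and threads stand in for cores only to keep the example short; the same pattern should carry over to a process pool.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor

BASE_SEED = 2  # hypothetical base seed, playing the role of set.seed(2)


def task_seed(base_seed, task_id):
    """Derive a seed from the base seed and the task index (never the worker)."""
    digest = hashlib.sha256(f"{base_seed}:{task_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def simulate(task_id, n_draws=1000):
    """One independent MC replication using its own private generator."""
    rng = random.Random(task_seed(BASE_SEED, task_id))
    return sum(rng.gauss(0.0, 1.0) for _ in range(n_draws)) / n_draws


def run_all(n_tasks, n_workers):
    """Run every replication; map() returns results in task order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(simulate, range(n_tasks)))


if __name__ == "__main__":
    # Same results regardless of how many workers share the tasks.
    print(run_all(8, 1) == run_all(8, 4))
```

Hashing the pair (base seed, task index) gives each task a distinct seed, but it carries no formal guarantee that the resulting streams are statistically independent. Generators designed for parallel streams, such as the L'Ecuyer-CMRG generator available in R, are the more careful choice for real replication packages.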

I'm not sure how widely recognized this issue is (I've seen replication packages that ignore it completely, which led to different results when I ran the code on my machine), so it may already be useful just to highlight it. Let me know what you think; if this seems useful, I (or someone else) can add a section on it at some point.

Best,
Rafael
