Bias max_offset in pareto distribution #534

morgo · 2024-08-07T17:18:36Z

We use sysbench to test Spirit (an online schema change tool).

Spirit has an optimization where it can ignore modifications to keys above a certain known point. Unfortunately, this is difficult to test with sysbench: many of our workloads look roughly pareto-distributed, but it is the higher keys that are modified, not the first keys in the table.

What I would like to propose is an option for the pareto distribution function, which is currently defined as:

uint32_t sb_rand_pareto(uint32_t a, uint32_t b) /* starting-value, max-offset */
{
  return a + (uint32_t) ((b - a + 1) *
                         pow(sb_rand_uniform_double(), pareto_power));
}

For our use-case we would instead like to have something like:

uint32_t sb_rand_pareto(uint32_t a, uint32_t b) /* starting-value, max-offset */
{
  return b - (uint32_t) ((b - a + 1) *
                         pow(sb_rand_uniform_double(), pareto_power));
}
