
Wrong results with uneven rankings #5

Open
julian-urbano opened this issue Aug 12, 2024 · 1 comment

@julian-urbano

Hi!

We recently published a paper about handling ties in RBO, partly because, as you well know, it's not clear from the original paper how to do this. We took a deep look at the problem and proposed a solution to fill this gap (what we call $RBO^w$), plus two other variants that handle ties more in line with the statistics literature ($RBO^a$ and $RBO^b$).

We of course took a look at the approach in your code, but we're afraid it has unintended consequences that make the whole RBO computation wrong whenever rankings are uneven (i.e., of different lengths), regardless of whether there are ties or not.

Take for example these rankings:

  • Even without ties: ['N', 'H', 'M', 'A'] and ['C', 'G', 'N', 'A']
  • Even with ties: [{'N', 'H'}, 'M', 'A'] and ['C', 'G', {'N', 'A'}]
  • Uneven without ties: ['N', 'H', 'M', 'A', 'C', 'F', 'L'] and ['C', 'G', 'N', 'A']
  • Uneven with ties: [{'N', 'H'}, 'M', {'A', 'C', 'F'}, 'L'] and ['C', 'G', {'N', 'A'}]

The table below compares the results of the original implementation by Webber, our own implementation, and yours. In your code we found two places where comments indicate lines to comment/uncomment so that one gets one result or another: lines 96 vs. 99, and lines 227 vs. 228:

| Length | Ties | Webber | Ours (w) | 96 & 228 (base) | 99 & 228 | 96 & 227 | 99 & 227 |
|--------|------|--------|----------|-----------------|----------|----------|----------|
| even   | no   | 0.3915 | 0.3915   | 0.3915          | 0.3915   | 0.3915   | 0.3915   |
| even   | yes  | -      | 0.3876   | 0.405           | 0.5265   | 0.405    | 0.54     |
| uneven | no   | 0.4904 | 0.4904   | 0.451           | 0.5069   | 0.418    | 0.4904   |
| uneven | yes  | -      | 0.5156   | 0.4661          | 0.7627   | 0.4562   | 0.81     |
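For what it's worth, the no-ties reference values above (0.3915 even, 0.4904 uneven) can be reproduced with a direct implementation of Webber's extrapolated RBO (eq. 32 in the original paper), assuming $p = 0.9$. This is just a standalone sketch so the reference column is checkable, not part of either implementation:

```python
def rbo_ext(S, T, p=0.9):
    """Extrapolated RBO (Webber et al. 2010, eq. 32) for two
    possibly uneven, tie-free rankings of distinct items."""
    if len(S) < len(T):
        S, T = T, S  # ensure S is the longer ranking
    l, s = len(S), len(T)
    # X[d] = size of the overlap between the two depth-d prefixes
    X = [len(set(S[:d]) & set(T[:d])) for d in range(l + 1)]
    # agreement terms over all depths of the longer ranking
    sum1 = sum(X[d] / d * p ** d for d in range(1, l + 1))
    # correction for depths past the end of the shorter ranking
    sum2 = sum(X[s] * (d - s) / (s * d) * p ** d for d in range(s + 1, l + 1))
    # extrapolation of the agreement beyond the evaluated depths
    ext = ((X[l] - X[s]) / l + X[s] / s) * p ** l
    return (1 - p) / p * (sum1 + sum2) + ext

rbo_ext(['N', 'H', 'M', 'A'], ['C', 'G', 'N', 'A'])                  # ≈ 0.3915
rbo_ext(['N', 'H', 'M', 'A', 'C', 'F', 'L'], ['C', 'G', 'N', 'A'])   # ≈ 0.4904
```

When the rankings have equal length, `sum2` is empty and the extrapolation term reduces to $X_l/l \cdot p^l$, recovering the even case.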

As you can see, while the results are always correct with even rankings, the current code base gives incorrect results with uneven rankings (one would need to bring back lines 99 and 227). The results with ties all differ, but we set out to compute different things there anyway (I think).

So we wanted to let you know about this, because the modification for ties makes the no-ties results wrong. We also strongly suggest taking a look at the paper, where we dig into this problem in detail. Indeed, the formulation for ties that follows the original idea is different from what you implemented, and the paper additionally presents two other RBO variants. Again, our full implementation is available in Python and in R.

Cheers

@julian-urbano julian-urbano changed the title Wrong results with uneven rankings and with ties Wrong results with uneven rankings Aug 12, 2024
@julian-urbano
Author

ping @dlukes
