-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[hist] Improve calculation of TAxis::FindBin(x)
when x is at the bin edges
#14105
base: master
Are you sure you want to change the base?
Conversation
Starting build on |
125c995
to
44cf282
Compare
Starting build on |
Test Results 17 files 17 suites 4d 7h 27m 53s ⏱️ For more details on these failures, see this check. Results for commit 8f0a7cc. ♻️ This comment has been updated with latest results. |
Given that this was noticed again in the Forum (https://root-forum.cern.ch/t/numbering-convention-for-inner-histogram-bins/59077/3), can we perhaps rebase, merge and port to 6.32? |
44cf282
to
7f33a48
Compare
hist/hist/src/TAxis.cxx
Outdated
Int_t TAxis::DoFindFixBin(Double_t x) const | ||
{ | ||
int bin = 0; | ||
if (!fXbins.fN) { //*-* fix bins | ||
bin = 1 + int (fNbins * (x-fXmin)/(fXmax-fXmin) ); | ||
// apply correction when x is at the bin edge value | ||
// (computed as in GetBinLowEdge) a numerical error in the | ||
// above division can cause a migration in a neighbour bin. | ||
double binwidth = (fXmax - fXmin) / Double_t(fNbins); | ||
double upEdge = fXmin + (bin) * binwidth; | ||
double lowEdge = fXmin + (bin - 1) * binwidth; | ||
if (upEdge <= x) bin += 1; | ||
if (lowEdge > x) bin -= 1; | ||
} else { //*-* variable bin sizes | ||
//for (bin =1; x >= fXbins.fArray[bin]; bin++); | ||
bin = 1 + TMath::BinarySearch(fXbins.fN,fXbins.fArray,x); | ||
} | ||
return bin; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Int_t TAxis::DoFindFixBin(Double_t x) const | |
{ | |
int bin = 0; | |
if (!fXbins.fN) { //*-* fix bins | |
bin = 1 + int (fNbins * (x-fXmin)/(fXmax-fXmin) ); | |
// apply correction when x is at the bin edge value | |
// (computed as in GetBinLowEdge) a numerical error in the | |
// above division can cause a migration in a neighbour bin. | |
double binwidth = (fXmax - fXmin) / Double_t(fNbins); | |
double upEdge = fXmin + (bin) * binwidth; | |
double lowEdge = fXmin + (bin - 1) * binwidth; | |
if (upEdge <= x) bin += 1; | |
if (lowEdge > x) bin -= 1; | |
} else { //*-* variable bin sizes | |
//for (bin =1; x >= fXbins.fArray[bin]; bin++); | |
bin = 1 + TMath::BinarySearch(fXbins.fN,fXbins.fArray,x); | |
} | |
return bin; | |
} | |
Int_t TAxis::DoFindFixBin(Double_t x) const | |
{ | |
if (!fXbins.fN) { //*-* fix bins | |
// shift x and fXmax by fXmin: | |
x = x - fXmin; | |
const double b = fXmax - fXmin; | |
const double s = fNbins * x; // scaled version of x | |
double m = std::floor(s / b); | |
// iterative correction in case of floating point errors due to floating point division | |
while (m * b >= s) m = m - 1; | |
while ((m + 1) * b < s) m = m + 1; | |
return 1 + Int_t(m); | |
} else { //*-* variable bin sizes | |
//for (bin =1; x >= fXbins.fArray[bin]; bin++); | |
return 1 + TMath::BinarySearch(fXbins.fN,fXbins.fArray,x); | |
} | |
} |
I have some improvement suggestions:
- Better do the correction iteratively to make sure that even when we jumped around by more than one bin because of numeric effects, things are still OK
- Avoid floating point division in the correction by rewriting the equations for the comparisons
- My gut feeling is to better do the conversion to integer after the corrections and to use
std::floor
before. Like this, we reduce the number of int-with-float multiplications to a minimum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Lorenzo, thank you very much for this PR! After thinking about this for a while, I have made some suggestions to improve the algorithm inline. In particular, I think we should avoid floating point division in the correction step, for performance and precision reasons.
Hi Jonas, |
…n edges When x is at the bin edge values numerical error can cause the computation of the bin to be the one before (or after). Correct for this case assuming the bin edge are computed as in `TH1::getBinLowEdge` and `TH1::GetBinUpEdge`. It is clear that the bin edge values are represented as floating point, so depending on how they are computed they can be sometime different. However, it is better to have teh consistency to return the correct value when computed as internally in TAxis. This should fix teh problem reported in https://root-forum.cern.ch/t/bug-in-taxis-findbin/57210 and root-project#14091
7f33a48
to
3959abc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the PR and addressing the review comments!
hist/hist/src/TAxis.cxx
Outdated
//////////////////////////////////////////////////////////////////////////////// | ||
Int_t TAxis::DoFindFixBin(Double_t x) const | ||
{ | ||
int bin = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused variable
hist/hist/src/TAxis.cxx
Outdated
while (m * b >= s) | ||
m = m - 1; | ||
while ((m + 1) * b < s) | ||
m = m + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while (m * b >= s) | |
m = m - 1; | |
while ((m + 1) * b < s) | |
m = m + 1; | |
while (m * b > s) // if left bin boundary is greater then x, decrement | |
m = m - 1; | |
while ((m + 1) * b <= s) //if right bin boundary is smaller or equal, increment | |
m = m + 1; |
Sorry, I did the typical edge case mistake! I understand now that if the value is on the lower boundary, it should be considered in the bin.
Implement suggestions by Jonas R. to make FindFixBin more robust to numerical errors
3959abc
to
8f0a7cc
Compare
I see that the test is still failing. I'm suspecting it's now because of the implementation of I see that it first does the division to get the bin width, and then the multiplication. The multiplication is with |
Here is a little study to demonstrate this effect: int fNbins = 90;
double fXmin = 0;
double fXmax = 10;
int mybin = 8;
double xdm = fXmin + (mybin - 1) * ((fXmax - fXmin) / fNbins);
double xmd = (fXmin + (mybin - 1) * (fXmax - fXmin)) / fNbins;
std::cout << std::setprecision(20);
std::cout << "div then mult : " << xdm << std::endl;
std::cout << "mult then div : " << xmd << std::endl; The output is: div then mult : 0.77777777777777767909
mult then div : 0.77777777777777779011 So right now, there is unnecessary precision loss in the bin edge functions. |
When x is at the bin edge values numerical error can cause the computation of the bin to be the one before (or after). Correct for this case assuming the bin edge are computed as in
TH1::getBinLowEdge
andTH1::GetBinUpEdge
. It is clear that the bin edge values are represented as floating point, so depending on how they are computed they can be sometime different. However, it is better to have teh consistency to return the correct value when computed as internally in TAxis.This should fix the problem reported in https://root-forum.cern.ch/t/bug-in-taxis-findbin/57210
Fixes #14091
A test is also added for FindBin