Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent handling of tolerance can lead to cryptic failure #190

Closed
RossBoylan opened this issue Mar 21, 2024 · 1 comment
Closed

Inconsistent handling of tolerance can lead to cryptic failure #190

RossBoylan opened this issue Mar 21, 2024 · 1 comment

Comments

@RossBoylan
Copy link

Symptom

> expect_equal(0.105487503092834, 0.10548753, tolerance = 1.5e-7)

Error: 0.105487503092834 (`actual`) not equal to 0.10548753 (`expected`).

actual != expected but don't know how to show the difference

Encountered while using testthat but debugging shows the problem lies in waldo, and so filing this report here.

Summary Analysis

The outer calls lead to waldo's num_equal which applies the tolerance to $|x-y|/|y| < \delta$ as long as $|y|>\delta$ where $\delta$ is the tolerance. Equivalently, it checks $|x-y|<\delta |y|$. Having discovered the difference, it then turns it over to compare_numeric() which, I think, attempts to show only as many digits as are necessary to show the difference. However, the logic here ignores the fact that the absolute difference that causes the test to fail is usually $\delta |y|$, and just uses $\delta$.

With the values shown above this means it's one digit short of what it needs to show the difference; since the numbers agree before that point there are no differences in the formatted values and so the list of (string) differences to be shown is empty. The result is the that compare_numeric() reports the "don't know how to show the difference error."

Possible Fix

In min_digits(),

n <- min(n, digits(tolerance))

use instead

  if (!is.null(tolerance)) {
   if (abs(y)>tolerance)
       tolerance <- tolerance*abs(y)
    n <- min(n, digits(tolerance))
  }

Although this may solve the immediate problem, it does not address the fact that the logic for interpreting the tolerance is living in at least 2 separate places, which seems undesirable.

Fuller Analysis

Here's the call stack, with the innermost call on the top

num_equal takes a tolerance arg -> FALSE
3.2. compare_numeric produces
actual != expected but don't know how to show the difference
3.1. num_equal says they don't match even though difference is -2.7e-8 and tol is 1.5e-7. x is 0.105, and so it must be doing relative tolerance. (it divided average absolute diff among those that differ by average y for same, unless average y < tolerance)
x = 0.105487503092834
y = 0.10548753
3. compare_vector
2. compare_by_attr -> empty string

  1. compare_terminate-> empty string
    compare_structure
    waldo::compare
    waldo_compare
    expect_waldo_equal
    expect_equal

The numbers indicate calls made in sequence: compare_structure() first called compare_terminate() which returned an empty string, then compare_by_attr() and finally compare_vector(). The latter first called num_equal() which said the arguments were not equal, and then compare_numeric() which first decided there were differences and then, after processing them, decided there weren't any.

The formatting weirdness between the stack line starting with 2. and the one with 1. is not intentional or significant; I just don't know how to prevent it.

Related Issues

As my note on num_equal() suggests, I found the interpretation of the tolerance surprising, consistent with #188 (at least with with scalar comparison I wasn't exposed to the average difference approach that it apparently uses). At least in waldo it is documented; in testthat the tolerance parameter is named, but absolutely no information on its exact meaning appears.

@hadley
Copy link
Member

hadley commented May 6, 2024

Closing since it's tracked in waldo

@hadley hadley closed this as completed May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants