Fixes for tests on nightly builds of numpy/scipy/pandas #6897

janezd · 2024-09-20T19:12:09Z

Issue

Tests fail against the Scientific Python nightly wheels.

Description of changes

Cython code in Relief: Import NAN and INFTY from libc.math instead of from numpy.math. This caused an error in cythonizing.
Importing pytz: In test_pandas, do not import pytz. We do not (directly) depend on it, and pandas removes it in 3.0.
Indices in sparse matrices: The index-out-of-range error is a bug in scipy, fixed in scipy/scipy#21616 (BUG: sparse: fix indexing after ellipsis and 2D array indexing).
Different result in nomograms: scipy changed the optimizer that is used for fitting logistic regression. @markotoplak fixed this by increasing the number of iterations.
Time resolution in pandas: Pandas can now import time in 1 s resolution, where it previously used 1 ns, and the resulting type is <M8[s] instead of <M8[ns]. We convert this to floats containing seconds from epoch, so it doesn't affect us. A test failed just because it expected to see <M8[ns], so I changed the test.
Subclassing in pandas (#6902, OrangeDataFrame.sparse.todense will stop working in pandas 3.0): The failure is caused by an old bug in pandas, now fixed in pandas-dev/pandas#59967 (BUG: Fix SparseFrameAccessor.to_dense return type). Our workaround stopped working in pandas 3. This PR defines OrangeDataFrame correctly and, for pandas < 3, dynamically patches the bug in pandas. The patch can be removed once we raise the pandas requirement to 3.0.
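The "patch only on versions that still have the bug" idea from the last item can be sketched as follows. This is a minimal illustration, not the code in this PR: Frame, SparseAccessor, MyFrame, major and patch_to_dense are hypothetical stand-ins, and no pandas or Orange API is used.

```python
# Hypothetical stand-ins: none of these names are pandas or Orange APIs.
class Frame:
    """Plays the role of the library's base data frame."""

    def __init__(self, data):
        self.data = list(data)

    @property
    def sparse(self):
        return SparseAccessor(self)


class SparseAccessor:
    """Plays the role of the library's accessor, including its bug."""

    def __init__(self, parent):
        self._parent = parent

    def to_dense(self):
        # Bug: hard-codes the base class, so subclasses lose their type.
        return Frame(self._parent.data)


class MyFrame(Frame):
    """Plays the role of a downstream subclass."""


def major(version):
    """Return the leading integer of a version string such as '2.2.3'."""
    return int(version.split(".")[0])


def patch_to_dense(library_version, fixed_in=3):
    """Replace the buggy method only on versions that still have the bug."""
    if major(library_version) >= fixed_in:
        return False  # upstream is already fixed, leave it alone

    def to_dense(self):
        # Build the result with the caller's own class.
        return type(self._parent)(self._parent.data)

    SparseAccessor.to_dense = to_dense
    return True
```

Calling patch_to_dense("2.2.3") swaps in the corrected method, while patch_to_dense("3.0.0") is a no-op. Gating on the version keeps the patch easy to delete once the minimum requirement is raised.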

To be fixed elsewhere

Variable type guessing in Group By: The Group By test fails because pandas can now parse more date formats. This is a problem in Group By's design and is being fixed in #6906 (GroupBy: Avoid guessing variable types).
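Why guessing is fragile can be shown with a small standard-library sketch. It is only an illustration under assumptions: guess_type and the format lists are made up for this example, and it is neither Group By's code nor pandas' parser.

```python
from datetime import datetime


def guess_type(values, date_formats):
    """Call a column 'time' if every value parses with some known format."""

    def parses(text):
        for fmt in date_formats:
            try:
                datetime.strptime(text, fmt)
                return True
            except ValueError:
                continue
        return False

    return "time" if all(parses(v) for v in values) else "text"


column = ["2024-09", "2024-10"]
older_parser = ["%Y-%m-%d"]           # hypothetical: fewer formats known
newer_parser = ["%Y-%m-%d", "%Y-%m"]  # hypothetical: one more format known
```

Here guess_type(column, older_parser) and guess_type(column, newer_parser) disagree on the same data, so any test that relies on the guessed type changes its outcome when the parser learns a new format.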

codecov · 2024-09-23T07:57:28Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.20%. Comparing base (eed39ef) to head (85638d9).
Report is 10 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6897      +/-   ##
==========================================
- Coverage   88.21%   88.20%   -0.01%     
==========================================
  Files         326      326              
  Lines       71264    71266       +2     
==========================================
  Hits        62863    62863              
- Misses       8401     8403       +2

markotoplak · 2024-10-04T08:59:41Z

In test_ownomogram.py, self._test_helper(self.lr_cls, [61, 39]) gives AssertionError: 'Probability: 61' not found in '<html><head> (...) <br/>Probability: 59%</body></html>\n'. Did logistic regression change, and is there nothing we can do about it?

The optimizer in scipy changed. The correct final result is still the same. The problem is that Logistic Regression has a fairly small default max_iter, so the new and the old optimizer stopped at different solutions before converging. Increasing the number of iterations makes the test pass with both.
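The effect can be reproduced on a toy problem. The sketch below is an assumption-laden illustration: it uses plain gradient descent with two different step sizes as stand-ins for "old" and "new" optimizers, on invented data, and it is neither scipy's optimizer nor the Logistic Regression learner used in the test.

```python
import math

# Toy data, invented for this sketch: not separable, so a finite optimum exists.
X = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
Y = [0, 0, 1, 0, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit(step, max_iter):
    """Plain gradient descent on the mean log-loss of a 1-D logistic model."""
    w = b = 0.0
    n = len(X)
    for _ in range(max_iter):
        errors = [sigmoid(w * x + b) - y for x, y in zip(X, Y)]
        w -= step * sum(e * x for e, x in zip(errors, X)) / n
        b -= step * sum(errors) / n
    return w, b


def probability(model, x):
    """Predicted probability of class 1 at x for a fitted (w, b) pair."""
    w, b = model
    return sigmoid(w * x + b)
```

With a cap of 10 iterations, probability(fit(0.1, 10), 1.5) and probability(fit(1.0, 10), 1.5) differ noticeably because both runs are cut off before convergence. With fit(0.1, 5000) and fit(1.0, 5000) the two probabilities agree, since the loss is convex and both runs reach the same optimum.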

janezd force-pushed the fix-scientific-build branch from 26eaa93 to ed02e75 Compare September 23, 2024 07:41

janezd force-pushed the fix-scientific-build branch from 8347967 to 696c942 Compare September 23, 2024 11:52

janezd changed the title ~~Relief: Fix import from numpy.math~~ Fixes for latest versions of numpy/scipy/pandas Sep 23, 2024

janezd changed the title ~~Fixes for latest versions of numpy/scipy/pandas~~ Fixes for tests on nightly builds of numpy/scipy/pandas Sep 23, 2024

janezd force-pushed the fix-scientific-build branch 4 times, most recently from 973111f to f93e022 Compare October 3, 2024 07:39

janezd added the needs discussion label Oct 3, 2024

janezd removed the needs discussion label Oct 4, 2024

janezd mentioned this pull request Oct 4, 2024

OrangeDataFrame.sparse.todense will stop working in pandas 3.0 #6902

Closed

janezd force-pushed the fix-scientific-build branch from 066088a to ec7fa0a Compare October 4, 2024 15:11

janezd added 2 commits October 4, 2024 17:15

Relief: Fix import from numpy.math

44ef76b

test_pandas: Don't import pytz for newer pandas

1d0ff26

janezd force-pushed the fix-scientific-build branch from ec7fa0a to c57931c Compare October 4, 2024 15:15

janezd and others added 4 commits October 6, 2024 10:25

OrangeDataFrame: Fix patched constructor

cb482b5

This is a proper solution that worked before and would continue to work in pandas>=3, were it not for a bug in pandas (pandas-dev/pandas#59913). Hence, this commit also (dynamically) patches the bug in pandas.

_convert_datetime: Resolve pandas FutureWarning

8f5f6aa

CSVImport tests: allow different time resolution (on pandas 3)

559f666

fix test_nomogram_lr for development scikit

85638d9

janezd force-pushed the fix-scientific-build branch from c57931c to 85638d9 Compare October 6, 2024 08:26

markotoplak merged commit b0acfd2 into biolab:master Oct 9, 2024
27 of 31 checks passed

janezd mentioned this pull request Oct 10, 2024

Update translations #6912

Merged
