Merge branch 'main' into gmIC
tingshanL authored Jul 12, 2024
2 parents 634aeb1 + dc6c01c commit e2f9a77
Showing 24 changed files with 119 additions and 175 deletions.
@@ -7,10 +7,6 @@ on:
types:
- labeled

# In order to remove the "CUDA CI" label we need to have write permissions for PRs
permissions:
pull-requests: write

jobs:
tests:
if: contains(github.event.pull_request.labels.*.name, 'CUDA CI')
@@ -21,9 +17,6 @@ jobs:
timeout-minutes: 20
name: Run Array API unit tests
steps:
- uses: actions-ecosystem/action-remove-labels@v1
with:
labels: CUDA CI
- uses: actions/setup-python@v5
with:
# XXX: The 3.12.4 release of Python on GitHub Actions is corrupted:
23 changes: 23 additions & 0 deletions .github/workflows/cuda-label-remover.yml
@@ -0,0 +1,23 @@
name: Remove "CUDA CI" Label

# This workflow removes the "CUDA CI" label that triggers the actual
# CUDA CI. It is separate so that we can use the `pull_request_target`
# trigger which has a API token with write access.
on:
pull_request_target:
types:
- labeled

# In order to remove the "CUDA CI" label we need to have write permissions for PRs
permissions:
pull-requests: write

jobs:
label-remover:
if: contains(github.event.pull_request.labels.*.name, 'CUDA CI')
name: Remove "CUDA CI" Label
runs-on: ubuntu-20.04
steps:
- uses: actions-ecosystem/action-remove-labels@v1
with:
labels: CUDA CI
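
Note on the `if:` condition above: it checks whether any label attached to the pull request is named "CUDA CI". The sketch below models that check in Python as an illustration only. `has_label` and the shape of the `event` dictionary are assumptions made for this example (loosely following the pull request webhook payload); they are not part of the workflow. The case-insensitive comparison reflects how the GitHub Actions `contains` function is documented to behave, approximated here with `str.casefold`.

```python
def has_label(event: dict, label: str) -> bool:
    """Return True if the pull request in `event` carries `label`.

    Hypothetical helper that mimics
    `contains(github.event.pull_request.labels.*.name, 'CUDA CI')`.
    """
    # `labels.*.name` collects the `name` field of every label object.
    labels = event.get("pull_request", {}).get("labels", [])
    names = [item.get("name", "") for item in labels]
    # Approximate the case-insensitive matching of `contains`.
    return any(name.casefold() == label.casefold() for name in names)


# Made-up payload for illustration.
example_event = {"pull_request": {"labels": [{"name": "Bug"}, {"name": "CUDA CI"}]}}
print(has_label(example_event, "CUDA CI"))
```

Both workflows in this commit use the same condition: one runs the tests, the other only removes the label.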
14 changes: 7 additions & 7 deletions doc/developers/maintainer.rst
@@ -105,12 +105,12 @@ This PR will be used to push commits related to the release as explained in
:ref:`making_a_release`.

You can also create a second PR from main and targeting main to increment the
``__version__`` variable in `sklearn/__init__.py` and in `pyproject.toml` to increment
the dev version. This means while we're in the release candidate period, the latest
stable is two versions behind the main branch, instead of one. In this PR targeting
main you should also include a new file for the matching version under the
``doc/whats_new/`` folder so PRs that target the next version can contribute their
changelog entries to this file in parallel to the release process.
``__version__`` variable in `sklearn/__init__.py` to increment the dev version.
This means while we're in the release candidate period, the latest stable is
two versions behind the main branch, instead of one. In this PR targeting main
you should also include a new file for the matching version under the
``doc/whats_new/`` folder so PRs that target the next version can contribute
their changelog entries to this file in parallel to the release process.

Minor version release (also known as bug-fix release)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -211,7 +211,7 @@ Making a release
enough) and to update the on-going development entry.

2. On the branch for releasing, update the version number in ``sklearn/__init__.py``,
the ``__version__`` variable, and in `pyproject.toml`.
the ``__version__`` variable.

For major releases, please add a 0 at the end: `0.99.0` instead of `0.99`.

2 changes: 2 additions & 0 deletions doc/modules/array_api.rst
@@ -117,12 +117,14 @@ Metrics
- :func:`sklearn.metrics.d2_tweedie_score`
- :func:`sklearn.metrics.max_error`
- :func:`sklearn.metrics.mean_absolute_error`
- :func:`sklearn.metrics.mean_absolute_percentage_error`
- :func:`sklearn.metrics.mean_gamma_deviance`
- :func:`sklearn.metrics.mean_squared_error`
- :func:`sklearn.metrics.mean_tweedie_deviance`
- :func:`sklearn.metrics.pairwise.additive_chi2_kernel`
- :func:`sklearn.metrics.pairwise.chi2_kernel`
- :func:`sklearn.metrics.pairwise.cosine_similarity`
- :func:`sklearn.metrics.pairwise.cosine_distances`
- :func:`sklearn.metrics.pairwise.euclidean_distances` (see :ref:`device_support_for_float64`)
- :func:`sklearn.metrics.pairwise.paired_cosine_distances`
- :func:`sklearn.metrics.pairwise.rbf_kernel` (see :ref:`device_support_for_float64`)
8 changes: 6 additions & 2 deletions doc/whats_new/v1.6.rst
@@ -37,13 +37,17 @@ See :ref:`array_api` for more details.
- :func:`sklearn.metrics.max_error` :pr:`29212` by :user:`Edoardo Abati <EdAbati>`;
- :func:`sklearn.metrics.mean_absolute_error` :pr:`27736` by :user:`Edoardo Abati <EdAbati>`
and :pr:`29143` by :user:`Tialo <Tialo>` and :user:`Loïc Estève <lesteve>`;
- :func:`sklearn.metrics.mean_gamma_deviance` :pr:`29239` by :usser:`Emily Chen <EmilyXinyi>`;
- :func:`sklearn.metrics.mean_absolute_percentage_error` :pr:`29300` by :user:`Emily Chen <EmilyXinyi>`;
- :func:`sklearn.metrics.mean_gamma_deviance` :pr:`29239` by :user:`Emily Chen <EmilyXinyi>`;
- :func:`sklearn.metrics.mean_squared_error` :pr:`29142` by :user:`Yaroslav Korobko <Tialo>`;
- :func:`sklearn.metrics.mean_tweedie_deviance` :pr:`28106` by :user:`Thomas Li <lithomas1>`;
- :func:`sklearn.metrics.pairwise.additive_chi2_kernel` :pr:`29144` by :user:`Yaroslav Korobko <Tialo>`;
- :func:`sklearn.metrics.pairwise.chi2_kernel` :pr:`29267` by :user:`Yaroslav Korobko <Tialo>`;
- :func:`sklearn.metrics.pairwise.cosine_similarity` :pr:`29014` by :user:`Edoardo Abati <EdAbati>`;
- :func:`sklearn.metrics.pairwise.paired_cosine_distances` :pr:`29112` by :user:`Edoardo Abati <EdAbati>`.
- :func:`sklearn.metrics.pairwise.cosine_distances` :pr:`29265` by :user:`Emily Chen <EmilyXinyi>`;
- :func:`sklearn.metrics.pairwise.euclidean_distances` :pr:`29433` by :user:`Omar Salman <OmarManzoor>`;
- :func:`sklearn.metrics.pairwise.paired_cosine_distances` :pr:`29112` by :user:`Edoardo Abati <EdAbati>`;
- :func:`sklearn.metrics.pairwise.rbf_kernel` :pr:`29433` by :user:`Omar Salman <OmarManzoor>`.

**Classes:**

@@ -66,7 +66,6 @@ def plot_permutation_importance(clf, X, y, ax):

mdi_importances = pd.Series(clf.feature_importances_, index=X_train.columns)
tree_importance_sorted_idx = np.argsort(clf.feature_importances_)
tree_indices = np.arange(0, len(clf.feature_importances_)) + 0.5

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 8))
mdi_importances.sort_values().plot.barh(ax=ax1)
11 changes: 3 additions & 8 deletions examples/linear_model/plot_quantile_regression.py
@@ -109,11 +109,6 @@
#
# We will use the quantiles at 5% and 95% to find the outliers in the training
# sample beyond the central 90% interval.
from sklearn.utils.fixes import parse_version, sp_version

# This is line is to avoid incompatibility if older SciPy version.
# You should use `solver="highs"` with recent version of SciPy.
solver = "highs" if sp_version >= parse_version("1.6.0") else "interior-point"

# %%
from sklearn.linear_model import QuantileRegressor
@@ -122,7 +117,7 @@
predictions = {}
out_bounds_predictions = np.zeros_like(y_true_mean, dtype=np.bool_)
for quantile in quantiles:
qr = QuantileRegressor(quantile=quantile, alpha=0, solver=solver)
qr = QuantileRegressor(quantile=quantile, alpha=0)
y_pred = qr.fit(X, y_normal).predict(X)
predictions[quantile] = y_pred

@@ -184,7 +179,7 @@
predictions = {}
out_bounds_predictions = np.zeros_like(y_true_mean, dtype=np.bool_)
for quantile in quantiles:
qr = QuantileRegressor(quantile=quantile, alpha=0, solver=solver)
qr = QuantileRegressor(quantile=quantile, alpha=0)
y_pred = qr.fit(X, y_pareto).predict(X)
predictions[quantile] = y_pred

@@ -254,7 +249,7 @@
from sklearn.metrics import mean_absolute_error, mean_squared_error

linear_regression = LinearRegression()
quantile_regression = QuantileRegressor(quantile=0.5, alpha=0, solver=solver)
quantile_regression = QuantileRegressor(quantile=0.5, alpha=0)

y_pred_lr = linear_regression.fit(X, y_pareto).predict(X)
y_pred_qr = quantile_regression.fit(X, y_pareto).predict(X)
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "scikit-learn"
version = "1.6.dev0"
dynamic = ["version"]
description = "A set of python modules for machine learning and data mining"
readme = "README.rst"
maintainers = [
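
Note on `dynamic = ["version"]`: with this change the version string is no longer hard-coded in `pyproject.toml`, so the build backend is expected to supply it at build time. This diff does not show how that happens. One plausible approach, assumed here purely for illustration, is to read the `__version__` assignment from the package's `__init__.py`, which would be consistent with the `maintainer.rst` hunks in this commit now mentioning only `sklearn/__init__.py`. `read_version` below is a hypothetical helper, not a scikit-learn function.

```python
import re


def read_version(init_source: str) -> str:
    """Extract the value of `__version__` from the text of an `__init__.py`.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    # Match a line such as: __version__ = "1.6.dev0"
    match = re.search(
        r"^__version__\s*=\s*[\"']([^\"']+)[\"']", init_source, re.MULTILINE
    )
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


# Made-up file contents for illustration.
example_init = '"""Package docstring."""\n__version__ = "1.6.dev0"\n'
print(read_version(example_init))  # prints 1.6.dev0
```

Keeping a single source for the version means a release only has to touch one file.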
17 changes: 8 additions & 9 deletions sklearn/decomposition/_lda.py
@@ -194,15 +194,14 @@ class LatentDirichletAllocation(
In general, if the data size is large, the online update will be much
faster than the batch update.
Valid options::
'batch': Batch variational Bayes method. Use all training data in
each EM update.
Old `components_` will be overwritten in each iteration.
'online': Online variational Bayes method. In each EM update, use
mini-batch of training data to update the ``components_``
variable incrementally. The learning rate is controlled by the
``learning_decay`` and the ``learning_offset`` parameters.
Valid options:
- 'batch': Batch variational Bayes method. Use all training data in each EM
update. Old `components_` will be overwritten in each iteration.
- 'online': Online variational Bayes method. In each EM update, use mini-batch
of training data to update the ``components_`` variable incrementally. The
learning rate is controlled by the ``learning_decay`` and the
``learning_offset`` parameters.
.. versionchanged:: 0.20
The default learning method is now ``"batch"``.
6 changes: 1 addition & 5 deletions sklearn/kernel_approximation.py
@@ -8,13 +8,9 @@

import numpy as np
import scipy.sparse as sp
from scipy.fft import fft, ifft
from scipy.linalg import svd

try:
from scipy.fft import fft, ifft
except ImportError: # scipy < 1.4
from scipy.fftpack import fft, ifft

from .base import (
BaseEstimator,
ClassNamePrefixFeaturesOutMixin,
29 changes: 8 additions & 21 deletions sklearn/linear_model/_quantile.py
@@ -47,7 +47,7 @@ class QuantileRegressor(LinearModel, RegressorMixin, BaseEstimator):
Method used by :func:`scipy.optimize.linprog` to solve the linear
programming formulation.
From `scipy>=1.6.0`, it is recommended to use the highs methods because
It is recommended to use the highs methods because
they are the fastest ones. Solvers "highs-ds", "highs-ipm" and "highs"
support sparse input data and, in fact, always convert to sparse csc.
@@ -100,8 +100,7 @@ class QuantileRegressor(LinearModel, RegressorMixin, BaseEstimator):
>>> X = rng.randn(n_samples, n_features)
>>> # the two following lines are optional in practice
>>> from sklearn.utils.fixes import sp_version, parse_version
>>> solver = "highs" if sp_version >= parse_version("1.6.0") else "interior-point"
>>> reg = QuantileRegressor(quantile=0.8, solver=solver).fit(X, y)
>>> reg = QuantileRegressor(quantile=0.8).fit(X, y)
>>> np.mean(y <= reg.predict(X))
0.8
"""
@@ -180,30 +179,18 @@ def fit(self, X, y, sample_weight=None):
# So we rescale the penalty term, which is equivalent.
alpha = np.sum(sample_weight) * self.alpha

if self.solver in (
"highs-ds",
"highs-ipm",
"highs",
) and sp_version < parse_version("1.6.0"):
if self.solver == "interior-point" and sp_version >= parse_version("1.11.0"):
raise ValueError(
f"Solver {self.solver} is only available "
f"with scipy>=1.6.0, got {sp_version}"
)
else:
solver = self.solver

if solver == "interior-point" and sp_version >= parse_version("1.11.0"):
raise ValueError(
f"Solver {solver} is not anymore available in SciPy >= 1.11.0."
f"Solver {self.solver} is not anymore available in SciPy >= 1.11.0."
)

if sparse.issparse(X) and solver not in ["highs", "highs-ds", "highs-ipm"]:
if sparse.issparse(X) and self.solver not in ["highs", "highs-ds", "highs-ipm"]:
raise ValueError(
f"Solver {self.solver} does not support sparse X. "
"Use solver 'highs' for example."
)
# make default solver more stable
if self.solver_options is None and solver == "interior-point":
if self.solver_options is None and self.solver == "interior-point":
solver_options = {"lstsq": True}
else:
solver_options = self.solver_options
@@ -246,7 +233,7 @@ def fit(self, X, y, sample_weight=None):
c[0] = 0
c[n_params] = 0

if solver in ["highs", "highs-ds", "highs-ipm"]:
if self.solver in ["highs", "highs-ds", "highs-ipm"]:
# Note that highs methods always use a sparse CSC memory layout internally,
# even for optimization problems parametrized using dense numpy arrays.
# Therefore, we work with CSC matrices as early as possible to limit
@@ -271,7 +258,7 @@ def fit(self, X, y, sample_weight=None):
c=c,
A_eq=A_eq,
b_eq=b_eq,
method=solver,
method=self.solver,
options=solver_options,
)
solution = result.x
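
Note on the simplified solver checks in `fit`: with the `scipy>=1.6.0` guard removed, two validations remain in the hunks above: rejecting `"interior-point"` on SciPy 1.11.0 or later, and rejecting non-HiGHS solvers for sparse input. The sketch below restates that logic as a standalone function so it can be read in isolation. `validate_solver` is a hypothetical helper, and modelling the SciPy version as a tuple is an assumption made so the example needs no third-party imports; the actual code compares `sp_version` with `parse_version`.

```python
# Illustrative sketch only: `validate_solver` is not part of scikit-learn.
HIGHS_SOLVERS = ("highs", "highs-ds", "highs-ipm")


def validate_solver(solver, scipy_version, X_is_sparse=False):
    """Apply the two checks that remain in `fit` after this commit.

    `scipy_version` is a tuple such as (1, 11, 0), standing in for `sp_version`.
    """
    # Per the error message in the diff, "interior-point" is unavailable
    # from SciPy 1.11.0 onwards.
    if solver == "interior-point" and scipy_version >= (1, 11, 0):
        raise ValueError(
            f"Solver {solver} is not anymore available in SciPy >= 1.11.0."
        )
    # Only the HiGHS-based solvers accept sparse X.
    if X_is_sparse and solver not in HIGHS_SOLVERS:
        raise ValueError(
            f"Solver {solver} does not support sparse X. "
            "Use solver 'highs' for example."
        )
    return solver


print(validate_solver("highs", (1, 13, 0), X_is_sparse=True))  # prints highs
```

Since the intermediate `solver` variable only existed to carry the result of the removed version guard, the diff can refer to `self.solver` directly everywhere.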