Merge Master to clustering for new changes in GAMA Backend #181

Open
wants to merge 38 commits into base: clustering

prabhant

No description provided.

PGijsbers and others added 30 commits February 5, 2022 09:49
and don't allow errored-individuals into the population.
* Keep full test sets when subsampling for ASHA

To keep results across rungs comparable.
Also allow resources to be specified as fraction of the dataset.
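
As a rough illustration of the ASHA change described above, the sketch below subsamples only the training rows and leaves the test rows untouched, so scores from different rungs are measured on the same data. The function name `training_subsample`, the `resource` parameter, and the toy row lists are hypothetical and are not taken from GAMA's code; the resource may be given either as a row count or as a fraction of the dataset.

```python
import random


def training_subsample(train_rows, resource, seed=0):
    """Return a random subsample of the training rows (hypothetical helper).

    `resource` is either an int (number of rows) or a float in (0, 1]
    (fraction of the training set).
    """
    if isinstance(resource, float):
        if not 0.0 < resource <= 1.0:
            raise ValueError("a fractional resource must lie in (0, 1]")
        n_rows = max(1, round(resource * len(train_rows)))
    else:
        n_rows = min(resource, len(train_rows))
    return random.Random(seed).sample(train_rows, n_rows)


train_rows = list(range(100))      # stand-in for training data
test_rows = list(range(100, 120))  # never subsampled

# Each rung trains on more data but is always scored on all of `test_rows`.
rung_sizes = [len(training_subsample(train_rows, r)) for r in (0.1, 0.5, 1.0)]
```

Because only the training side shrinks, a score obtained at a low rung is measured on the same test rows as one obtained at a higher rung.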

* Add ASHA and AsyncEA changes, start new header
* Decrease job queue

* Decrease job count by 1 when killing process

Because the individual is already taken from the input queue.
Otherwise this would lead to a number of "ghost jobs" that are assumed
to be evaluated even though the process was killed.
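
The bookkeeping problem behind the "ghost jobs" can be sketched as follows. This is a minimal illustration under assumptions, not GAMA's implementation: the class `JobCounter` and its method names are hypothetical. The point is that a killed worker has already removed its individual from the input queue, so no result will ever arrive for that job and the counter has to be decreased explicitly.

```python
class JobCounter:
    """Tracks how many submitted jobs are still expected to return a result."""

    def __init__(self):
        self.outstanding = 0

    def submit(self):
        # A new individual is placed on the input queue.
        self.outstanding += 1

    def result_received(self):
        # A worker finished an evaluation and reported back.
        self.outstanding -= 1

    def worker_killed(self, had_taken_job=True):
        # The killed worker already took its individual from the queue,
        # so that job will never report back. Without this decrement it
        # would stay counted forever as a "ghost job".
        if had_taken_job:
            self.outstanding -= 1


counter = JobCounter()
for _ in range(3):
    counter.submit()
counter.result_received()  # one job finishes normally
counter.worker_killed()    # one worker is killed mid-evaluation
remaining = counter.outstanding
```
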
* Fix lookup of scorer name

* Update versions, make 3.10 compatible

* Add pyproject.toml instead of setup.py
* Workflow for unit test
* Increase leniency of stopwatch-related tests due to MacOS issue

On MacOS CI, the stopwatch is consistently off by ~0.1 seconds. Since the stopwatch is only used for recording time, typically in the span of whole seconds and minutes, enforcing 'high' precision is not very important, so I would rather make the test more lenient than try to fix the bug.
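
A more lenient timing check of the kind described above might look like the sketch below. The helper name `elapsed_close_to` and the 0.2 second tolerance are assumptions chosen for illustration; the actual tolerance used in GAMA's tests is not stated here.

```python
import math


def elapsed_close_to(measured, expected, tolerance=0.2):
    """Return True if a measured duration is within `tolerance` seconds of the expected one."""
    return math.isclose(measured, expected, abs_tol=tolerance)


# A reading that is off by 0.1 seconds passes, a reading off by 0.5 seconds does not.
slightly_off = elapsed_close_to(1.1, 1.0)
far_off = elapsed_close_to(1.5, 1.0)
```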

* Run CLI tests without subprocess
* Update lower bound for scikit-learn to 1.1
* Add codecov upload to pipeline
* Numpy types changed in 1.20
* Remove travis CI configuration
* Update pre-commit configuration
* Black formatting
* Fix Flake8 warnings
* Fix mypy issues
* Fix code export for new scikit-learn

* Show warnings, but don't error, ignore scipy warning

The scipy warning is caused by scikit-learn internal usage of scipy.
See scikit-learn/scikit-learn#23633

* Explicitly add whiten to avoid deprecation warning

* Cast array to list to avoid ambiguous comparison

The previous statement was ambiguous, as the `not in` operation could also
be interpreted as an element-wise comparison.
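
The ambiguity can be reproduced with a small generic example. The original GAMA statement is not shown here, so the arrays below are made up for illustration. Membership tests on a NumPy array compare element-wise and then reduce with `any()`, which can differ from the whole-item comparison that a plain list performs.

```python
import numpy as np

rows = np.array([[1, 2], [3, 4]])
candidate = [1, 5]  # not a row of `rows`

# ndarray membership behaves like (rows == candidate).any():
# the single matching element 1 is enough to make this True.
in_array = candidate in rows

# After casting to a list, membership compares whole rows, so this is False.
in_list = candidate in rows.tolist()
```

Casting to a list first makes `in` and `not in` answer the question "is this exact item present", which is usually what the surrounding code intends.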

* Allow to ignore terminals in search space for Individual.from_string

This allows you to reconstruct an individual if additional hyperparameters
have been added to the search space.

* Add test for code export
Failing to remove the process will result in an infinite loop.
Requires publishing from a tagged commit that explicitly matches the workflow dispatch input and the version in gama/__version__.py.
Boston to Diabetes, system tests.
We don't want to automatically publish to PyPI.
* Move tool configurations together

* Removed unused imports, pass ruff linter

* Remove the GAMA Dashboard

* Bump black

* Bump mypy

* Replace flake8 with ruff

* Move mypy configuration to pyproject.toml

* Remove optional requirements for Dashboard

* Bump pre-commit

* Fix an issue introduced by the new eps penalty in sklearn 1.2

The default value changed from 1e-15 to "auto" that is equivalent to np.finfo(y_pred.dtype).eps.
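
To see why the changed default can alter scores, the sketch below computes a clipped binary log loss by hand for a confidently wrong prediction. It is a simplified stand-in, not scikit-learn's implementation, and it assumes `y_pred` is float64, for which `np.finfo(y_pred.dtype).eps` equals `sys.float_info.epsilon`.

```python
import math
import sys


def clipped_log_loss(y_true, p_pred, eps):
    """Binary log loss with predicted probabilities clipped to [eps, 1 - eps]."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


y_true = [1]
p_pred = [0.0]  # confidently wrong, so the clipping value decides the loss

old_loss = clipped_log_loss(y_true, p_pred, eps=1e-15)
new_loss = clipped_log_loss(y_true, p_pred, eps=sys.float_info.epsilon)
```

With the smaller machine epsilon the worst-case penalty is larger, so code or tests that relied on the penalty produced by `1e-15` may see different values.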

* Explicitly add datetime format for parsing from log

* Load data as pandas dataframe

Because some pixels were inferred as categorical.
See also #193
* Rename `config` hyperparameter to `search_space`

* Add to_code stub

* Simplify expressions in if-condition

* Minor refactoring

* Simplify conditional logic

* Refactor conditional logic, generators and other minor details
@simonprovost

@PGijsbers It would be fantastic to look into this one as well, following #210! I will keep that in mind, and if I have a day or so, I will see if we can create a brand-new PR with the new additions so that Classification, Regression, and Clustering will all be available with ConfigSpace ☀️ That said, Clustering will not help my Ph.D. in any case, so I will have to look into it in my spare time.

@PGijsbers
Member

I think this branch has already diverged from main quite substantially. It's likely easier and better to look into a re-implementation rather than cleaning this up (especially after #210 is merged). Besides, clustering has a number of difficulties with the AutoML paradigm that GAMA uses: the internal metrics (i.e., ones that don't use labels) don't transfer that well to performance on external metrics (i.e., ones that do, and thus can evaluate performance based on ground truth). I am not entirely sure if it makes sense to integrate clustering at this point, which is one of the reasons I (and Prabhant) haven't put real effort behind merging this PR.
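
The gap between internal and external metrics can be shown on a toy example. The sketch below is only an illustration: the data is made up, the silhouette coefficient stands in for internal metrics, and the plain Rand index stands in for external ones. None of this is taken from GAMA.

```python
from itertools import combinations
from math import dist


def silhouette(points, labels):
    """Mean silhouette coefficient: an internal metric, it uses no ground truth."""
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself.
        mean_dist = {}
        for label in set(labels):
            members = [q for j, q in enumerate(points) if labels[j] == label and j != i]
            if members:
                mean_dist[label] = sum(dist(p, q) for q in members) / len(members)
        a = mean_dist.get(labels[i])
        others = [d for label, d in mean_dist.items() if label != labels[i]]
        if a is None or not others:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        b = min(others)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def rand_index(truth, predicted):
    """Fraction of point pairs on which two labelings agree: an external metric."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum((truth[i] == truth[j]) == (predicted[i] == predicted[j]) for i, j in pairs)
    return agree / len(pairs)


# Two tight groups along x, while the (made-up) ground truth follows y.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 0), (10, 1), (11, 0), (11, 1)]
geometric = [0, 0, 0, 0, 1, 1, 1, 1]
truth = [0, 1, 0, 1, 0, 1, 0, 1]

sil_geometric = silhouette(points, geometric)
sil_truth = silhouette(points, truth)
ri_geometric = rand_index(truth, geometric)
```

Here the geometric clustering scores far better than the ground-truth labeling on the silhouette, yet it agrees with the ground truth on fewer than half of the point pairs. A search that optimizes only the internal metric would therefore prefer a solution that performs poorly on the external one.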

@simonprovost

Indeed! It makes sense now. I will not focus on this then. Still, when you have time, maybe put a label on the PR so that future contributors do not pick it up ^^

Have a great day,

Cheers,

dependabot bot and others added 4 commits March 21, 2024 14:10
Bumps [black](https://github.com/psf/black) from 23.3.0 to 24.3.0.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](psf/black@23.3.0...24.3.0)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Update version grep to include post release

* Update project location url
…218)

Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 3 to 4.1.7.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@v3...v4.1.7)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 participants