Adding public URLs for tabular benchmark #121

Open
wants to merge 102 commits into
base: development
b2155dd
Adding sample RF space for tabular collection design
Neeratyoy Jun 23, 2021
ce405e6
Placeholder SVM benchmark to interface tabular data collection
Neeratyoy Jun 23, 2021
2ef3af8
Writing common ML benchmark class for tabular collection
Neeratyoy Jun 24, 2021
61b6963
Adding placeholder for HistGradientBoostedClassifier
Neeratyoy Jun 24, 2021
a5d0217
Minor code cleaning
Neeratyoy Jun 24, 2021
3def203
Reformatting output dict + option to add more metrics
Neeratyoy Jun 26, 2021
750cc7d
Removing redundant import
Neeratyoy Jun 28, 2021
e7665e6
Decoupling storage of costs for each metric
Neeratyoy Jun 30, 2021
47fe4cd
Including test scores in objective
Neeratyoy Jul 1, 2021
2d085ec
Documenting the structure of information in each fn eval.
Neeratyoy Jul 1, 2021
2da9d5c
Some decisions on lower bound for subsample fidelity
Neeratyoy Jul 2, 2021
751d2e9
AbstractBenchmark update for fidelity option + including XGBoost
Neeratyoy Jul 6, 2021
3f84afb
Adding sample RF space for tabular collection design
Neeratyoy Jun 23, 2021
09b296a
Placeholder SVM benchmark to interface tabular data collection
Neeratyoy Jun 23, 2021
af4f593
Writing common ML benchmark class for tabular collection
Neeratyoy Jun 24, 2021
df2462d
Adding placeholder for HistGradientBoostedClassifier
Neeratyoy Jun 24, 2021
4d1d2d6
Minor code cleaning
Neeratyoy Jun 24, 2021
299e592
Reformatting output dict + option to add more metrics
Neeratyoy Jun 26, 2021
c46321d
Removing redundant import
Neeratyoy Jun 28, 2021
17f6634
Decoupling storage of costs for each metric
Neeratyoy Jun 30, 2021
7de891f
Including test scores in objective
Neeratyoy Jul 1, 2021
ec316c3
Documenting the structure of information in each fn eval.
Neeratyoy Jul 1, 2021
e7f69b9
Some decisions on lower bound for subsample fidelity
Neeratyoy Jul 2, 2021
edb3e7f
AbstractBenchmark update for fidelity option + including XGBoost
Neeratyoy Jul 6, 2021
642027b
Merge branch 'thesis-paper' of https://github.com/Neeratyoy/HPOBench …
Neeratyoy Jul 7, 2021
9e907e6
Option to load data splits from disk
Neeratyoy Jul 8, 2021
f0d4f36
Reordering data load to work for different cases
Neeratyoy Jul 12, 2021
dbeae7c
Updating source of SVM HP range
Neeratyoy Jul 14, 2021
f277a2e
Adding Tabular Benchmark class
Neeratyoy Jul 14, 2021
60d5646
Adding TabularBenchmark interface + easy import
Neeratyoy Jul 15, 2021
c4100fd
Adding LR space
Neeratyoy Jul 16, 2021
9c6dcdb
Standardizing fidelity space definitions
Neeratyoy Jul 19, 2021
74b6919
Standardizing HPs + Adding NN space
Neeratyoy Jul 19, 2021
785055e
Small placeholder for testing
Neeratyoy Jul 19, 2021
0159a35
Updating NN HP space + Helper function for TabularBenchmark
Neeratyoy Jul 20, 2021
e9e097a
Adding fidelity range retrieval utility to TabularBenchmark
Neeratyoy Jul 20, 2021
4797109
Enforcing subsample lower bound check inside objective
Neeratyoy Jul 21, 2021
dbb7327
Bug fix + adding precision as metric
Neeratyoy Jul 21, 2021
7d5ca57
Fixing param spaces and model building for LR, SVM
Neeratyoy Jul 22, 2021
a6d94bb
TabularBenchmark edit to read compressed files and query a dataframe
Neeratyoy Jul 26, 2021
93b6908
Not evaluating training set to save time
Neeratyoy Jul 27, 2021
8164eb0
Fidelity change for trees + NN space change
Neeratyoy Jul 27, 2021
6916c9c
Final RF space
Neeratyoy Jul 29, 2021
8e5912b
Final XGB space
Neeratyoy Jul 29, 2021
6968ac3
Final HistGB space
Neeratyoy Jul 30, 2021
79dd1f3
Finalizing RF, XGB, NN
Neeratyoy Aug 2, 2021
ca1e0d4
TabularBenchmark edit to process only table and metadata
Neeratyoy Aug 2, 2021
87133ed
Merge remote-tracking branch 'origin/development' into PR_Multi-fidel…
PhMueller Aug 4, 2021
6096204
Merge remote-tracking branch 'origin/development' into PR_Multi-fidel…
PhMueller Aug 4, 2021
0d70d36
TabularBenchmark
PhMueller Aug 11, 2021
12ebce8
Pycodestyle
PhMueller Aug 11, 2021
873781e
Flake8
PhMueller Aug 11, 2021
532a905
Adapt ML Benchmark Template to fit with current API
PhMueller Aug 11, 2021
9dbd61c
Correct Datamanager.
PhMueller Aug 11, 2021
0304146
Finalize HistGB Benchmarks
PhMueller Aug 11, 2021
3e95d19
Write OpenML Datamanager
PhMueller Aug 16, 2021
f3fbd58
Unify interface for the other ml benchmarks.
PhMueller Aug 16, 2021
e57fbcb
Flake + Pep
PhMueller Aug 16, 2021
f6131ea
Add Container Interface
PhMueller Aug 16, 2021
36bc391
Mark `task_id` as required.
PhMueller Aug 16, 2021
a5c7d62
Adapt Interfaces
PhMueller Aug 16, 2021
c5f6979
Fix minor errors.
PhMueller Aug 16, 2021
48af58d
Fix minor errors.
PhMueller Aug 16, 2021
cf24488
Pylint
PhMueller Aug 16, 2021
528dde1
Init Model can handle now Configurations
PhMueller Aug 17, 2021
6bdf5c0
PR Requests: Rename Classes
PhMueller Aug 17, 2021
b8b30a5
PR Requests: Move dependencies to correct directory
PhMueller Aug 17, 2021
875c594
PR Requests: Tabular Benchmarks - Remove unnecessary class definition
PhMueller Aug 17, 2021
8891e33
PR Requests: Minor improvements
PhMueller Aug 17, 2021
75f345d
PR Requests: Update upper bounds of the fidelities
PhMueller Aug 17, 2021
8c2ab6c
PR Requests: Remove OriginalTabBenchmarks
PhMueller Aug 17, 2021
e24d537
PR Requests: Revert the query function
PhMueller Aug 17, 2021
3c4f375
PR Requests: Minor improvements
PhMueller Aug 17, 2021
6fc7f57
Pycodestyle
PhMueller Aug 17, 2021
0430c68
Add missing requirements
PhMueller Aug 17, 2021
3eb3a2d
Minor Improvements
PhMueller Aug 17, 2021
fa691f7
ADD container recipes
PhMueller Aug 17, 2021
f64917e
PR: Fix path in tabular data loader
PhMueller Aug 19, 2021
b95d2a5
PR: Remove casting configspace to np.floats
PhMueller Aug 19, 2021
d7d7a2d
PR: Move everything back from ml_mmfb/ to ml/
PhMueller Aug 19, 2021
be641f8
PR: Remove pybnn from the init.
PhMueller Aug 19, 2021
7bc25bc
PR: Cleanup
PhMueller Aug 19, 2021
b0d9b7f
PR: Fix Tests
PhMueller Aug 19, 2021
59bd905
Adding public URLs for tabular benchmark
Neeratyoy Aug 19, 2021
6576e99
Merge branch 'PR_Multi-fidelity-tabular-benchmarks' into add-tabular-…
Neeratyoy Aug 19, 2021
f576fb3
Adding more models
Neeratyoy Aug 19, 2021
63f5177
Updating figshare URLs with new public ones
Neeratyoy Aug 20, 2021
5335831
PR Fix URLs and dependencies
PhMueller Aug 20, 2021
cf9b4ef
Updating URL for SVM data
Neeratyoy Aug 21, 2021
ed7d23e
Updating Tabular bench URLs
Neeratyoy Aug 23, 2021
9181bbb
PR Fix URLs and dependencies
PhMueller Aug 25, 2021
451ff08
PR Fix URLs and dependencies
PhMueller Aug 25, 2021
310b11e
Updating RF benchmark URL
Neeratyoy Aug 26, 2021
f01286b
Updating XGB URL
Neeratyoy Aug 26, 2021
12b72b1
PR Fix tests
PhMueller Aug 27, 2021
41aa96b
New Urls
PhMueller Aug 27, 2021
c23e354
Trigger Rebuild.
PhMueller Aug 30, 2021
1fa684c
Fix Dataloader Assertion
PhMueller Aug 30, 2021
5f015a6
Merge branch 'PR_Multi-fidelity-tabular-benchmarks' into add-tabular-…
Neeratyoy Aug 31, 2021
11c57bf
Merge branch 'development' into add-tabular-urls
Neeratyoy Sep 1, 2021
8d7ea97
Merge branch 'development' into add-tabular-urls
Neeratyoy Oct 6, 2021
6394521
inference cost key fix
Neeratyoy Oct 6, 2021
2 changes: 1 addition & 1 deletion hpobench/benchmarks/ml/tabular_benchmark.py
@@ -163,7 +163,7 @@ def _objective(
         metric_str = ', '.join(list(metrics.keys()))
         assert metric in list(metrics.keys()), f"metric not found among: {metric_str}"
         score_key = f"{evaluation}_scores"
-        cost_key = f"{evaluation}_scores"
+        cost_key = f"{evaluation}_costs"

         key_path = dict()
         for name in self.configuration_space.get_hyperparameter_names():
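For reviewers, here is a minimal sketch of why the one-character-class typo in `cost_key` mattered. The record layout below (`val_scores` / `val_costs` dictionaries keyed by metric name) and the helper `lookup` are assumptions made for illustration; the actual table schema in `tabular_benchmark.py` may differ.

```python
# Hypothetical record for one configuration; the real schema may differ.
record = {
    "val_scores": {"acc": 0.91},   # metric values
    "val_costs": {"acc": 0.42},    # inference cost (seconds) per metric
}


def lookup(record: dict, evaluation: str = "val", metric: str = "acc"):
    """Return (score, cost) for the given evaluation split and metric."""
    score_key = f"{evaluation}_scores"
    # Before the fix this was f"{evaluation}_scores", so the "cost"
    # returned was silently the score itself.
    cost_key = f"{evaluation}_costs"
    return record[score_key][metric], record[cost_key][metric]


score, cost = lookup(record)
print(score, cost)
```

With the old key, both values would have come from the `*_scores` entry, so no exception would have flagged the mistake.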
24 changes: 13 additions & 11 deletions hpobench/util/data_manager.py
@@ -41,6 +41,15 @@
 import hpobench


+
+tabular_multi_fidelity_urls = dict(
+    xgb="https://figshare.com/ndownloader/files/30469920",
+    svm="https://figshare.com/ndownloader/files/30379359",
+    lr="https://figshare.com/ndownloader/files/30379038",
+    rf="https://figshare.com/ndownloader/files/30469089",
+    nn="https://figshare.com/ndownloader/files/30379005"
+)
+
 class DataManager(abc.ABC, metaclass=abc.ABCMeta):
     """ Base Class for loading and managing the data.

@@ -929,21 +938,14 @@ def _load(self) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray, np.ndar
 class TabularDataManager(DataManager):
     def __init__(self, model: str, task_id: [int, str], data_dir: [str, Path, None] = None):
         super(TabularDataManager, self).__init__()
-
-        self.model = model
-        self.task_id = str(task_id)
-
-        url_dict = dict(
-            xgb="https://ndownloader.figshare.com/files/30469920",
-            svm="https://ndownloader.figshare.com/files/30379359",
-            lr="https://ndownloader.figshare.com/files/30379038",
-            rf="https://ndownloader.figshare.com/files/30469089",
-            nn="https://ndownloader.figshare.com/files/30379005"
-        )

+        url_dict = tabular_multi_fidelity_urls
         assert model in url_dict.keys(), \
             f'Model has to be one of {list(url_dict.keys())} but was {model}'

+        self.model = model
+        self.task_id = str(task_id)
+
         self.url_to_use = url_dict.get(model)

         if data_dir is None:
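A possible follow-up, sketched here under assumptions: since the URL table now lives at module level, the model check could be a small standalone function that raises `ValueError` rather than relying on `assert`, which Python skips when run with `-O`. The function name `resolve_tabular_url` is hypothetical and not part of this PR; the URLs are copied from the diff above.

```python
# URL table as introduced at module level in this PR.
tabular_multi_fidelity_urls = dict(
    xgb="https://figshare.com/ndownloader/files/30469920",
    svm="https://figshare.com/ndownloader/files/30379359",
    lr="https://figshare.com/ndownloader/files/30379038",
    rf="https://figshare.com/ndownloader/files/30469089",
    nn="https://figshare.com/ndownloader/files/30379005",
)


def resolve_tabular_url(model: str) -> str:
    """Hypothetical helper: map a model name to its download URL."""
    if model not in tabular_multi_fidelity_urls:
        valid = list(tabular_multi_fidelity_urls.keys())
        # Explicit exception, so the check still runs under `python -O`.
        raise ValueError(f"Model has to be one of {valid} but was {model}")
    return tabular_multi_fidelity_urls[model]


print(resolve_tabular_url("rf"))
```

This keeps the error message from the existing assertion while making the validation reusable outside `TabularDataManager.__init__`.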