
Feature suggestion: model and dataset versioning with HuggingFace #121

Open
narekvslife opened this issue Jun 17, 2024 · 1 comment
@narekvslife

Hi dear Brain-Score Team!

While thinking about my own experiments, reading papers, and talking to people, I can't shake the feeling that we (as a field) assume we understand ML methods better than we actually do.

I am mostly referring to our not controlling for many of the lower-level ML specifics that can contribute substantially to differences in final performance: interactions between optimizers, batch sizes, different subsets of the same dataset, operation precision, architecture-specific hyper-parameters, and so on. These are not well understood by the ML community itself, and many of these choices are left to heuristics. Reproducibility is a known issue in modern ML.

I think Brain-Score is in a position to bring neuroscientists the best practices from the CS community and thus accelerate progress through reproducibility and unified tooling.

One step towards this would be to manage model and dataset versioning with something like HuggingFace, for example.

  • We would be able to extend the model cards we provide now (which currently answer very few questions about what a model actually is) with a link to an HF repo containing the exact weights and hyper-parameters.
  • The same is possible for datasets and their subsets.
  • Since interactions with HF are standardized and well integrated with many libraries, this would also make life easier for those who want to take the best models from Brain-Score and use them in further applications.

This way we would not only (1) improve the reproducibility and clarity of results, but also (2) collect a lot of additional (meta)data that could later be analyzed to find/build the best alignment models, and (3) let users search for models with their preferred settings.
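A minimal sketch of what such an extended model-card entry could look like. Everything here is hypothetical (the `ModelCard` class, the repo id, and the commit hash are illustrative, not part of Brain-Score's actual code); the point is simply that pinning an exact Hub revision, rather than a moving branch or tag, is what makes the entry reproducible:

```python
# Hypothetical sketch: an extended model-card entry that pins a model to an
# exact HuggingFace Hub repo revision (commit hash) so results are reproducible.
from dataclasses import dataclass, field, asdict

@dataclass
class ModelCard:
    name: str
    hf_repo: str                 # e.g. "org/model-name" on the HuggingFace Hub
    revision: str                # an exact commit hash, not a branch or tag
    hyperparameters: dict = field(default_factory=dict)

    def hub_url(self) -> str:
        # Permalink to the exact file tree (weights, config) used for scoring.
        return f"https://huggingface.co/{self.hf_repo}/tree/{self.revision}"

# Illustrative entry; "e1f2a3b" is a placeholder commit hash.
card = ModelCard(
    name="resnet-50",
    hf_repo="microsoft/resnet-50",
    revision="e1f2a3b",
    hyperparameters={"batch_size": 256, "optimizer": "sgd"},
)
```

The searchable metadata for point (3) falls out for free: `asdict(card)` yields a plain dict that could be indexed and filtered by hyper-parameter settings.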

@narekvslife narekvslife changed the title Feature suggestion: Feature suggestion: model and dataset versioning with HuggingFace Jun 17, 2024
@mike-ferguson
Member

Hi @narekvslife - thanks for opening an issue; this is a great suggestion. We have previously talked about linking models directly to our GitHub (for our "Brain Models", like CORnet-S) and to HF for "Base Models", i.e. standard models like ResNet-50 or MobileNets. We hope to add this soon as part of a refactor of the way we store models.
