Hi dear Brain-Score Team!

While thinking about my own experiments, reading papers, and talking to people, I can't shake the feeling that we (as a field) assume we understand ML methods better than we actually do.
I am mostly referring to the fact that we do not control for many lower-level ML specifics that can contribute substantially to differences in final performance: interactions between optimizers, batch sizes, different subsets of the same dataset, operation precision, architecture-specific hyper-parameters, and so on. These are not well understood even within the ML community itself, and many of these choices are left as heuristics. Reproducibility is a known issue in modern ML.
I think Brain-Score is in a position to bring neuroscientists the best practices of the CS community and thus accelerate progress through reproducibility and unification of tools.
One step towards this would be to manage model and dataset versioning with something like Hugging Face.
We would then be able to extend the model cards we provide now (which answer very few questions about what a model actually is) with a link to an HF repository containing the exact weights and hyper-parameters.
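As a rough illustration, here is a minimal sketch of what publishing a submission's exact weights and hyper-parameters to the Hub could look like (the repo id, file names, and paths below are placeholders, not anything Brain-Score currently does):

```python
# Minimal sketch using huggingface_hub; the repo id and artifact names are hypothetical.
from huggingface_hub import HfApi

api = HfApi()  # assumes the user is authenticated, e.g. via `huggingface-cli login`
repo_id = "brain-score/resnet50-imagenet-seed0"  # placeholder repo

api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)

# Upload the exact checkpoint together with a config capturing optimizer,
# batch size, precision, and other training hyper-parameters.
commit = api.upload_folder(
    folder_path="./artifacts",  # e.g. weights.pth, config.yaml, training_log.json
    repo_id=repo_id,
    repo_type="model",
    commit_message="Exact weights and hyper-parameters for a Brain-Score submission",
)
print(commit)  # the resulting commit/revision is what the extended model card would link to
```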
The same is possible for datasets and their subsets.
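For datasets, pinning an exact subset and revision could look something like this (the dataset repo, config name, and commit hash are made up for illustration):

```python
# Sketch: load a benchmark dataset pinned to an exact subset and revision,
# so every model is scored on precisely the same data.
from datasets import load_dataset

stimuli = load_dataset(
    "brain-score/example-stimuli",  # hypothetical dataset repo
    name="public-subset",           # hypothetical subset/config
    revision="a1b2c3d",             # exact commit, so the data version is unambiguous
    split="test",
)
print(stimuli)
```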
Since interactions with HF are standard and well integrated with many libraries, this would also make life easier for those who want to take the best model from Brain-Score and use it in further applications.
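For example, someone could pull the exact weights behind a leaderboard entry with the standard Hub tooling (again, the repo id, file name, and revision are placeholders):

```python
# Sketch: download the exact checkpoint referenced by a Brain-Score model card.
import torch
from huggingface_hub import hf_hub_download

weights_path = hf_hub_download(
    repo_id="brain-score/resnet50-imagenet-seed0",  # hypothetical repo from the model card
    filename="weights.pth",                         # hypothetical file name
    revision="e4f5a6b",                             # pin to the reported revision
)
state_dict = torch.load(weights_path, map_location="cpu")
```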
This way we would not only (1) improve reproducibility and clarity of results, but also (2) collect a lot of additional (meta)data that we could later analyze to find/build the best alignment models, and (3) allow users to search for models with their preferred settings.
narekvslife changed the title from "Feature suggestion:" to "Feature suggestion: model and dataset versioning with HuggingFace" on Jun 17, 2024.
Hi @narekvslife - thanks for opening an issue; this is a great suggestion. We have previously talked about linking models directly to our GitHub (for our "Brain Models", like CORnet-S for example), and to HF for "Base Models", i.e. standard models like resnet-50 or MobileNets. We hope to add this soon as part of a refactoring of the way we store models.