Add benchmarks for the main function [Feature] #105

Closed
sfmig opened this issue Jul 20, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@sfmig
Contributor

sfmig commented Jul 20, 2023

Add a benchmark for the main function (detect+classify)

To do this, try fetching larger data (larger than the test data in the repo, but smaller than a fully realistic dataset), increasing the size until the benchmarking time is reasonable:

  • increase the size of the data in benchmarks gradually
  • use GIN and fetch with pooch - this PR may be a good example to follow
  • the smallest realistic dataset is 100 GB
  • typically 10 mins?
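
A rough sketch of how the "increase gradually" step could look, assuming the data is fetched with `pooch.retrieve` and that a callable `run(size)` wraps the detect+classify call on a dataset of the given size. The helper names, the cache name `cellfinder-benchmarks` and the `timer` parameter are hypothetical and only for illustration; this is not the project's actual benchmark code.

```python
import time
from typing import Callable, Iterable, Optional


def fetch_with_pooch(url: str, known_hash: Optional[str]) -> str:
    """Download one file (or reuse the cached copy) and return its local path."""
    import pooch  # imported lazily so the timing helper below works without it

    return pooch.retrieve(
        url=url,  # e.g. a GIN download URL (not specified in this issue)
        known_hash=known_hash,  # None skips the integrity check
        path=pooch.os_cache("cellfinder-benchmarks"),  # hypothetical cache name
    )


def largest_size_within_budget(
    run: Callable[[int], object],
    sizes: Iterable[int],
    budget_seconds: float,
    timer: Callable[[], float] = time.perf_counter,
) -> Optional[int]:
    """Time run(size) on increasing sizes and return the largest one within budget.

    Stops at the first size whose run exceeds the budget. Returns None if
    no size fits (or if `sizes` is empty).
    """
    best = None
    for size in sizes:
        start = timer()
        run(size)
        elapsed = timer() - start
        print(f"size={size}: {elapsed:.3f} s")
        if elapsed > budget_seconds:
            break
        best = size
    return best
```

Here `run` would fetch the dataset of that size (for example via `fetch_with_pooch`) outside the timed region if download time should not count, then call the function being benchmarked. The `timer` argument exists only so the loop can be checked with a fake clock.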
@sfmig sfmig added the enhancement New feature or request label Jul 20, 2023
@sfmig sfmig self-assigned this Jul 20, 2023
@willGraham01 willGraham01 transferred this issue from brainglobe/cellfinder-core Jan 3, 2024
@alessandrofelder alessandrofelder transferred this issue from brainglobe/cellfinder May 1, 2024
@adamltyson
Member

To test with the largest data, could these workflows not be run on our internal runner, and the data live there permanently?

@sfmig
Contributor Author

sfmig commented May 2, 2024

Yes @adamltyson, that is exactly the plan.

This issue was migrated from the cellfinder repo, where we initially started the benchmarking work following a "modular" approach (i.e., benchmarking individual functions, rather than a workflow). But these comments are slightly outdated now. I think the comments on the size of the data came from a discussion on how to determine what is a small/large dataset.

I will close this issue now since:

@sfmig sfmig closed this as completed May 2, 2024
@github-project-automation github-project-automation bot moved this from Planned to Done in Core development May 2, 2024
@adamltyson
Member

Ignore me, I got a notification about this issue (because it was transferred), and thought it was a new issue!

Projects
Status: Done

2 participants