Database uses too much space #9

Open
AlessioMilanese opened this issue Apr 17, 2020 · 1 comment

@AlessioMilanese
Member

Test: building a database from 33k genes, each ~800 nucleotides long.

Running create_db takes:

| Operation | Time |
| --- | --- |
| Load tax | 0.3 s |
| Load alignment | 1 min |
| Check taxonomy | 0.3 s |
| Train all classifiers | 12 min |
| Learn tax level function | 30 min |
| Save file | 8 s |
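
To see where the time goes, the stage timings listed above can be summed and expressed as shares of the total. This is a small sketch that uses only the figures from the table (with the minute values converted to seconds); the variable and function names are illustrative.

```python
# Stage timings from the table above, converted to seconds.
stage_seconds = {
    "Load tax": 0.3,
    "Load alignment": 60.0,
    "Check taxonomy": 0.3,
    "Train all classifiers": 12 * 60.0,
    "Learn tax level function": 30 * 60.0,
    "Save file": 8.0,
}

total_seconds = sum(stage_seconds.values())


def stage_shares(timings):
    """Return each stage's share of the total time, as a fraction in [0, 1]."""
    total = sum(timings.values())
    return {name: seconds / total for name, seconds in timings.items()}


shares = stage_shares(stage_seconds)
# Print the stages from slowest to fastest.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:26s} {stage_seconds[name]:8.1f} s  {share:6.1%}")
print(f"{'Total':26s} {total_seconds:8.1f} s")
```

With these figures, the two training stages account for about 97% of the listed time, and saving the file is negligible.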

Running the command under time reports:

real      6182.84
user      8619.94
sys        389.02
3167723520  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
 143395164  page reclaims
        14  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
      2279  voluntary context switches
   7431869  involuntary context switches
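
For readability, the raw figures above can be converted into friendlier units. The sketch below assumes the times are in seconds and that the maximum resident set size is in bytes (as macOS `/usr/bin/time -l` reports it; other systems may report kilobytes), so the memory figure in particular rests on an assumption.

```python
# Figures copied from the `time` output above.
real_s = 6182.84
user_s = 8619.94
sys_s = 389.02
max_rss = 3167723520  # ASSUMPTION: bytes; some platforms report kilobytes


def format_duration(seconds):
    """Format a duration in seconds as 'Hh MMm SSs'."""
    whole = int(round(seconds))
    hours, rest = divmod(whole, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Ratio of CPU time to wall-clock time: the average number of busy cores.
cpu_utilisation = (user_s + sys_s) / real_s
max_rss_gib = max_rss / 2**30

print("wall-clock time:", format_duration(real_s))
print(f"CPU time / wall-clock time: {cpu_utilisation:.2f}")
print(f"peak memory: {max_rss_gib:.2f} GiB (if max RSS is in bytes)")
```

Under these assumptions the run took about 1 h 43 min of wall-clock time, kept roughly 1.46 cores busy on average, and peaked at about 2.95 GiB of memory.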

The resulting file (HDF5 format) is 400 MB, which is too large to be usable.

Compressed into a .zip archive, it shrinks to 2.3 MB.
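
A file that shrinks this much under zip is dominated by repetitive content. The sketch below shows one way to measure such a ratio with Python's standard `zipfile` module. The helper name `zip_ratio` and the synthetic all-zero file are illustrative only and are not part of the project.

```python
import os
import tempfile
import zipfile


def zip_ratio(path):
    """Return original_size / zipped_size for the file at `path`."""
    original_size = os.path.getsize(path)
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "archive.zip")
        with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            zf.write(path, arcname=os.path.basename(path))
        zipped_size = os.path.getsize(archive)
    return original_size / zipped_size


# Ratio implied by the sizes reported in this issue (400 MB -> 2.3 MB).
reported_ratio = 400 / 2.3

# Synthetic stand-in for a highly redundant file: one megabyte of zero bytes.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "sample.bin")
    with open(sample, "wb") as fh:
        fh.write(bytes(1_000_000))
    sample_ratio = zip_ratio(sample)

print(f"reported ratio: about {reported_ratio:.0f}x")
print(f"synthetic all-zero file: {sample_ratio:.0f}x")
```

The reported sizes correspond to a ratio of roughly 174x, which is the kind of figure one would expect only from very redundant data.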

@AlessioMilanese AlessioMilanese self-assigned this Apr 18, 2020
@AlessioMilanese
Member Author

The output is now compressed as of ae547b0 (following the h5py guide).
The database shrinks from 402 MB to 40.2 MB, a tenfold reduction.
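
For context, h5py enables per-dataset compression through the `compression` argument of `create_dataset`. The sketch below is not the code from ae547b0. It is a minimal illustration, assuming `h5py` and `numpy` are installed, with a made-up dataset name (`weights`) and synthetic, mostly-zero data.

```python
import os
import tempfile

import h5py
import numpy as np


def write_matrix(path, matrix, compress):
    """Write `matrix` to an HDF5 file, optionally gzip-compressed; return the file size in bytes."""
    with h5py.File(path, "w") as f:
        if compress:
            # gzip is the most portable HDF5 filter; compression implies chunked storage.
            f.create_dataset("weights", data=matrix,
                             compression="gzip", compression_opts=4)
        else:
            f.create_dataset("weights", data=matrix)
    return os.path.getsize(path)


# Synthetic, highly redundant data (hypothetical; not the real database contents).
rng = np.random.default_rng(0)
matrix = np.zeros((1000, 1000), dtype=np.float64)
matrix[rng.integers(0, 1000, 500), rng.integers(0, 1000, 500)] = 1.0

with tempfile.TemporaryDirectory() as tmp:
    plain_size = write_matrix(os.path.join(tmp, "plain.h5"), matrix, compress=False)
    gzip_size = write_matrix(os.path.join(tmp, "gzip.h5"), matrix, compress=True)
    # Reading back is transparent: h5py decompresses on access.
    with h5py.File(os.path.join(tmp, "gzip.h5"), "r") as f:
        restored = f["weights"][:]

print(f"uncompressed: {plain_size / 1e6:.1f} MB, gzip: {gzip_size / 1e6:.2f} MB")
```

Because the filter is applied per dataset and undone on read, code that loads the database does not need to change. The actual saving depends on how redundant the stored arrays are.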

@AlessioMilanese AlessioMilanese added the enhancement New feature or request label Jan 1, 2021