Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: database breakage during update #511

Merged
merged 6 commits into from
Oct 15, 2024

Conversation

twagner9
Copy link
Collaborator

@twagner9 twagner9 commented Jul 12, 2024

Description

The corresponding PR to Issue #510.

Fixes #388

Edit: It took me awhile to find a suitable method for tracking this update via the GUI, and I landed on an interface as the best medium to avoid exposing any GUI code to the database, avoiding increasing compilation time and strange dependency issues.

I also had been underestimating the scale of this upgrade; for a single refinement package with ~1.2 million particles, if any classification and refinements have been run using it, it’s much more accurate to imagine the number of column changes on the scale of 100 million or more, into the billions – each CLASSIFICATION_RESULT_% and REFINEMENT_RESULT_% table has several columns that need to be changed for every row, which would be every particle in the package.

Because of this, upgrade times are substantially slower. I was able to use a project that was being used for 3D classification. It only contained REFINEMENT_RESULT_% tables (no CLASSIFICATION_RESULT_% tables), and refinement packages contained anywhere from 10,000 to 200,000+ particles, for a total of ~460 million row updates. The upgrade takes around 1h:30m to complete. I spoke with @timothygrant80 about this, and he said he thinks this is an uncommon enough case that this upgrade time is okay.

I have rebased my feature branch to be current with the master branch using to minimize conflicts and headaches

  • yes
  • no

Which compilers were tested

  • g++
  • icpc
  • clang
  • other (please specify)

These changes are isolated to the

  • gui
  • core library
  • gpu core library
  • program it modifies

How has the functionality been tested?

Please describe the tests that you ran to verify your changes. Please also note any relevant details for your test configuration.

  • Tested manually from GUI
  • Tested manually from CLI
  • Passed console tests
  • Passed samples functional testing
  • other (please specify)

Checklist:

  • I have not changed anything that did not need to be changed
  • I have performed a self-review of my own code
  • I have commented my code, (w.r.t. why), particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation {Ok to pass for now}
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

Previously, updating a database created from the beta cisTEM version
would result in a crash at the outset as output_pixel_size was being
improperly added to REFINEMENT_PACKAGE_CONTAINED_PARTICLES_ tables (it
was out of order).

Additionally, REFINEMENT_RESULTS_ and CLASSIFICATION_RESULTS_ have
pixel_size, aberration, microscope voltage, and amplitude contrast
updated as well, making all operations usable post-update.
Some columns still remained empty; updated to either give appropriate
(already existent) value, or filled with 0.0.
Using an interface that MainFrame inherits from and Database is aware of, the schema update is now accurately tracked when upgrading a database originally created in cisTEM to the current devel via a OneSecondProgress dialog.
@twagner9 twagner9 force-pushed the output_pixel_size_db_update_fix branch from 8e73ce4 to 68ae856 Compare October 3, 2024 15:23
@twagner9 twagner9 requested a review from bHimes October 9, 2024 14:41
Copy link
Collaborator

@bHimes bHimes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment much on the logic of the implementation, but the implementation details themselves look good. I've pointed out a few small things that should be quick to address.

src/gui/MainFrame.cpp Outdated Show resolved Hide resolved
src/gui/MainFrame.cpp Outdated Show resolved Hide resolved
src/gui/UpdateProgressTracker.h Outdated Show resolved Hide resolved
src/gui/UpdateProgressTracker.h Show resolved Hide resolved
-Made dedicated custom dialog for updating the database
-Retained same dialog format when no schema changes are detected
-Added Database::CopyDatabase for backing up the database
-Functionalized the update tracking logic as it can now be used for Update only and Backup and Update
-Changed backup function name in database.cpp/h
-Changed how failing backup is handled in MainFrame.cpp
@bHimes
Copy link
Collaborator

bHimes commented Oct 14, 2024

looks good

@jojoelfe
Copy link
Collaborator

This is great, you can close #388 with this.

@twagner9 twagner9 merged commit f151ba2 into master Oct 15, 2024
6 checks passed
@twagner9 twagner9 deleted the output_pixel_size_db_update_fix branch October 15, 2024 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Don't suggest backing up the database, do it!
3 participants