-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating from cisTEM Beta to latest devel fails #510
Comments
Hey @twagner9 I like the way you've set up this PR. The tutorial dataset is super small, I think having numbers for projects with something like: The largest set may not be necessary depending on the results from the 100 & 500k. The main goal is to estimate how the time penalty scales when using the "fix." If it is O(n^2) for example, then this might not be good solution. Knowing the cost-to-benefit of the current fix will help me (and likely @jojoelfe since this is database related) consider how much brain space to allocate. |
My quick 2cents to issue number 2: The underlying problem is that in many places the databasse is queried using This can be fixed with some sort of custom reordering code, but as you pointed out the performance might be very bad. Maybe it would be easier to refactor in the required places to instead use an explicit:
so that the values are being always returned in the right order, no matter the order of the columns in the database |
It took me awhile to get a database set up with the appropriate number of particles. I ended up testing ~2k particles (small), ~120k (medium-ish) particles, and ~1.2 million particles (large), with steps completed through 3D refinement, and for the medium and large sets set a timer to track the amount of time spent updating the database.
I will test this next, but theoretically, this level of update should be limited to databases created only in those versions of cisTEM that existed prior to As I'm unsure whether many cisTEM users would still be utilizing the beta version or other versions this old, perhaps this amount of time is acceptable. I could create a functional loading bar, and/or (preferably and), at the very least, notify users in the update dialog that this process can take several minutes for projects created with older versions of cisTEM that contain many particles. |
A pull request resolving this issue will follow; I felt it was better to layout in depths the problems here and leave the solutions for the PR, rather than including it all in the PR.
Description
First, I want to note that the solution to this bug as implemented feels a bit hacky, and is not highly performant -- there are a few additional seconds added while updating a small database with tutorial apoferritin data, and I could imagine this taking much longer for a larger database -- so any suggestions to improve it on this front are welcome.
To that end, I'm going to go a bit in depth in describing the bug to see if there's a potential underlying (better) solution that I (and @timothygrant80, who originally wrote the solution, with later additions from me) missed. I suppose whether this solution is sufficient depends on if it's believed beta version is still in wide use in comparison with the devel version.
Bug Description
This bug specifically affects database upgrades that are at least as old as the beta version, but would also apply for any subsequent versions where the database schema was structured the same way. There were a few problems when attempting to update:
The original crash came from an issue while updating, where execution of a SQL command fails due to attempting to create a duplicate column, specifically,
ASSIGNED_SUBSET
. This was caused by two instances of theREFINEMENT_PACKAGE_CONTAINED_PARTICLES_
dynamic table being listed in the schema. This was resolved simply by removing one of the instantiations.After resolving problem 1, another crash would occur when attempting to open the Results Panel. This error looked like the following:
indicating that the
SYMMETRY
column wasn’t updated correctly. This stemmed from the later addition ofOUTPUT_PIXEL_SIZE
to theREFINEMENT_PACKAGE_ASSETS
static table, notably in the original position thatSYMMETRY
occupied in the beta database. All subsequent columns were then moved over by 1, and the following loop in Database::CheckSchema depends on proper ordering of the schema (via col_counter):And, for further confirmation, here is a screenshot of the Refinement Package Assets Panel, where symmetry == molecular weight, molecular weight == largest dimension, etc. (also note that box size, no. classes are intact, as they were before the new position of
OUTPUT_PIXEL_SIZE
in the db):These are the issues; solutions will be discussed in coming PR.
The text was updated successfully, but these errors were encountered: