-
Notifications
You must be signed in to change notification settings - Fork 6
Create SQLAlchemy ORM models for Climsoft CDMS #2
Comments
@isedwards Concerning the primary key issue, I'd rather we go with option 2 at the moment. However I am having difficulty understanding the table in the pdf. It would be nice to have a more detailed explanation. We could also discuss in a call. |
@Shaibujnr I have a bit more information on this from climsoft/climsoft-database#2 (comment) We should start by only creating models for the 24 main tables that are considered to "constitute the relational data model". These 24 tables are listed in the PDF file above. The other tables should be ignored for now. These 24 tables either have a 'single column' primary key already specified - or they have a unique index on a composite key that we can use as the primary key. Taking the first two tables from the PDF, these have:
We can use @faysal-ishtiaq 's So...
|
Got it @isedwards , I have been able to install mysql locally and run the DDL file, I now have the main database and test database as well. I am about to create the primary keys for the tables with the unique identifiers. I observed that in the pdf, for table |
In production, the unique indexes on the Climsoft In the case of In practice, I think In OpenCDMS, we should include Although our model will include |
After applying the alter tables commands locally: ALTER TABLE mariadb_climsoft_db_v4.observationfinal ADD PRIMARY KEY (recordedFrom, describedBy, obsDatetime, obsLevel);
ALTER TABLE mariadb_climsoft_db_v4.observationinitial ADD PRIMARY KEY (recordedFrom, describedBy, obsDatetime, qcStatus,acquisitionType, obsLevel);
ALTER TABLE ... we should then generate the models with sqlacodegen for the altered version and, given the output of alembic, manually update the models (on a separate commit so we can see the diff). We're then looking for whether:
|
I have been able to generate the models for the tables listed in the pdf, all 24 (25 actually) of them. 25 because one of the tables has a relationship to a table not listed in the pdf and therefore If I run I have currently updated the I believe this is the output you'd like to see, am I correct? @isedwards To describe the issue, i'd first like to describe the output of Now the issue. To whitelist/blacklist the report i.e constraint the diff to only report differences concerning a specified list of tables like the 24 we have pre-selected. I have to identify what table every operation is targeted at and filter by that table. For operations like |
This is diff report generated for climsoft without whitelisting or blacklisting.
|
This is the output when the report is whitelisted to only the defined models.
|
The whitelist algorithm basically, takes the second item in the list, checks if it's a For the second output, it looks like all we have left is So let me know what you think @isedwards |
Yes, it makes sense. Let's add a capability to ignore the Let's refer to these 25 tables as Once we have an empty result from |
output after whitelisting indexes too.
|
After generating the migration script with
For some reason, how do we resolve this? @isedwards |
Thank you @Shaibujnr - it's odd that It looks like the best solution would be to manually update the output of sqlacodegen to use |
This issue will begin to add support for the Climsoft CDMS. You will need an instance of MySQL 8+ installed and then to restore the Climsoft 4.1.1 SQL DDL file here.
When the Climsoft software is installed, the SQL DDL creates two databases:
The first is used as the actual climate database whilst the second contains some test data that is used for training with tutorials.
We only need to create models for
mariadb_climsoft_db_v4
. The models should be created in/opencdms/models/climsoft/v4_1_1.py
.Once we have the models, we should try using the
get_schema_diff
function fromopencdms.models
to check our new models against the second database schema (mariadb_climsoft_test_db_v4
) to see whether it is identical. It's possible that there may be minor differences.There is one final complication: some of the tables in Climsoft do not have primary keys explicitly specified - this prevents SQLAlchemy/sqlacodegen from creating ORM models (see docs here). The attached PDF describes where primary keys should be specified.
We should discuss our options for working around the primary key issue. We can either:
ALTER TABLE
and add the missing primary keys, this will then appear in future releases of ClimsoftI'm very happy to discuss this more - either in GitHub comments, messaging or scheduling a call...
Climsoft Data Model Tables and Keys.pdf
The text was updated successfully, but these errors were encountered: