EVA-3528: Refactor release tracker creation script #433

Merged: apriltuesday merged 4 commits into EBIvariation:master from EVA-3528 on Mar 28, 2024

Conversation

apriltuesday (Contributor) commented Mar 25, 2024

  • Remove clustering automation; it is entirely superseded by submission and assembly ingestion
  • Move release tracker creation script to release automation
  • Refactor to ReleaseTracker class

I've noted the only functional changes in the script; everything else is pure refactoring.

    def _insert_entry_for_taxonomy_and_assembly(self, tax, asm_acc, sources, sc_name=None, fasta_path=None,
                                                report_path=None, release_folder_name=None):
        sc_name = sc_name if sc_name else get_scientific_name_from_ensembl(tax)
        sc_name = sc_name.replace("'", "\''")

Contributor Author:
Change to accommodate our good friend Ambystoma 'unisexual hybrid'.
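
A minimal sketch of why the quote handling matters, assuming the name ends up inside a SQL string literal. In Python, `"\''"` is the same two-character string as `"''"`, so the `replace` call doubles each single quote, which is the standard SQL way to escape one. The helper name `escape_single_quotes`, the `tracker` table, and the use of `sqlite3` as a stand-in database are assumptions for illustration; the real script's driver and schema are not shown here.

```python
import sqlite3


def escape_single_quotes(value: str) -> str:
    """Double each single quote so the value is safe inside a SQL string literal."""
    return value.replace("'", "''")


name = "Ambystoma 'unisexual hybrid'"

# In-memory database used only as a stand-in for the real tracker database.
conn = sqlite3.connect(":memory:")
conn.execute("create table tracker (scientific_name text)")

# Option 1: build the literal by hand, which requires the escaping step.
conn.execute(f"insert into tracker values ('{escape_single_quotes(name)}')")

# Option 2: let the driver bind the value, so no manual escaping is needed.
conn.execute("insert into tracker values (?)", (name,))

rows = [row[0] for row in conn.execute("select scientific_name from tracker")]
print(rows)
```

Both rows come back as the original name with its quotes intact. Where the driver supports it, binding the value as a parameter avoids the manual escaping altogether.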

Contributor Author:
Ticket to remove scientific name where it's not needed: EVA-3405


    def create_table_if_not_exists(self):
        query_create_table = (
            'create table if not exists eva_progress_tracker.clustering_release_tracker('

Contributor Author:
I have left this table as-is but noted which fields we could get rid of.

apriltuesday marked this pull request as ready for review March 26, 2024 16:15

        return get_metadata_connection_handle(self.maven_profile, self.private_config_xml_file)

    @cached_property
    def mongo_conn(self):

Contributor Author:
Previously the Mongo database was always production, even when the table was being created in development. This seemed messy and confusing to me, so I've unified it to use a single profile. The old behaviour did have one advantage: you could create a table in dev based on real production data for should_be_released. I don't think this matters much for how we use the table now (e.g. we don't actually compute any counts that need to be checked), but let me know if you think this functionality should be restored.

Member:
That should be an incentive to keep one species in mongoDEV so we can test properly.
It's a good change.
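
The single-profile setup discussed in this thread could be sketched roughly as follows. This is not the PR's actual class: `ReleaseTrackerSketch`, the `metadata_conn` property name, the injected `connect_metadata` / `connect_mongo` callables, and `fake_connect` are all hypothetical stand-ins so the example runs without a database. It only illustrates that both connections are derived from the same `maven_profile` and that `cached_property` creates each one at most once per instance.

```python
from functools import cached_property


class ReleaseTrackerSketch:
    """Hypothetical stand-in; connection factories are injected for illustration."""

    def __init__(self, maven_profile, connect_metadata, connect_mongo):
        self.maven_profile = maven_profile
        self._connect_metadata = connect_metadata
        self._connect_mongo = connect_mongo

    @cached_property
    def metadata_conn(self):
        # Created on first access, then cached on the instance.
        return self._connect_metadata(self.maven_profile)

    @cached_property
    def mongo_conn(self):
        # Uses the same profile as the metadata connection,
        # so a development run never reads from production Mongo.
        return self._connect_mongo(self.maven_profile)


calls = []


def fake_connect(kind):
    """Return a fake connector that records which profile it was called with."""
    def connect(profile):
        calls.append((kind, profile))
        return f"{kind}-connection-for-{profile}"
    return connect


tracker = ReleaseTrackerSketch("development", fake_connect("metadata"), fake_connect("mongo"))
tracker.mongo_conn
tracker.mongo_conn  # second access is served from the cache
tracker.metadata_conn
print(calls)
```

Each fake connector is called exactly once, and both receive the `development` profile.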

apriltuesday self-assigned this Mar 26, 2024
tcezard (Member) left a comment:

Looks good and makes sense

apriltuesday merged commit e322d25 into EBIvariation:master Mar 28, 2024
1 check passed
apriltuesday deleted the EVA-3528 branch March 28, 2024 13:27
3 participants