Result download options #171

Open
2 tasks
innovate-invent opened this issue May 29, 2019 · 1 comment

Comments

@innovate-invent
Collaborator

innovate-invent commented May 29, 2019

  • various formats (genbank, embl, etc)
  • subsets of the data by record type

Doing the conversion client-side would be a good option. There are JavaScript projects for parsing the different file types.

It might also be good to provide a tabular view of the data. Tabulator is a popular JavaScript library that supports manipulating, filtering, and reordering the rows and columns of a table. It also allows downloading the data after these manipulations, so users can download the data in nearly any configuration they want.
This would also resolve #121.
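
Wherever it runs, subsetting by record type comes down to filtering on a type field before the data is serialized for download. Below is a minimal sketch of that idea in Python, for illustration only; the record layout (dictionaries with a `type` key) and the function name `subset_by_type` are assumptions, not the project's actual data model, and a client-side version would do the equivalent in JavaScript.

```python
def subset_by_type(records, wanted_types):
    """Return only the records whose 'type' is in wanted_types.

    `records` is assumed (hypothetically) to be a list of dicts that
    each carry a 'type' key, e.g. 'genomic_island' or 'gene'.
    """
    wanted = set(wanted_types)  # set membership keeps the filter O(n)
    return [record for record in records if record["type"] in wanted]


# Hypothetical example data; the coordinates are made up for illustration.
example_records = [
    {"type": "genomic_island", "start": 654, "end": 26955},
    {"type": "gene", "start": 700, "end": 1500},
    {"type": "genomic_island", "start": 3190, "end": 57412},
]

islands_only = subset_by_type(example_records, ["genomic_island"])
```

The filtered list can then be handed to whichever serializer matches the format the user chose.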

There is an issue with providing output in GenBank format: GenBank supports only a limited number of feature types, and genomic islands are not among them.

@innovate-invent
Collaborator Author

innovate-invent commented Feb 24, 2020

There are some barriers that will need to be overcome before GenBank (or EMBL) format can be supported for output.

GenBank supports only a limited number of feature types, and genomic islands are not among them.
I emailed the NIH requesting advice on how to store unsupported features, and they recommended using the misc_feature feature type. For example:

     misc_feature    654..26955
                     /note="AbGRI1-5 genomic island"

This is not ideal, as it places structured data in a free-form text field.

If genomic islands can accurately be described as mobile elements, then another feature type was recommended:

     mobile_element  3190..57412
                     /note="Integrative Element (IE)"
                     /mobile_element_type="other:Acinetobacter Genomic
                     Island 1 (AGI1)"

However, this does not contain structured data identifying the feature as a genomic island.

My alternative proposal is:

     mobile_element  3190..57412
                     /note="Integrative Element (IE)"
                     /mobile_element_type="other:genomic_island"
                     /standard_name="Acinetobacter Genomic
                     Island 1 (AGI1)"

This conforms to the GenBank feature table standard: http://www.insdc.org/files/feature_table.html
The mobile_element_type qualifier is defined as semi-structured data, and genomic_island is the appropriate term from the Sequence Ontology. The Sequence Ontology also defines genomic_island as a descendant of mobile_genetic_element.
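
To show how an exporter might emit the proposed feature, here is a minimal Python sketch. It is an illustration under stated assumptions, not the stitcher's actual code: the function name `format_feature` and the constants are hypothetical, and the layout assumed is the fixed-column feature table layout (feature key from column 6, location and qualifiers from column 22, lines up to 80 characters). Wrapping happens at word boundaries and embedded double quotes are doubled.

```python
import textwrap

KEY_COL = 5       # feature key starts at column 6 (index 5)
QUAL_COL = 21     # location and qualifiers start at column 22 (index 21)
LINE_WIDTH = 80   # assumed maximum flat-file line length


def format_feature(key, location, qualifiers):
    """Render one feature as flat-file lines (hypothetical helper).

    `qualifiers` is a list of (name, value) pairs; every value is
    written as a quoted string, which suits the text qualifiers used here.
    """
    lines = [" " * KEY_COL + key.ljust(QUAL_COL - KEY_COL) + location]
    for name, value in qualifiers:
        escaped = value.replace('"', '""')  # double any embedded quotes
        text = f'/{name}="{escaped}"'
        # Wrap long qualifiers onto continuation lines at word boundaries.
        for part in textwrap.wrap(text, width=LINE_WIDTH - QUAL_COL):
            lines.append(" " * QUAL_COL + part)
    return "\n".join(lines)


proposed = format_feature(
    "mobile_element",
    "3190..57412",
    [
        ("note", "Integrative Element (IE)"),
        ("mobile_element_type", "other:genomic_island"),
        ("standard_name", "Acinetobacter Genomic Island 1 (AGI1)"),
    ],
)
print(proposed)
```

Keeping the island's name in standard_name and the ontology term in mobile_element_type means a parser can identify genomic islands by matching the qualifier value rather than searching free text.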

The other major barrier is that the stitcher currently generates invalid GenBank files. See brinkmanlab/galaxy-tools#5 and brinkmanlab/galaxy-tools#6. The first linked issue could be resolved by brinkmanlab/galaxy-tools#8, but I doubt the second would be. See also #144.
