Rule-based Uploader / Builder #5365

Merged
merged 6 commits into from
Mar 31, 2018


Member

@jmchilton jmchilton commented Jan 23, 2018

Vue-based component for defining collections and dataset uploads by applying rules to a list of files or more general spreadsheet-style information (e.g. sample sheets or tabular data from data sources containing URL or FTP file paths for files along with metadata). The widget is fairly complex, but very broadly it is split into two panes - one previews how rules are applied to build up the tabular data defining collections (each row corresponds to a file, with columns for metadata and such) and the other displays the defined rules and allows editing them and creating new ones.

The goal behind defining rules this way instead of allowing the user to interact with the spreadsheet display directly is to enable scaling up collection creation. If a user wishes to upload hundreds of datasets - interacting with a widget directly for each input doesn't scale well and would be error prone. If a user wishes to upload hundreds of thousands of datasets - even loading this information in the GUI may not scale (though I've been impressed with the performance so far of this approach) and so we can potentially just display a preview of some of the rows and process the final set of rules on the backend.

Since we can handle an arbitrary number of columns this way, we can define multiple list identifiers per file and so we can easily construct nested lists. Hence this allows creation of not just potentially larger collections but arbitrarily complex lists as well. Paired identifiers via indicator columns are also implemented. This is our first GUI-based approach to allowing the creation of nested lists and enables a majority of the user stories I outlined in #4733.
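
As a rough illustration of how several list identifier columns per row map onto a nested list, here is a minimal Python sketch. It is not the code in this PR (the builder itself is a Vue component); the function name `build_nested`, the row layout, and the example URLs are all hypothetical.

```python
def build_nested(rows, identifier_columns, value_column):
    """Group rows into nested dicts, one nesting level per identifier column.

    Hypothetical helper for illustration only; not part of Galaxy.
    """
    root = {}
    for row in rows:
        node = root
        # Walk (and create as needed) one level per outer identifier column.
        for col in identifier_columns[:-1]:
            node = node.setdefault(row[col], {})
        # The innermost identifier names the dataset itself.
        node[row[identifier_columns[-1]]] = row[value_column]
    return root


# Made-up rows: condition, replicate, URL.
rows = [
    ["treated", "rep1", "ftp://example.org/t1.fastq"],
    ["treated", "rep2", "ftp://example.org/t2.fastq"],
    ["control", "rep1", "ftp://example.org/c1.fastq"],
]
nested = build_nested(rows, identifier_columns=[0, 1], value_column=2)
```

With two identifier columns the result is a list of lists; adding a third identifier column would add another nesting level without changing the function.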

To operate over lists of datasets directly, the multi-select history widget now has a new option "Build Collection from Rules" alongside the other collection builders. This mode uses the well-established dataset collection API to build collections from HDAs.

To operate on lists of FTP files or URLs, the upload widget has a new "Rule-based" tab that allows users to paste in tabular data, select a history dataset, or use their FTP directory contents, and then send this tabular data to the new builder widget. This mode uses the new data fetch API (#5220) to build collections and handle uploads of arbitrary collections of files.
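
To give a feel for what gets sent to the data fetch API, here is a hedged Python sketch that turns (name, URL) rows into a single flat-list target. The field names (`targets`, `destination`, `collection_type`, `elements`, `src`, `ext`, `dbkey`) mirror the request that appears in the server log quoted later in this thread; the function name `rows_to_fetch_payload` and the row layout are assumptions made for the example, and the real request is built by the JavaScript client.

```python
import json


def rows_to_fetch_payload(history_id, collection_name, rows):
    """Build one flat-list ("hdca") target from (name, url) rows.

    Illustrative only: the keys follow the logged request in this thread,
    the helper itself is hypothetical.
    """
    elements = [
        {"src": "url", "url": url, "name": name, "ext": "auto", "dbkey": "?"}
        for name, url in rows
    ]
    target = {
        "destination": {"type": "hdca"},
        "collection_type": "list",
        "name": collection_name,
        "elements": elements,
    }
    return {"history_id": history_id, "targets": [target]}


payload = rows_to_fetch_payload(
    "846fb0a2a64137c0",
    "A1",
    [
        (
            "dmel-all-CDS-r6.19.fasta.gz",
            "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/"
            "dmel_r6.19_FB2017_06/fasta/dmel-all-CDS-r6.19.fasta.gz",
        )
    ],
)
print(json.dumps(payload, indent=2))
```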

The preview of the tabular data generated via rules is done via Handsontable - a JavaScript spreadsheet widget with a VueJS wrapper component. This turns out to be a fairly nice application for reactive components - as rules are added or modified the spreadsheet just naturally updates. In my hands the widget scales very nicely - I've uploaded files with tens of thousands of rows and rules modifying the data and changing the spreadsheet do not seem to cause significant delays in the web browser.

I was in the middle of developing a tutorial / training material section and detailed test cases for this component since it is more complex and less obvious than typical Galaxy GUI but @nekrut wants me to open a PR now and get something rougher than I'd like in. So this works I think - but it has some rough edges. I've created an issue to track these rough edges with #5380 and reviewers can decide what needs more polish before an initial merge and what can wait for follow up commits - I'll also track reviewer comments there.

Another thing to keep in mind, I think this is a power user feature like notebooks - I'll try to add lots of in-app help and documentation - but ultimately users are creating a program for defining how to input data into Galaxy. I think this is a good direction to move parts of the GUI - not just serving users without programming knowledge but providing paths to learn and incentivize learning these skills. That said the target audience is a bit different and so hopefully reviewers are on board for that vision also, and even the existing paired list creator uses regex for instance.

This PR now includes the first three test cases outlined in #5379 and so numerous screenshots are produced every time the PR is updated. Click the Selenium tests, then Build Artifacts, then screenshots. Here is an example:

The screenshots available:

screen shot 2018-01-29 at 8 31 10 am

The first use case - featuring pasting data into the browser, stripping out header information, and uploading individual datasets from ENA (not in a collection).

The second use case uses the same data but loads it from a history dataset instead of copying into the web browser and builds a collection (a flat list in this case) from the data.

The third example uses a new dataset from ENA and demonstrates building a list of pairs along with many more rule operations including hiding columns, swapping columns, splitting up a cell that has two URLs with a regular expression, extracting and mapping paired identifier information.
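
The regex-driven steps in this third example can be sketched in Python roughly as follows. This is an approximation of the idea, not the rule implementation in the PR: the semicolon-separated URL cell, the `_1`/`_2` file suffixes, the `forward`/`reverse` mapping, and the function name `split_paired_rows` are all assumptions chosen for illustration.

```python
import re

# Hypothetical patterns: one cell holds two URLs separated by ";",
# and the read direction is encoded as _1 / _2 before the extension.
SPLIT_URLS = re.compile(r"^(.*);(.*)$")
PAIR_SUFFIX = re.compile(r"_([12])\.fastq\.gz$")
PAIR_NAMES = {"1": "forward", "2": "reverse"}


def split_paired_rows(rows, url_column):
    """Turn each row with a two-URL cell into two rows with a paired identifier."""
    out = []
    for row in rows:
        match = SPLIT_URLS.match(row[url_column])
        if match is None:
            continue  # rows that do not match the rule are dropped in this sketch
        for url in match.groups():
            suffix = PAIR_SUFFIX.search(url)
            if suffix is None:
                continue
            new_row = list(row)
            new_row[url_column] = url
            # Append the mapped paired identifier as a new column.
            new_row.append(PAIR_NAMES[suffix.group(1)])
            out.append(new_row)
    return out


rows = [
    [
        "SRR0000001",
        "ftp://example.org/SRR0000001_1.fastq.gz;ftp://example.org/SRR0000001_2.fastq.gz",
    ]
]
result = split_paired_rows(rows, url_column=1)
```

Each input row becomes two output rows, one per URL, with the paired identifier in a trailing column that a later rule could map to the collection structure.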

Related issues:

@nsoranzo
Member

@jmchilton Are you aiming for 18.01? Or should the milestone be updated?

@jmchilton jmchilton modified the milestones: 18.01, 18.05 Jan 24, 2018
@jmchilton
Member Author

@nsoranzo Breaking Anton's heart I've switched this to 18.05.

@jmchilton jmchilton changed the title Rule-based Uploader / Builder [WIP] Rule-based Uploader / Builder Jan 24, 2018
@dannon
Member

dannon commented Jan 24, 2018

jmchilton#64 should fix this up a bit, resolving both the production build issues and the upload initialization failures that were breaking qunit tests.

@jmchilton
Member Author

Updated issue description with links to followup issues and other related issues.

@mblue9
Contributor

mblue9 commented Jan 25, 2018

Hi @jmchilton this sounds very exciting!

and will this:

In order to operate over lists of datasets directly - the multi-select history widget now has a new option "Build Collection from Rules" along side the other collection builders. This mode uses the well established dataset collection API to build collections from HDAs.

help with this issue? #740

@jmchilton jmchilton changed the title [WIP] Rule-based Uploader / Builder Rule-based Uploader / Builder Jan 25, 2018
@jmchilton
Member Author

@mblue9

help with this issue? #740

No, I don't see how it would to be honest.

I saw your screenshot in that issue where you are dragging bits and pieces of the collection out and into the multi-select drop box. This PR does allow creating nested lists via the GUI, however, and if you have nested lists upfront I'm really hoping you won't even want to do the thing outlined in #740. You can have multiple replicates with different conditions or whatever and take advantage of that to avoid needing to specify the grouping manually after the fact. The tool you showed had a group name and then multiple files per group - well, that really should have an option of just taking in a nested list and using the outer list identifier as the group name, with the inner lists as the files you select. If parts of the analysis require things to be in a flatter fashion - there is a collection operation to build a flattened collection so you can have two views of the same data for different parts of the analysis. Then if everything stays in collections throughout, you'll always have the element identifier.
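
To make the "two views of the same data" idea concrete, here is a small Python sketch of flattening a nested list while keeping the identifiers. It only illustrates the concept; it is not Galaxy's flatten operation, and the `_` join character and the function name `flatten` are assumptions.

```python
def flatten(nested, join="_", prefix=()):
    """Yield (identifier, dataset) pairs for every leaf of a nested dict.

    Outer and inner identifiers are joined so no grouping information is lost.
    """
    for identifier, value in nested.items():
        path = prefix + (identifier,)
        if isinstance(value, dict):
            # Recurse into inner lists, carrying the outer identifiers along.
            yield from flatten(value, join, path)
        else:
            yield join.join(path), value


nested = {
    "treated": {"rep1": "t1.fastq", "rep2": "t2.fastq"},
    "control": {"rep1": "c1.fastq"},
}
flat = dict(flatten(nested))
```

The nested form gives the grouping by outer identifier, and the flat form gives one entry per file whose identifier still records which group it came from.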

If this model is too simplistic - and you need to really filter the data deeply and get even more different views of it - and if what you are doing manually is something that can be represented by the rule builder here, I did outline creating a collection operation tool out of the rule language in this issue (#5381). That would allow really arbitrary different organizations and filtered views of the data and could be applied in the middle of workflows for instance - so it would be great for manual or automated analyses.

@mvdbeek
Member

mvdbeek commented Jan 25, 2018

Some things I noticed (in chronological order)

  • Can we directly start with the spreadsheet editor? This would be more obvious if you're trying to copy and paste a table. Also the editor could have a drop target for tables, right ?

  • My first instinct is to edit the table right away -- can we do that ? It seems we can resize it, so we should be able to edit it, right ?

  • We should be able to scroll horizontally in the table and/or resize the columns, my first try was hard to edit, and I have a high-resolution screen (see the screenshot)

screen shot 2018-01-25 at 16 10 29

  • If you enter an invalid regex you have to restart the rule building by reloading the whole page; this isn't obvious unless you look in the JS console

  • I can see that a job launched in my logs, but it failed and there were no visual indications of this in the UI, this is the log:

galaxy.webapps.galaxy.api.tools INFO 2018-01-25 16:25:13,624 Keywords are {'history_id': u'846fb0a2a64137c0', 'targets': [{'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A1', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-CDS-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-CDS-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A2', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-aligned-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-aligned-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A3', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-chromosome-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-chromosome-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A4', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-clones-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-clones-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A5', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-exon-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-exon-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A6', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-five_prime_UTR-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-five_prime_UTR-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': 
{'type': u'hdca'}, 'collection_type': u'list', 'name': u'A7', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-gene-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-gene-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A8', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-gene_extended2000-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-gene_extended2000-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A9', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-intergenic-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-intergenic-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A10', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-intron-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-intron-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A11', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-miRNA-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-miRNA-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A12', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-miscRNA-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-miscRNA-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A13', 'elements': [{'url': 
u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-ncRNA-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-ncRNA-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A14', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-predicted-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-predicted-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A15', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-pseudogene-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-pseudogene-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A16', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-sequence_features-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-sequence_features-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A17', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-synteny-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-synteny-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A18', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-tRNA-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-tRNA-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A19', 'elements': [{'url': 
u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-three_prime_UTR-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-three_prime_UTR-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A20', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-transcript-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-transcript-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A21', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-translation-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-translation-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A22', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-transposon-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-transposon-r6.19.fasta.gz', 'dbkey': u'?'}]}]}
galaxy.webapps.galaxy.api.tools INFO 2018-01-25 16:25:13,625 payload {'targets': [{'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A1', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-CDS-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-CDS-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A2', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-aligned-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-aligned-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A3', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-chromosome-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-chromosome-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A4', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-clones-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-clones-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A5', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-exon-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-exon-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A6', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-five_prime_UTR-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-five_prime_UTR-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': 
u'list', 'name': u'A7', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-gene-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-gene-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A8', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-gene_extended2000-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-gene_extended2000-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A9', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-intergenic-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-intergenic-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A10', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-intron-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-intron-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A11', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-miRNA-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-miRNA-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A12', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-miscRNA-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-miscRNA-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A13', 'elements': [{'url': 
u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-ncRNA-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-ncRNA-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A14', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-predicted-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-predicted-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A15', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-pseudogene-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-pseudogene-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A16', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-sequence_features-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-sequence_features-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A17', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-synteny-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-synteny-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A18', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-tRNA-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-tRNA-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A19', 'elements': [{'url': 
u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-three_prime_UTR-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-three_prime_UTR-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A20', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-transcript-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-transcript-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A21', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-translation-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-translation-r6.19.fasta.gz', 'dbkey': u'?'}]}, {'destination': {'type': u'hdca'}, 'collection_type': u'list', 'name': u'A22', 'elements': [{'url': u'ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-transposon-r6.19.fasta.gz', 'src': u'url', 'ext': u'auto', 'name': u'dmel-all-transposon-r6.19.fasta.gz', 'dbkey': u'?'}]}]}
galaxy.webapps.galaxy.api.tools INFO 2018-01-25 16:25:13,626 {"check_content": true, "targets": [{"destination": {"type": "hdca"}, "collection_type": "list", "name": "A1", "elements": [{"src": "url", "name": "dmel-all-CDS-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-CDS-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A2", "elements": [{"src": "url", "name": "dmel-all-aligned-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-aligned-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A3", "elements": [{"src": "url", "name": "dmel-all-chromosome-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-chromosome-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A4", "elements": [{"src": "url", "name": "dmel-all-clones-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-clones-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A5", "elements": [{"src": "url", "name": "dmel-all-exon-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-exon-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A6", "elements": [{"src": "url", "name": "dmel-all-five_prime_UTR-r6.19.fasta.gz", "url": 
"ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-five_prime_UTR-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A7", "elements": [{"src": "url", "name": "dmel-all-gene-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-gene-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A8", "elements": [{"src": "url", "name": "dmel-all-gene_extended2000-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-gene_extended2000-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A9", "elements": [{"src": "url", "name": "dmel-all-intergenic-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-intergenic-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A10", "elements": [{"src": "url", "name": "dmel-all-intron-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-intron-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A11", "elements": [{"src": "url", "name": "dmel-all-miRNA-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-miRNA-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": 
"A12", "elements": [{"src": "url", "name": "dmel-all-miscRNA-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-miscRNA-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A13", "elements": [{"src": "url", "name": "dmel-all-ncRNA-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-ncRNA-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A14", "elements": [{"src": "url", "name": "dmel-all-predicted-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-predicted-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A15", "elements": [{"src": "url", "name": "dmel-all-pseudogene-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-pseudogene-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A16", "elements": [{"src": "url", "name": "dmel-all-sequence_features-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-sequence_features-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A17", "elements": [{"src": "url", "name": "dmel-all-synteny-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-synteny-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, 
"purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A18", "elements": [{"src": "url", "name": "dmel-all-tRNA-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-tRNA-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A19", "elements": [{"src": "url", "name": "dmel-all-three_prime_UTR-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-three_prime_UTR-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A20", "elements": [{"src": "url", "name": "dmel-all-transcript-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-transcript-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A21", "elements": [{"src": "url", "name": "dmel-all-translation-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-translation-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A22", "elements": [{"src": "url", "name": "dmel-all-transposon-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-transposon-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}]}
galaxy.tools DEBUG 2018-01-25 16:25:13,630 Validated and populated state for tool request (0.130 ms)
galaxy.tools.actions.upload DEBUG 2018-01-25 16:25:13,652 Persisted uploads (0.011 ms)
galaxy.tools.actions.upload INFO 2018-01-25 16:25:13,653 incoming are {'request_json': '{"check_content": true, "targets": [{"destination": {"type": "hdca"}, "collection_type": "list", "name": "A1", "elements": [{"src": "url", "name": "dmel-all-CDS-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-CDS-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A2", "elements": [{"src": "url", "name": "dmel-all-aligned-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-aligned-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A3", "elements": [{"src": "url", "name": "dmel-all-chromosome-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-chromosome-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A4", "elements": [{"src": "url", "name": "dmel-all-clones-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-clones-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A5", "elements": [{"src": "url", "name": "dmel-all-exon-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-exon-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A6", "elements": [{"src": "url", "name": "dmel-all-five_prime_UTR-r6.19.fasta.gz", "url": 
"ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-five_prime_UTR-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A7", "elements": [{"src": "url", "name": "dmel-all-gene-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-gene-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A8", "elements": [{"src": "url", "name": "dmel-all-gene_extended2000-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-gene_extended2000-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A9", "elements": [{"src": "url", "name": "dmel-all-intergenic-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-intergenic-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A10", "elements": [{"src": "url", "name": "dmel-all-intron-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-intron-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A11", "elements": [{"src": "url", "name": "dmel-all-miRNA-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-miRNA-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": 
"A12", "elements": [{"src": "url", "name": "dmel-all-miscRNA-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-miscRNA-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A13", "elements": [{"src": "url", "name": "dmel-all-ncRNA-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-ncRNA-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A14", "elements": [{"src": "url", "name": "dmel-all-predicted-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-predicted-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A15", "elements": [{"src": "url", "name": "dmel-all-pseudogene-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-pseudogene-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A16", "elements": [{"src": "url", "name": "dmel-all-sequence_features-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-sequence_features-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A17", "elements": [{"src": "url", "name": "dmel-all-synteny-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-synteny-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, 
"purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A18", "elements": [{"src": "url", "name": "dmel-all-tRNA-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-tRNA-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A19", "elements": [{"src": "url", "name": "dmel-all-three_prime_UTR-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-three_prime_UTR-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A20", "elements": [{"src": "url", "name": "dmel-all-transcript-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-transcript-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A21", "elements": [{"src": "url", "name": "dmel-all-translation-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-translation-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}, {"destination": {"type": "hdca"}, "collection_type": "list", "name": "A22", "elements": [{"src": "url", "name": "dmel-all-transposon-r6.19.fasta.gz", "url": "ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r6.19_FB2017_06/fasta/dmel-all-transposon-r6.19.fasta.gz", "dbkey": "?", "ext": "auto", "in_place": false, "purge_source": false}]}]}', 'files': [{'file_data': None, '__index__': 0, 'url_paste': None}], 'request_version': '1'}
galaxy.tools.actions.upload DEBUG 2018-01-25 16:25:13,653 Checked and cleaned uploads (0.013 ms)
galaxy.tools.actions.upload_common INFO 2018-01-25 16:25:13,669 tool __DATA_FETCH__ created job id 964
galaxy.tools.actions.upload DEBUG 2018-01-25 16:25:13,690 Created upload job (36.473 ms)
galaxy.tools.execute DEBUG 2018-01-25 16:25:13,690 Tool [__DATA_FETCH__] created job [964] (38.160 ms)
galaxy.tools.execute DEBUG 2018-01-25 16:25:13,698 Executed 1 job(s) for tool __DATA_FETCH__ request: (67.842 ms)
127.0.0.1 - - [25/Jan/2018:16:25:13 +0200] "POST /api/tools/fetch HTTP/1.1" 200 - "http://127.0.0.1:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"
galaxy.jobs DEBUG 2018-01-25 16:25:13,853 (964) Working directory for job is: /Users/mvandenb/src/galaxy/database/jobs_directory/000/964
galaxy.jobs.handler DEBUG 2018-01-25 16:25:13,856 (964) Dispatching to local runner
galaxy.jobs DEBUG 2018-01-25 16:25:13,883 (964) Persisting job destination (destination id: local)
galaxy.jobs.runners DEBUG 2018-01-25 16:25:13,888 Job [964] queued (31.298 ms)
galaxy.jobs.handler INFO 2018-01-25 16:25:13,894 (964) Job dispatched
galaxy.jobs.command_factory INFO 2018-01-25 16:25:14,045 Built script [/Users/mvandenb/src/galaxy/database/jobs_directory/000/964/tool_script.sh] for tool command [python '/Users/mvandenb/src/galaxy/lib/galaxy/tools/data_fetch.py' --galaxy-root '/Users/mvandenb/src/galaxy' --datatypes-registry '/Users/mvandenb/src/galaxy/database/jobs_directory/000/964/registry.xml' --request-version '1' --request '/Users/mvandenb/src/galaxy/database/jobs_directory/000/964/tmpzv7eVM']
galaxy.jobs.runners DEBUG 2018-01-25 16:25:14,081 (964) command is: rm -rf working; mkdir -p working; cd working; /Users/mvandenb/src/galaxy/database/jobs_directory/000/964/tool_script.sh; return_code=$?; cd '/Users/mvandenb/src/galaxy/database/jobs_directory/000/964';
[ "$GALAXY_VIRTUAL_ENV" = "None" ] && GALAXY_VIRTUAL_ENV="$_GALAXY_VIRTUAL_ENV"; _galaxy_setup_environment True
python "/Users/mvandenb/src/galaxy/database/jobs_directory/000/964/set_metadata_oXzAqd.py" "/Users/mvandenb/src/galaxy/database/jobs_directory/000/964/registry.xml" "/Users/mvandenb/src/galaxy/database/jobs_directory/000/964/working/galaxy.json"  5242880; sh -c "exit $return_code"
galaxy.jobs.runners.local DEBUG 2018-01-25 16:25:14,097 (964) executing job script: /Users/mvandenb/src/galaxy/database/jobs_directory/000/964/galaxy_964.sh
galaxy.jobs DEBUG 2018-01-25 16:25:14,105 (964) Persisting job destination (destination id: local)
127.0.0.1 - - [25/Jan/2018:16:25:14 +0200] "GET /api/jobs/948a5d82d737f13d HTTP/1.1" 200 - "http://127.0.0.1:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36

....

galaxy.jobs.runners.local DEBUG 2018-01-25 16:26:33,915 execution finished: /Users/mvandenb/src/galaxy/database/jobs_directory/000/964/galaxy_964.sh
galaxy.jobs.output_checker INFO 2018-01-25 16:26:33,920 Job 964: Fatal error: Exit code 1 ()
galaxy.jobs.output_checker DEBUG 2018-01-25 16:26:33,920 Tool exit code indicates an error, failing job.
galaxy.tools.parameters.output_collect INFO 2018-01-25 16:26:33,984 <galaxy.tools.parameters.output_collect.NullToolProvidedMetadata object at 0x115da5e10>
galaxy.jobs INFO 2018-01-25 16:26:34,005 Collecting metrics for Job 964
galaxy.jobs DEBUG 2018-01-25 16:26:34,037 job 964 ended (finish() executed in (106.700 ms))

Looking the job up in the database I see this in the stderr:

Fatal error: Exit code 1 ()
Traceback (most recent call last):
  File "/Users/mvandenb/src/galaxy/lib/galaxy/tools/data_fetch.py", line 317, in <module>
    main()
  File "/Users/mvandenb/src/galaxy/lib/galaxy/tools/data_fetch.py", line 42, in main
    galaxy_json = _request_to_galaxy_json(upload_config, request)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/tools/data_fetch.py", line 52, in _request_to_galaxy_json
    fetched_target = _fetch_target(upload_config, target)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/tools/data_fetch.py", line 175, in _fetch_target
    elements = elements_tree_map(_resolve_src, items)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/tools/data_fetch.py", line 203, in elements_tree_map
    new_items.append(f(item))
  File "/Users/mvandenb/src/galaxy/lib/galaxy/tools/data_fetch.py", line 97, in _resolve_src
    name, path = _has_src_to_path(item)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/tools/data_fetch.py", line 231, in _has_src_to_path
    path = sniff.stream_url_to_file(url)
  File "/Users/mvandenb/src/galaxy/lib/galaxy/datatypes/sniff.py", line 44, in stream_url_to_file
    page = urlopen(path)  # page will be .close()ed in stream_to_file
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1413, in ftp_open
    fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1435, in connect_ftp
    persistent=False)
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 877, in __init__
    self.init()
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 886, in init
    self.ftp.connect(self.host, self.port, self.timeout)
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ftplib.py", line 135, in connect
    self.sock = socket.create_connection((self.host, self.port), self.timeout)
  File "/usr/local/Cellar/python/2.7.13_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 575, in create_connection
    raise err
urllib2.URLError: <urlopen error ftp error: [Errno 60] Operation timed out>

Turns out this is also broken through the regular upload, I will check if this is a regression or not.

@mvdbeek
Member

mvdbeek commented Jan 25, 2018

(I'm behind a proxy, this upload issue probably has nothing to do with the PR, except for the missing indication that something went wrong)

@mvdbeek
Member

mvdbeek commented Jan 25, 2018

For the collection building from history items:

  • If you select pre-existing collections as input and try to create a new collection from this I would expect to get a list:list collection, but instead I get a flat list containing only the first element of each collection
  • If you mix collections and HDAs (not that I would know the correct result of this) you see a warning "Unknown error encountered [object Object]"
  • Creating nested collections from HDAs works like a charm, very nice!

@jmchilton
Member Author

Thanks for trying it out @mvdbeek!

My first instinct is to edit the table right away -- can we do that ?

This is the instinct I'm trying to break. I'm not trying to build a crappy Excel - I'm trying to force people into writing programs that can be recorded and rerun for munging their data and that scale to arbitrary numbers of rows. In that spirit of pretending I know better than users I explicitly disable editing. At a lower level the way this works is that every time a rule is added or edited, everything is regenerated from the initial data. If there are manual edits in the middle, the rules can't be reapplied and this breaks down.
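A minimal Python sketch of that replay model, for illustration only: the real component is Vue/JavaScript, and the rule names here (`add_filename_column`, `keep_rows_containing`) are hypothetical, not the builder's actual rule set. The point is that the table is always recomputed from the untouched initial rows, so there is no place for a manual edit to live.

```python
from typing import Callable, List

Row = List[str]
Rule = Callable[[List[Row]], List[Row]]


def add_filename_column(source_col: int) -> Rule:
    """Hypothetical rule: append the last path segment of a URL column."""
    def rule(rows: List[Row]) -> List[Row]:
        return [row + [row[source_col].rsplit("/", 1)[-1]] for row in rows]
    return rule


def keep_rows_containing(col: int, text: str) -> Rule:
    """Hypothetical rule: keep only rows whose column contains `text`."""
    def rule(rows: List[Row]) -> List[Row]:
        return [row for row in rows if text in row[col]]
    return rule


def apply_rules(initial_rows: List[Row], rules: List[Rule]) -> List[Row]:
    """Rebuild the whole table from the initial data on every call."""
    rows = [list(row) for row in initial_rows]  # copy, never mutate the input
    for rule in rules:
        rows = rule(rows)
    return rows


# Placeholder URLs, not taken from the PR.
initial = [
    ["ftp://example.org/data/a.fasta.gz"],
    ["ftp://example.org/data/b.txt"],
]
rules = [add_filename_column(0)]
table = apply_rules(initial, rules)

# Adding a rule replays the full list against the original rows.
rules.append(keep_rows_containing(1, "fasta"))
table = apply_rules(initial, rules)
```

Because only `initial` and `rules` are stored, the same rule list could in principle be replayed elsewhere (for example on the backend) against a larger input.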

That said - how about I offer an olive branch to this instinct to edit - since both you and Anton had the same instinct. As long as there are no rules applied to data - so when you first land on the page - you can edit and I'll record it as a pseudo rule "Manually Edited Data" and note that it isn't reproducible or exportable. Then as soon as rules start being applied to the data the table becomes read-only and I add some mouseover that explains why it is read-only.

Another way this could work is we keep the rule editor the same - read-only - and I could implement your other idea of allowing pasting directly into a spreadsheet, but we could do that on the previous page during upload. So instead of pasting into the textbox we figure out how to paste into a spreadsheet and then allow editing from there. As soon as the build button is clicked and the data is brought over to the rule builder, things are frozen. This would make it very clear what is input and what is the set of rules the user is building - and that inputs are modifiable and the rules are reproducible/executable on the backend/etc....

Also the editor could have a drop target for tables, right ?

I've added that as an enhancement to #5380.

We should be able to scroll horizontally in the table and/or resize the columns, my first try was hard to edit, and I have a high-resolution screen (see the screenshot)

I feel like this sometimes works for me and sometimes doesn't. I can spend some more time working on the stylesheet and figure out what is wrong. I've added this as a bug to #5380.

If you enter an invalid regex you have to restart the rule-building (you need to reload the whole page); this isn't obvious if you don't look in the JS console

I thought I had tested this but clearly not. I'll definitely fix that. I've added this as a bug to #5380.
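One way to avoid the reload, sketched in Python under stated assumptions: the builder itself is JavaScript, and `compile_user_regex` and `apply_regex_rule` are hypothetical helpers, not functions from this PR. The idea is to validate the pattern before touching the table and return a message the UI can display, leaving the existing rows untouched on failure.

```python
import re


def compile_user_regex(pattern):
    """Return (compiled, None) on success or (None, message) on failure."""
    try:
        return re.compile(pattern), None
    except re.error as exc:
        return None, "Invalid regular expression: %s" % exc


def apply_regex_rule(rows, col, pattern):
    """Append the first capture group of `pattern` as a new column.

    On an invalid pattern the original rows are returned unchanged along
    with an error message, so the caller can show it and keep going.
    """
    compiled, error = compile_user_regex(pattern)
    if error is not None:
        return rows, error
    new_rows = []
    for row in rows:
        match = compiled.search(row[col])
        value = match.group(1) if match and compiled.groups >= 1 else ""
        new_rows.append(row + [value])
    return new_rows, None


rows = [["dmel-all-CDS-r6.19.fasta.gz"]]
good_rows, good_error = apply_regex_rule(rows, 0, r"dmel-all-(\w+)-r")
bad_rows, bad_error = apply_regex_rule(rows, 0, "(unclosed")
```

Here `good_rows` gains a second column holding `CDS`, while the unbalanced pattern yields `bad_rows` equal to the input plus a non-empty `bad_error`.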

I can see that a job launched in my logs, but it failed and there were no visual indications of this in the UI, this is the log:

There is a discussion of this in #5380 already - the cardinal sin like always is that Galaxy doesn't represent jobs (or workflows) in the history panel so these errors aren't seen. If you keep the modal dialog open it will poll on the job and report errors right in the panel I think - at least it did for me at various times.

Short of putting jobs in the history panel there are two fixes we could apply - we could either add an explicit output to the data fetching tool added in #5220 the way some tools do that could summarize what happened and that would turn red on failure -or- we could do some harder work and update the new data fetch API to prebuild skeletons for all datasets and collections that need to be created before the job runs (right now these are just discovered after the job is complete). The log file approach has always struck me as kind of hacky - but might be useful in other consumers of this API so who knows. Pre-creating everything sounds great - but is tricky to implement and could result in a lot of pointless red stuff in the history if things fail. We could also blend the approaches and sort of discover a log file if the job fails.

I'm not sure how to proceed here - happy to hear advice though.

@jmchilton
Member Author

If you select pre-existing collections as input and try to create a new collection from this I would expect to get a list:list collection, but instead I get a flat list containing only the first element of each collection
If you mix collections and HDAs (not that I would know the correct result of this) you see a warning "Unknown error encountered [object Object]"

I didn't even start to think about non-HDAs. I'll report an error if such things are used for now - obviously down the road some combinations of these things will be super powerful and super awesome though.

@guerler
Contributor

guerler commented Feb 13, 2018

This is great. I am wondering if we can consolidate the other collection builders e.g. in the Collection uploader tab, the history panel and this one into a single collection builder ui?

@jmchilton force-pushed the rule_based_uploader branch 2 times, most recently from 7470631 to 0626296, February 21, 2018 17:35
@jmchilton
Member Author

I am wondering if we can consolidate the other collection builders e.g. in the Collection uploader tab, the history panel and this one into a single collection builder ui?

The collection tab is unified with the history panel in the sense that they both launch the same components after the files or datasets are selected. That initial step of selecting datasets or files is pretty different though. This makes sense to me - in one case the datasets are in the history and in the other case the datasets don't yet exist in your history. I guess one could eliminate the upload collection tab and require files be added to the history first - but that is how it was in the past, and many people called it confusing and it resulted in a lot of extra error-prone clicking that can be avoided by skipping the history step.

One could imagine replacing the existing collection builders (the things after the initial upload or after things are selected in the history) with this work - because indeed this can do a superset of what they can do, but I'd argue we shouldn't do that. This just isn't as simple as the paired-end list builder or the list builder when building small lists (let's say a few dozen items or fewer). While I really think we should work toward making this as simple as possible - this approach will always be more complex I think and so it is targeted at more technically sophisticated users. From the above description:

Another thing to keep in mind, I think this is a power user feature like notebooks - I'll try to add lots of in-app help and documentation - but ultimately users are creating a program for defining how to input data into Galaxy. I think this is a good direction to move parts of the GUI - not just serving users without programming knowledge but providing paths to learn and incentivize learning these skills. That said the target audience is a bit different and so hopefully reviewers are on board for that vision also.

The other thing to keep in mind is that this new option is not a collection upload option necessarily - since you can use rules to upload individual datasets. Check out the first example in the tutorial I'm building https://github.com/jmchilton/training-material/blob/rules/topics/introduction/tutorials/galaxy-intro-rules/tutorial.md for instance. I use uploading datasets as a hook into defining rules and such.

That is the pushback - where I might agree is say in the history panel - it would be nice to have a "create collection" option for instance that opens up a dialog or something like the new viz panel that describes the 4 different ways to build a collection in detail instead of having the 4 individual options. Something along the lines of what @martenson did for libraries #5080 - but maybe big buttons like in the viz instead of a dropdown - and then we could synchronize the help language across libraries and histories and maybe add some images and such that can be shared throughout?

The history => collection builder would then be a two step process and everyone explicitly shot down wizards a long time ago when we first started talking about this. Integrating this into the uploader the same way would be a three step process - definitely a wizard. It seems like multi-step wizard-ish interfaces would give us more room to explain things and synchronize UI elements - so I think I'm in favor of doing that but it was explicitly shot down by the powers that be in the past.

@jmchilton force-pushed the rule_based_uploader branch 2 times, most recently from f57a245 to 99a93f7, March 1, 2018 18:32
@jmchilton
Member Author

@mvdbeek I have updated this using #5609 which causes datasets and collections to be pre-created during job submission and to refresh the history just after the job is submitted. So if there is some runtime issue with creating collections or datasets there will now be a red collection or dataset.

The job details are not available in the GUI for direct output collections - that would be really helpful for end users trying to debug problems I think - this is a sort of general Galaxy issue. I may take a crack at this generally outside the context of this PR but it would enhance this PR.

@nsoranzo
Member

nsoranzo commented Mar 9, 2018

@jmchilton Can you rebase? After the merge of #5220 this has a few conflicts.

@jmchilton force-pushed the rule_based_uploader branch 3 times, most recently from c641aba to 6049021, March 22, 2018 16:51
jmchilton and others added 6 commits March 29, 2018 11:06
Vue-based component for defining collections by applying rules to a list of files or more general spreadsheet style information (e.g. sample sheets or tabular data from data sources containing URL or FTP file paths for files along with metadata). The widget is fairly complex but very broadly is broken into two panes - one to preview how rules are applied to build up tabular data defining collections (each row corresponding to a file with columns for metadata and such) and one that displays defined rules and allows for editing of these rules and creation of new ones.

The goal behind defining rules this way instead of allowing the user to interact with the spreadsheet display directly is to enable scaling up collection creation. If a user wishes to upload hundreds of datasets - interacting with a widget directly for each input doesn't scale well and would be error prone. If a user wishes to upload hundreds of thousands of datasets - even loading this information in the GUI may not scale (though I've been impressed with the performance so far of this approach) and so we can potentially just display a preview of some of the rows and process the final set of rules on the backend.

Since we can handle an arbitrary number of columns this way, we can define multiple list identifiers per file and so we can easily construct nested lists. Hence this allows creation of not just potentially larger collections but arbitrarily complex lists as well. Paired identifiers via indicator columns are also implemented.
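To make the nesting idea concrete, here is a small Python sketch (an illustration, not code from this PR; `build_nested`, `collection_type_for`, and the example rows are hypothetical). Each identifier column contributes one level of nesting, so two identifier columns describe a `list:list` collection.

```python
def build_nested(rows, identifier_cols, value_col):
    """Group rows into a nested dict, one level per identifier column."""
    tree = {}
    for row in rows:
        node = tree
        # Walk (or create) the outer levels, then store the leaf value.
        for col in identifier_cols[:-1]:
            node = node.setdefault(row[col], {})
        node[row[identifier_cols[-1]]] = row[value_col]
    return tree


def collection_type_for(identifier_cols):
    """One 'list' level per identifier column, e.g. 'list:list'."""
    return ":".join(["list"] * len(identifier_cols))


# Hypothetical sample sheet: condition, replicate, location.
rows = [
    ["treated", "rep1", "ftp://example.org/t1.fastq"],
    ["treated", "rep2", "ftp://example.org/t2.fastq"],
    ["control", "rep1", "ftp://example.org/c1.fastq"],
]
nested = build_nested(rows, identifier_cols=[0, 1], value_col=2)
ctype = collection_type_for([0, 1])
```

With a single identifier column the same function produces a flat list, which is why larger and deeper collections fall out of the same table representation.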

In order to operate over lists of datasets directly - the multi-select history widget now has a new option "Build Collection from Rules" alongside the other collection builders. This mode uses the well established dataset collection API to build collections from HDAs.

In order to operate on lists of FTP files or URLs - the upload widget has a new "Rule-based" tab that allows users to paste in tabular data or select a history dataset and then send this tabular data to the new builder widget. This will be extended to include FTP directories for instance over time. This mode uses the new data fetch API to build collections and handle uploads of arbitrary collections of files.

The preview of the tabular data generated via rules is done via [Handsontable](https://handsontable.com/) - a JavaScript spreadsheet widget with a VueJS [wrapper component](https://github.com/handsontable/vue-handsontable-official). This turns out to be a fairly nice application for reactive components - as rules are added or modified the spreadsheet just naturally updates. In my hands the widget scales very nicely - I've uploaded files with tens of thousands of rows and rules modifying the data and changing the spreadsheet do not seem to cause significant delays in the web browser.
…lation without a vue runtime for nested local components in vue SFCs, and other situations.
@martenson
Member

I think it is ready for the big stage. The majority of things work as expected, and besides a few validation problems I encountered nothing that would stop me from using this efficiently. I will point out only one thing here: I think the editor/rule builder needs more canvas, the modal is too much of a size constraint.

It is a very exciting feature @jmchilton and I think many people will love this! Thank you for your review @mvdbeek.

I will put this on https://test.galaxyproject.org and hopefully @nekrut @blankenberg @jgoecks and other power users will give it a spin so we can see what it can do in a real setting.

@martenson martenson merged commit e552bc4 into galaxyproject:dev Mar 31, 2018
@martenson
Member

I deployed the new rule-based dataset/collection uploader on https://test.galaxyproject.org/
Testing would be much appreciated since this will probably be one of the headliners for 18.05/GCC.

p.s. sorry for leaving this open for so long @jmchilton 😞

@mblue9
Contributor

mblue9 commented Apr 1, 2018

This is great!!! 👏 🎉

I've only played with it a small bit so far but it works beautifully for importing the counts files I need for a tutorial I'm working on (here: https://www.bioconductor.org/help/workflows/RNAseq123/)

e.g. if I paste this into the Rule-builder

10_6_5_11	LP	L004	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545535/suppl/GSM1545535_10_6_5_11.txt.gz
9_6_5_11	ML	L004	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545536/suppl/GSM1545536_9_6_5_11.txt.gz
purep53	Basal	L004	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545538/suppl/GSM1545538_purep53.txt.gz
JMS8-2	Basal	L006	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545539/suppl/GSM1545539_JMS8-2.txt.gz
JMS8-3	ML	L006	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545540/suppl/GSM1545540_JMS8-3.txt.gz
JMS8-4	LP	L006	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545541/suppl/GSM1545541_JMS8-4.txt.gz
JMS8-5	Basal	L006	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545542/suppl/GSM1545542_JMS8-5.txt.gz
JMS9-P7c	ML	L008	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545544/suppl/GSM1545544_JMS9-P7c.txt.gz
JMS9-P8c	LP	L008	ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545545/suppl/GSM1545545_JMS9-P8c.txt.gz

I can import the counts files from GEO directly into collections for the groups Basal/LP/ML really easily.
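For readers curious what such a grouping amounts to, the sketch below turns a tab-separated sheet like the one above into one list collection per group. It is only an illustration: `targets_from_sheet` is a hypothetical helper, and the target fields simply mirror a subset of the `request_json` shown in the job log earlier in this thread (`destination`, `collection_type`, `name`, `elements` with `src`/`url`/`dbkey`/`ext`); whether the rule builder emits exactly this, or which fields the fetch API requires, is an assumption.

```python
import json
from collections import OrderedDict


def targets_from_sheet(text, name_col=0, group_col=1, url_col=3):
    """Build one 'list' collection target per value of the group column."""
    groups = OrderedDict()
    for line in text.strip().splitlines():
        fields = line.split("\t")
        element = {
            "src": "url",
            "name": fields[name_col],
            "url": fields[url_col],
            "dbkey": "?",   # values copied from the logged request
            "ext": "auto",
        }
        groups.setdefault(fields[group_col], []).append(element)
    return [
        {
            "destination": {"type": "hdca"},
            "collection_type": "list",
            "name": group,
            "elements": elements,
        }
        for group, elements in groups.items()
    ]


# Three rows from the sheet above, joined with tabs.
sheet = "\n".join("\t".join(row) for row in [
    ["10_6_5_11", "LP", "L004",
     "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545535/suppl/GSM1545535_10_6_5_11.txt.gz"],
    ["9_6_5_11", "ML", "L004",
     "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545536/suppl/GSM1545536_9_6_5_11.txt.gz"],
    ["JMS8-4", "LP", "L006",
     "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1545nnn/GSM1545541/suppl/GSM1545541_JMS8-4.txt.gz"],
])
targets = targets_from_sheet(sheet)
print(json.dumps({"targets": targets}, indent=2))
```

In this sketch the lane column (index 2) is ignored; it could just as well serve as a second identifier for a nested collection.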

😄 Yay!!!

Would be great to be able to add hashtags to save on doing that after importing, e.g. add hashtags to the collections using the group column (Basal/LP/ML). Then it would be super quick to go from grabbing the counts, to groups ready for the next step in the tutorial - differential expression with limma-voom.

I have a few other small feedback points (should I put them here?):

  • could you have a "Close" (or exit) button on the Rule-based tab, same as on the other tabs (Regular/Composite/Collection)
  • instead of "Okay" button call it "Apply" ?
  • could you save last Definitions/Rules used, so if you want to redo/retest you don't have to specify all again
  • I couldn't see how to remove a Definition I've added, e.g. see below (could there be a small "x" or something like that beside the "A" in that screenshot?):

[Screenshot: screen shot 2018-04-01 at 1 54 45 pm]

  • could you have a "Reset" button on the screen below, as it would be good for clearing mistakes (like when you accidentally paste the contents of a bed file instead of the samplesheet):

[Screenshot: screen shot 2018-04-01 at 12 55 09 pm]

  • I tried to run a few tools (e.g. Cut) on the collections that were created but got the metadata error below:

[Screenshot: screen shot 2018-04-01 at 7 29 29 pm]

Thanks for this 😄

@jmchilton
Member Author

@mblue9 Thanks for the quick feedback - I'll try to get at least some of this implemented this week. I've tracked a lot of these issues on a new issue here (#5822). The one about using previous rules is a little more intricate so I put it on a longer term tracking issue here #5381. I'll think about the metadata issue too - not sure what caused that. The RNA seq example looks exciting!

@mblue9
Contributor

mblue9 commented Apr 1, 2018

Thanks to you @jmchilton for working on this! Just to say, the previous rules thing is only a nice-to-have for me at this stage, I'm not that fussed about that one atm. Much more interested in #5381 (comment)
(I'm excited by that RNA-seq example too, as I have been planning to write a limma-voom/Glimma tutorial for a while and imho it will be even better now that there's this cool tool 😄 )
