-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow len file to be supplied for bedToBigWig converter builtin #18356
Conversation
Test added and dependencies updated to latest versions Use case is VGP assemblies in workflows where the assembly is constantly being improved, so a custom genome would need to be revised for every assembly update. Possible but likely to lead to failures. This way can extract the lengths and be sure they are correct at each workflow run.
How to get hg17.len for the chromfile? Trying with the old dependency versions
ah. inbuilt tests are tricky. Cannot find hg17.len |
Darn. Need to provide an hg17 len file somehow. |
I think this is the file you are looking for:
https://hgdownload.soe.ucsc.edu/goldenPath/hg17/bigZips/hg17.chrom.sizes
Good luck!
Mike
…On Fri, Jun 7, 2024 at 10:52 PM Ross Lazarus ***@***.***> wrote:
Darn. Need to provide an hg17 len file somehow.
Will close this until I figure out what to do...
—
Reply to this email directly, view it on GitHub
<#18356 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABP34ZLQFF2T2NWO5HOBBDZGJWWDAVCNFSM6AAAAABI7PQLVSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJVG43TMNRXGA>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
|
@mschatz
|
An external tool is the way to go, but we will have to leave the old converters in place for reproducibility. We've been aiming to reduce the amount of tools shipped in the codebase, as they're not as easy to provide, are often not relevant to non-biology domains and the whole dbkey/genome/build system is something that doesn't work too well when working with new and evolving assemblies. |
@mvdbeek: How about a single generic tool to combine the converters for bed/bam/cram/gff - those are the things the VGP will want - to bigwig? They have bed and gff with 6M rows when what they really want (from Arang's work) is the density with some stats. |
I think the tried and true thing is to orient yourself on the upstream command you're using. If as a hypothetical command you've got |
@mvdbeek: Ok, but remember there's a lot of boilerplate fixing to do for multiple converters to meet IUC standards - those things are old. Was looking at a slight change to Update: It only tests bed and that particular converter is probably broken unless GFF now has the same order of fields (chr,start,end,score) as bed/bedgraph? It was never thus AFAIK - hence the script in that converters directory gff_to_bed_converter.py. |
Submitting as a draft for comment: Should this converter be moved out of the
lib/galaxy/datatypes/converters
built-in tools into an IUC tool?In the long run, if Galaxy is to become a generic platform for all kinds of open science, all the built in genomic metadata and converters will eventually become pluggable, like everything else?
Currently the tool works if the reference has a built-in or custom genome dbkey.
Use case is VGP workflows where the reference assembly is constantly being improved until it is finished. A custom genome could work but would need to be manually revised for every assembly update. Possible that this would always be done manually, but fragile and likely to lead to failures from a missing contig in a new assembly so this PR allows a chromosome length file to be supplied. Test added and dependencies updated to latest versions
This way can generate the len file and be confident that the contigs and lengths are correct at each workflow run.
How to test the changes?
(Select all options that apply)
License