IUC Contribution Fest - RADSeq Tools and Workflows #422

Closed
tiagoantao opened this issue Nov 27, 2015 · 85 comments

Comments

@tiagoantao
Contributor

We are planning a Remote Contribution Fest on the 7th and 8th of March for developers to work on Galaxy RADSeq tools.

RADSeq is a cheap sequencing technology used by many resource-limited groups, who would benefit a lot from easy-to-use Galaxy tools. Indeed, there has been quite some interest in analyzing RADSeq data with Galaxy. Currently there is a wrapper for stacks and little else to help with RAD-specific analysis (though many other Galaxy tools are useful with RADs: bwa, cap3, gatk, velvet, ...).

If you are interested in participating in the hackathon but not in actual tool development, we will assemble a list of smaller, manageable Python and JavaScript tasks to work on. Documentation is also chronically lacking for collections, so we could use help there, and no actual coding would be required.

We encourage ideas or advice about how to organize this, so please let us know. A core group will be available on IRC all day, and we will have Google Hangouts across those days to organize, answer questions, and report progress.

We will do our best to coordinate and make this hackathon a pleasant and productive experience. We would especially like to focus on working reasonable hours and to discourage overnighters.

All forms of contribution are welcome!

Google hangout link for the hackathon:
https://hangouts.google.com/call/iiksaostt5dwfms4lph6nuj74ye

Original idea from @bgruening (galaxyproject/tools-devteam#26 (comment)). See #299 and #239 for information on the last two hackathons. This text is inspired by #299.

Things to work on

This list will be updated as comments are made on the issue. You are strongly encouraged to leave your suggestions, opinions, or criticism below.

Note that while the list is currently stacks-oriented, there is no reason for that to be the final list of things to do. Non-stacks-based suggestions are strongly encouraged.

  • Existing stacks plugin (https://github.com/genouest/galaxy-tools/tree/master/tools/stacks)
    • Support for data collections
    • cap3 integration
    • support multi-threading (GALAXY_SLOTS)
    • upgrade the version of stacks
    • support clone_filter
  • Implement stacks workflows
  • Check the feasibility of implementing the stacks web interface on Galaxy
  • have a process_radtags-like tool with a configurable enzyme sequence (i.e. allow demultiplexing using a configurable sequence)
  • PyRAD
  • dDocent
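
To make the "configurable sequence" demultiplexing item above more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how process_radtags works internally: barcodes are assumed to be inline at the start of each read, matching is exact (no mismatch rescue), and the function name `demultiplex` and its arguments are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes, cut_site):
    """Assign reads to samples by inline barcode followed by a cut-site sequence.

    reads:    iterable of (read_id, sequence) tuples
    barcodes: dict mapping barcode sequence -> sample name
    cut_site: sequence expected right after the barcode (user-configurable)

    Returns a dict mapping sample name -> list of (read_id, trimmed_sequence).
    Reads that match no barcode + cut-site pair are collected under None.
    """
    cut_site = cut_site.upper()
    by_sample = defaultdict(list)
    for read_id, seq in reads:
        seq = seq.upper()
        for barcode, sample in barcodes.items():
            # Exact match only: an assumption of this sketch.
            if seq.startswith(barcode.upper() + cut_site):
                # Strip the barcode but keep the cut site on the read.
                by_sample[sample].append((read_id, seq[len(barcode):]))
                break
        else:
            # No barcode + cut-site prefix matched this read.
            by_sample[None].append((read_id, seq))
    return dict(by_sample)
```

Calling `demultiplex(reads, {"ACGT": "sample_A"}, cut_site="TGCAGG")` would keep only reads starting with `ACGTTGCAGG` for `sample_A`; a real tool would also need quality handling and tolerance for sequencing errors.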

Reading list

@bgruening
Member

Just wanted to say that this is awesome! I'm in and will try to help wherever I can!

@abretaud
Contributor

abretaud commented Dec 1, 2015

Writing wrappers for PyRAD or dDocent would be cool too

@bgruening
Member

@abretaud I added them to the list, thanks!

@tiagoantao
Contributor Author

@bgruening @abretaud I have added a link to PyRad documentation. I will probably send an email to the author to see if he wants to participate. He is a Python programmer, so he might be interested...

@tiagoantao
Contributor Author

@abretaud @bgruening @yvanlebras It is now the new year and the metagenomics fest is long gone. I was thinking of resurrecting this in the next few days (start preparing, finding potential volunteers, ...)

@abretaud
Contributor

abretaud commented Jan 7, 2016

Hi (and happy new year!)
It looks like a good idea! It's not really clear to me yet what will be needed for the stacks web interface (i.e. a potential IE). I don't have much time these days to look into it, but it would be good to think about it before the hackathon.

@tiagoantao
Contributor Author

@abretaud There is a stacks package, as you probably know: http://toolshed.genouest.org/repos/cmonjeau/stacks . This was developed by @yvanlebras and @cmonjeau.

I am making a few changes myself (which I can make public), namely:

  • Support for GALAXY_SLOTS (multi-threading)
  • Support for collections
  • General version upgrade of tools
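
As a rough illustration of the GALAXY_SLOTS item, here is a minimal Python sketch of how a wrapper script might pick up the thread count. Galaxy exports `GALAXY_SLOTS` in the job environment; everything else here is an assumption for illustration: the helper names are hypothetical, and `--threads` is a placeholder, not an option of any stacks program.

```python
import os


def galaxy_slots(environ=None, default=1):
    """Return the number of threads allotted to this job.

    Reads GALAXY_SLOTS from the environment and falls back to `default`
    when it is unset or not a positive integer.
    """
    environ = os.environ if environ is None else environ
    try:
        slots = int(environ.get("GALAXY_SLOTS", default))
    except (TypeError, ValueError):
        return default
    return slots if slots > 0 else default


def build_command(tool, input_path, environ=None):
    """Build an argument list for a hypothetical tool with a --threads flag."""
    # "--threads" is a placeholder; use whatever option the wrapped tool documents.
    return [tool, "--threads", str(galaxy_slots(environ)), input_path]
```

With `GALAXY_SLOTS=4` in the environment, `build_command("some_tool", "reads.fq")` yields `["some_tool", "--threads", "4", "reads.fq"]`; outside Galaxy it falls back to a single thread.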

I had a local request for supporting clone_filter, which I can work on.

With regard to this plugin, @yvanlebras, would you consider accepting patches? And maybe putting it under tools-iuc?

@yvanlebras
Contributor

Hi @tiagoantao,

For sure we will accept patches and put it under tools-iuc! We will discuss locally in Rennes with @abretaud and @cmonjeau to prepare our participation in this contribution fest.

@tiagoantao
Contributor Author

@yvanlebras Any possibility you would put the code on github (even if not under tools-iuc)? I could submit a PR against it.

@yvanlebras
Contributor

@tiagoantao of course! I have to check with @cmonjeau, because I only see the stacks 1.18 tools there, even though we have a more recent 1.32 version... I will keep you informed!

@yvanlebras
Contributor

@tiagoantao
Contributor Author

@yvanlebras In my version, I have upgraded all the tools. I will start submitting PRs today.

@yvanlebras
Contributor

@tiagoantao Thanks! OK, I will have a look.

@yvanlebras
Contributor

With @lecorguille, Maria Bernard and @abretaud, we have begun setting up things like a dedicated Trello board and a HUBzero space for the remote hackathon to prepare the event.

@tiagoantao
Contributor Author

I had a request for rapture support:
http://genetics.org/content/early/2015/12/22/genetics.115.183665

@tiagoantao
Contributor Author

I am going to drop an email to the authors of PyRad and dDocent about our hackathon.

@yvanlebras
Contributor

Hi Tiago,

Good idea!

I am cc'ing Julian concerning Stacks!

Here in France, we have begun preparing the event! A preliminary description is written on our GUGGO Galaxy working group dedicated space ( https://cesgo.genouest.org/groups/guggo/wiki/RADseqHackathon ).

Don't hesitate to come back to us if we can help!

Cheers,

Yvan

Yvan Le Bras, PhD @Yvan2935 <°))))><
e-Biogenouest project http://www.e-biogenouest.org
CNRS UMR 6074 IRISA-INRIA, Campus de Beaulieu, 35042 Rennes Cedex
Tel.: +33 (0) 2 99 84 71 79 / +33 (0) 6.10.43.96.51
[email protected]

@jpuritz

jpuritz commented Feb 7, 2016

Hi everyone,

I'm happy to help with implementing dDocent and filtering scripts!

Thanks for including dDocent.

@abretaud
Contributor

@bgruening I have seen some PRs about a migration to conda to define tool dependencies: for this hackathon, do you think we should use this new method, or is it better to stick to the good old tool_dependencies.xml method?
In the first case, is there some guide/documentation/howto that we could use?

@bgruening
Member

@abretaud Yes, we should. I will help as much as I can. This will save us so much time and provide us with free testing.

@abretaud
Contributor

abretaud commented Mar 1, 2016

All right, I'm preparing a few slides to introduce this on Monday.

@bgruening
Member

Uh awesome - let me know if you have any questions and if I should prep some packages for Monday and get them into bioconda.

@abretaud
Contributor

abretaud commented Mar 1, 2016

As there is a long todo list for stacks, maybe having it in bioconda before the hackathon would help, if you have some time to do it?

@tiagoantao
Contributor Author

I can add stacks to bioconda.

@abretaud
Contributor

abretaud commented Mar 1, 2016

Great, thanks!

@tiagoantao
Contributor Author

I have added this to bioconda as a PR. While I can merge it myself, I would prefer @bgruening to have a look.

Here: bioconda/bioconda-recipes#925

@yvanlebras
Contributor

@cmonjeau has made PRs for the pyRAD and mawk bioconda recipes. Could you also have a look, @bgruening? ;)
mawk bioconda recipes
pyRAD bioconda recipes

@bgruening
Member

You all rock! This will be a fantastic hackathon with conda dependencies and travis testing all at once!
I commented on the PR!

@Eduardo-Alves

Hi, has anyone started looking into "the feasibility of implementing the stacks web interface on Galaxy"? I started looking into visualization plugins using mako, but I believe there must be a way to do it with JSON/Backbone. Does anyone have some pointers? I am happy to help!

@abretaud
Contributor

abretaud commented Mar 4, 2016

I was thinking about working on an interactive environment (IE) for stacks.
IIRC the web interface allows modifying the data, so using an IE looks like the perfect solution for this (i.e. load the preliminary stacks results into the IE, make changes if you want, and then inject the corrected data into your history).

I'll see next week if I have time to work on this during the hackathon, but your help is very welcome anyhow!

@abretaud
Contributor

abretaud commented Mar 7, 2016

Yes that's it, no more work on genouest github now

@tiagoantao
Contributor Author

@abretaud thanks!

@yvanlebras
Contributor

Is there a way, @jpuritz, you could help us implement dDocent? If we understood correctly, the user normally has to follow "the text based prompts". Which way do you think is best to integrate dDocent into Galaxy?

I'm thinking about using an Interactive Environment to deploy dDocent in a Docker container, interactively choose the parameters using "the text based prompts", then export something like a config file that we can use as input to a Galaxy dDocent tool... Maybe it's totally out of scope...

@bgruening
Member

@yvanlebras Here is a Docker pyrad container, automatically created from conda packages:
docker run -it --rm quay.io/mulled/pyrad:3.0.64--py27_0

https://mulled.github.io/

@yvanlebras
Contributor

Sorry Eduardo, I missed your e-mail...
As the introductory talks were in French, the hangout was not used, but the presentation slides are here: https://cesgo.genouest.org/tags/galhackradfr
Wishing you a good day.
Yvan

@bgruening
Member

@yvanlebras what do you mean here?

@yvanlebras
Contributor

@bgruening Is this about our presentations of tomorrow morning? We gave them in French, so we did not use the hangout, to avoid making English-speaking colleagues listen to us speaking French...

@bgruening
Member

Uh, I was referring to another comment of yours that disappeared. Never mind.
We should all learn Esperanto ;)

@yvanlebras
Contributor

+1 ;) or Deutsch!

@Eduardo-Alves

I speak French! But it's OK, I had the slides, so I read them instead of joining the hangout.

Eduardo

@tiagoantao
Contributor Author

@yvanlebras et al.: Are you planning to add any test data snippets to the stacks tool, for automated testing purposes?

If you have something to test STACKS_sort_read_pairs, I could re-use it with my version without velvet...

tiagoantao reopened this Mar 8, 2016
@tiagoantao
Contributor Author

Sorry, closed by mistake...

@yvanlebras
Contributor

Yes, for sure! For sort_read_pairs, you can use the original STACKS sample data: http://catchenlab.life.illinois.edu/stacks/pe_tutorial/pe_samples.tar.gz

@yvanlebras
Contributor

For other STACKS scripts, I propose some datasets here

@tiagoantao
Contributor Author

@yvanlebras thanks!

@tiagoantao
Contributor Author

@yvanlebras Just one thing: we might have to add some of these to the tool in order to have automated testing...

@yvanlebras
Contributor

@tiagoantao Yes, of course! ;) We are adding data for each tool to enable functional testing...

@bgruening
Member

@tiagoantao We are on it in the latest PR. The problem is that we encountered a few planemo/tools issues that we need to sort out, but we are working on it.

@yvanlebras
Contributor

In France, we don't have anything to eat now... so we must stop here for the day!

A preliminary report of our participation: http://cesgo.genouest.org/groups/guggo/wiki/RADseqHackathon

Don't hesitate to comment / react!

Wishing you a good end of the day!

@bgruening
Member

Thanks a lot @yvanlebras + team! Keep the PRs coming!

@tiagoantao
Contributor Author

When do you guys think the new stacks version will hit the toolshed? Selfish-me here talking, I would like to install it ASAP. ;)

I am finalizing the sort_read_pairs / velvet decoupling, but that can go in later...

@abretaud
Contributor

abretaud commented Mar 9, 2016

There is quite some work left to do. Personally, I will continue to work on this in the coming weeks, but after a week of holidays ;)

@lecorguille
Member

@yvanlebras The dataset you produced for ref_map is a little poor. I get empty TSV files ( https://github.com/galaxyproject/tools-iuc/tree/stacks/tools/stacks/test-data ). Or perhaps I'm too strict with my parameters (population with -n 3 -m 1).
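
A quick way to spot this kind of problem in a test-data directory is to list the TSV files that contain no data rows. The Python sketch below is only an illustration: the function name `empty_tsv_files` is hypothetical, and it assumes that lines starting with `#` are headers or comments rather than data.

```python
import csv
from pathlib import Path


def empty_tsv_files(directory, header_lines=0):
    """Return sorted names of .tsv files in `directory` with no data rows.

    Blank lines and lines whose first field starts with '#' are ignored.
    `header_lines` is the number of remaining rows to treat as a header.
    """
    empty = []
    for path in sorted(Path(directory).glob("*.tsv")):
        with path.open(newline="") as handle:
            rows = [
                row
                for row in csv.reader(handle, delimiter="\t")
                if row and not row[0].startswith("#")
            ]
        if len(rows) <= header_lines:
            empty.append(path.name)
    return empty
```

Running `empty_tsv_files("test-data")` after generating the outputs would list the files worth regenerating with richer input data.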

@yvanlebras
Contributor

It is not your parameters; you're right, the data are just poor. If you need something that generates a non-empty file for this last population step, I need some additional time ;) I will try today... Thanks for pointing this out!

@tiagoantao
Contributor Author

If it takes some time, then I will soon submit a PR with the sort_read_pairs decoupling

@martenson
Member

Closing this issue since this codefest is long finished. Thanks all!

9 participants