This repository has been archived by the owner on Oct 12, 2020. It is now read-only.
A biological analysis is sometimes more appropriately called a pipeline, because it generally consists of many steps that use many different software tools and data formats. These analysis pipelines are becoming very complex and usually make use of many bash/perl scripts. For people like me who don't know much bash or perl, these scripts can be really hard to understand.
What is important in these pipelines? To list what comes to my mind:
use the command line
manipulate files
use regular expressions
visualize results
report results
I think we can do each of these operations in R.
And I think we should.
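To make that concrete, here is a minimal sketch of the five operations using only base R. It is an illustration rather than the code from the tutorial: the sample names, counts and file names are made up, and calling `Rscript` is just a stand-in for whatever external tool a real pipeline would run.

```r
# Hypothetical toy data: the names and counts below are invented for illustration.
input_file <- tempfile(fileext = ".txt")

# Manipulate files: write a small text file and read it back.
writeLines(c("sample_A 12", "sample_B 7", "control_1 3"), input_file)
lines <- readLines(input_file)

# Use the command line: system2() runs an external program and captures its output.
# Rscript stands in here for any command-line tool used in a pipeline.
rscript <- file.path(R.home("bin"), "Rscript")
cmd_out <- system2(rscript, c("-e", shQuote("cat(1+1)")), stdout = TRUE)

# Use regular expressions: split each line into an id and a count,
# then flag the lines whose id starts with "sample_".
ids <- sub(" .*$", "", lines)
counts <- as.integer(sub("^[^ ]+ ", "", lines))
is_sample <- grepl("^sample_", ids)

# Visualize results: draw a bar plot into a PDF file.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
barplot(counts, names.arg = ids, ylab = "count")
dev.off()

# Report results: save a summary table (in a notebook, printing it would be enough).
report_file <- tempfile(fileext = ".csv")
summary_table <- data.frame(id = ids, count = counts, is_sample = is_sample)
write.csv(summary_table, report_file, row.names = FALSE)
```

In a notebook, each of these steps would sit in its own chunk next to its output, so the code and the results stay together.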
The main reason is to put your whole analysis in a single notebook that holds all your code, results and possibly some writing. Using notebooks is good practice and makes it possible to have a fully reproducible analysis, which will be a standard in years to come. Another reason is simply that it's easier!
In this tutorial, I'll show an example of a moderately complex analysis of the 1000 Genomes data, all in R.
You can find the first version of the tutorial there.