GitHub - ashipunov/Ripeline: R pipeline: from DNA sequences to phylogeny trees


Repository contents:

13_wanted
20_sets
30_alignments
31_alignments_trimmed
32_alignments_trimmed_gapcoded
40_concatenated
50_technical_trees
70_raxml_working
80_mrbayes_working
99_trees
03_checks.r
03_checks_rresults.txt
04_duplicated_ids_check.r
04_duplicated_ids_check_rresults.txt
13_make_wanted.r
13_make_wanted_rresults.txt
20_make_sets.r
20_make_sets_rresults.txt
30_align.r
30_align_rresults.txt
31_trim.r
31_trim_rresults.txt
32_gapcode.r
32_gapcode_rresults.txt
40_concatenate_and_stat.r
40_concatenate_and_stat_rresults.txt
51_make_r_raw_kmer_trees.r
51_make_r_raw_kmer_trees_rresults.txt
52_make_r_semistrict_kmer_tree.r
52_make_r_semistrict_kmer_tree_rresults.txt
53_make_r_nj_single_marker_trees.r
53_make_r_nj_single_marker_trees_rresults.txt
61_make_r_mp_semistrict_tree.r
61_make_r_mp_semistrict_tree_rresults.txt
71_make_r_ml_modeltest.r
71_make_r_ml_modeltest_rresults.txt
72_make_r_ml_trees.r
72_make_r_ml_trees_rresults.txt
73_make_raxml_trees.r
73_make_raxml_trees_rresults.txt
81_make_mrbayes_semistrict_tree.r
81_make_mrbayes_semistrict_tree_rresults.txt
README
README_files
_kubricks_dna.txt
_kubricks_dna_c.txt
_kubricks_sp.txt
_kubricks_sp_c.txt
_kubricks_treesp.txt
_kubricks_treesp_c.txt
make_all
make_check
make_data
make_gzip
make_model_test
make_technical_trees
make_trees_semistrict


RIPELINE is an R-based sequence analysis pipeline

***HOW TO INSTALL RIPELINE***

All platforms:
=============

You will need:

R with a working Rscript command (check it by typing "Rscript" in the
terminal window; it should print short usage instructions)

shipunov R package; after installation, load it and type "Rresults()" in
the R window; this will print instructions on how to install the
"Rresults" command. In principle, Ripeline works with the basic "Rscript"
(without "Rresults"), but in that case you will need to modify the
"make_*" shell scripts

The following R packages: ips, ape, kmer

MrBayes installation which contains the "mb-mpi" executable working in
the terminal (check it by typing "mb-mpi" in the terminal window)

RAxML installation; type "raxmlHPC" in the terminal to see if this
executable works

MUSCLE installation (optionally also MAFFT and ClustalO); type "muscle"
in the terminal to see if it works

NOTE (especially for Windows users): if for any reason "mb-mpi" and
"raxmlHPC" do not work, Ripeline will still run (probably with some
messages), but Bayesian and RAxML trees will not appear. If you have RAxML
and MrBayes installed under different executable names, please change the
names of the executable files by editing the corresponding R scripts (they
are simple text files). However, Ripeline will _not_ work without
"muscle".
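
The checks listed above can be sketched as one small shell script. It
uses only the executable and package names mentioned in this README; the
function name "check_tools" is made up for illustration and is not part
of Ripeline.

```shell
# Report which of the given executables can be found on PATH.
# Returns 1 if at least one of them is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Executable names as given in this README; "|| true" keeps the script
# going so that the full report is printed even when something is missing.
check_tools Rscript muscle mb-mpi raxmlHPC || true

# If Rscript is present, also report whether the R packages can be loaded
# (prints the package name followed by TRUE or FALSE).
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e 'for (p in c("shipunov", "ips", "ape", "kmer")) cat(p, requireNamespace(p, quietly = TRUE), "\n")'
fi
```

Per the note above, a missing "mb-mpi" or "raxmlHPC" only means that the
corresponding trees will not appear, while a missing "muscle" or
"Rscript" has to be fixed before running Ripeline.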

ANOTHER NOTE: all scripts are simple text files, so on Windows and
(probably) on macOS you will need a plain text editor to work with
them. There are many, but few are fully cross-platform and at the same
time feature-rich and simple enough for non-programmers; examples of
such editors are "Kate" and "Geany".

Windows:
========

Install the bash UNIX shell, e.g., from https://cygwin.com/install.html

(Optional) Install a good terminal application, e.g., ConEmu or cmder
(please google these names)

macOS and Linux:
================

No additional installations required

***HOW TO USE RIPELINE***

Run the example:
================

In the terminal, make the Ripeline directory current, e.g., enter in
the terminal window something like "cd ~/some/dir/ripeline"

Then run the shell script "make_all", e.g., enter in the terminal window
"bash ./make_all"

All parameters (like bootstrap replicates) are set to minimal values, so
on an Intel Core i5 with SSD and 8 GB memory it runs in approximately
2 min

Check the phylogeny trees, which will appear in the directory named
"99_trees"
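
The steps so far can be sketched as a small shell function. The function
name "run_ripeline" is hypothetical, and the path in the usage line is
the same placeholder as above; the only Ripeline names used are
"make_all" and "99_trees".

```shell
# Run make_all inside a Ripeline directory and list the resulting trees.
run_ripeline() {
  dir=$1
  if [ ! -f "$dir/make_all" ]; then
    echo "no make_all script in $dir" >&2
    return 1
  fi
  # The subshell keeps the caller's working directory unchanged.
  (cd "$dir" && bash ./make_all) || return 1
  ls "$dir/99_trees"
}

# Usage (placeholder path):
# run_ripeline ~/some/dir/ripeline
```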

Now change the DNA database (file "_kubricks_dna.txt"): for example,
uncomment the last line (remove "#" at its beginning); the new sequence
then becomes available to Ripeline
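
For those who prefer the terminal to a text editor, the same change can
be sketched with "sed". This assumes that the last line starts with a
single "#" in its first column; the function name "uncomment_last_line"
and the ".bak" backup suffix are illustrative choices, not part of
Ripeline.

```shell
# Remove one leading "#" from the last line of a file.
# The original is kept next to it with a ".bak" suffix.
uncomment_last_line() {
  file=$1
  cp "$file" "$file.bak"
  # "$" addresses the last line; the substitution strips a leading "#".
  sed '$ s/^#//' "$file.bak" > "$file"
}

# Usage, from inside the Ripeline directory:
# uncomment_last_line _kubricks_dna.txt
```

Copying "_kubricks_dna.txt.bak" back over "_kubricks_dna.txt" restores
the original database.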

Run "make_all" again

Check out the new phylogeny trees and compare them with the old ones
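
Comparing is easier if a copy of "99_trees" is set aside before
"make_all" is run again. The sketch below assumes that a new run
replaces the files in "99_trees"; the function names and the snapshot
path are hypothetical. It only reports which files changed, so the trees
themselves still have to be inspected by eye.

```shell
# Copy the current "99_trees" directory to a snapshot directory.
# Refuses to overwrite an existing snapshot.
snapshot_trees() {
  if [ -e "$2" ]; then
    echo "snapshot $2 already exists" >&2
    return 1
  fi
  cp -R "$1/99_trees" "$2"
}

# Report which files differ between the snapshot and the current trees.
# diff exits with status 1 when differences are found.
compare_trees() {
  diff -rq "$2" "$1/99_trees"
}

# Usage, from inside the Ripeline directory (placeholder snapshot path):
# snapshot_trees . ../trees_before
# bash ./make_all
# compare_trees . ../trees_before
```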

Play with this example as long as you like, e.g., add some data, comment
out (with "#") some data, etc., run "make_all" and observe the new
phylogeny trees

Run with your data:
===================

The best way is to replace the example databases with your own databases
made _in the same way_ (same set of text tables, same column names), and
run "make_all" again
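
One quick sanity check for "same column names" is to compare the first
line of your table with the first line of the matching example table.
This is only a sketch: it assumes that the column names sit in the very
first line of each file, which should be verified against the example
databases. The function name "same_header" and the "example_copy"
directory are hypothetical.

```shell
# Succeed if both files have the same first line
# (assumed here to be the row with column names).
same_header() {
  [ "$(head -n 1 "$1")" = "$(head -n 1 "$2")" ]
}

# Usage, with a saved copy of the example database (placeholder path):
# same_header example_copy/_kubricks_dna.txt _kubricks_dna.txt \
#   && echo "column names match" || echo "column names differ"
```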