diff --git a/README.md b/README.md
index 72b50e1..fb6d348 100644
--- a/README.md
+++ b/README.md
@@ -57,19 +57,20 @@ This would perform a full search for the optimal (lowest-scoring) MUL-tree on th
 | ------ | ----------- |
 | -s | A file or string containing a bifurcating, rooted species tree in Newick format. This tree can either be singly-labeled or MUL. |
 | -g | A file containing one or more bifurcating, rooted, Newick formatted gene trees. Gene trees with polytomies will be removed from the dataset. |
-| -h1 | A space separated list of nodes to search as the polyploid clade. Only used with `-t s`. If nothing is entered all nodes will be considered. |
-| -h2 | A space separated list of nodes to search as possible parental lineages for all nodes specified with `-h1`. If nothing is entered all possible nodes for the current `h1` will be considered. |
-| -d | An option to specify whether to do reconciliations to MUL-trees only (0), the singly-labeled tree only (1), or both (2). Default: 2 |
+| -h1 | A space separated list of nodes to search as the polyploid clade. Cannot be used with `--multree`. If nothing is entered all nodes will be considered. |
+| -h2 | A space separated list of nodes to search as possible parental lineages for all nodes specified with `-h1`. Cannot be used with `--multree`. If nothing is entered all possible nodes for the current `h1` will be considered. |
 | -c | The maximum number of initial groups to consider for any gene tree. Default: 8, Max value: 18 |
 | -o | Output directory name. If the directory is not present, GRAMPA will created it for you. |
-| -f | By default, all output files created by GRAMPA will have the prefix 'grampa_'. You can specify a different prefix with this option. |
-| -v | Control the amount of output printed to the screen. Print all output (1) or just some log info (0). Default: 1 |
-| -p | The number of processes GRAMPA should use |
+| -f | By default, all output files created by GRAMPA will have the prefix 'grampa'. You can specify a different prefix with this option. |
+| -v | Control the amount of output printed to the screen. 0: print nothing. 1: print only some info at the start. 2: print all log info to screen. 3 (default): print all log info to the screen as well as progress updates for certain steps. |
+| -p | The number of processes GRAMPA should use. |
 | --multree | Set this flag if your input species tree is a MUL-tree. |
 | --labeltree | The program will simply label your input species tree. |
 | --numtrees | The program will simply count the number of possible MUL-trees given `-s`. `-h1`, and `-h2` may also be supplied. |
 --buildmultrees | Build MUL-trees given `-s`, `-h1` and `-h2`. |
 | --checknums | If this flag is entered, the program will just calculate the number of groups per gene tree and exit. No reconciliations will be done. |
+| --st-only | Only perform reconciliations to the singly-labeled tree input with `-s`. |
+| --no-st | Skip performing reconciliations to the singly-labeled tree input with `-s` and only do reconciliations to MUL-trees. |
 | --maps | Output the node maps for each reconciliation in addition to the scores. The maps will be placed in the detailed output file. |
 | --version | Print out version info and exit. |
 | --tests | Run the tests script. |
diff --git a/docs/readme.html b/docs/readme.html
index 4e129eb..e91a958 100644
--- a/docs/readme.html
+++ b/docs/readme.html
@@ -124,84 +124,191 @@

Usage

Output

All output files will be placed in the directory specified with -o

-

GRAMPA creates three main output files and a filtered tree file (if necessary).

+

GRAMPA creates four output files, a log file, and a filtered tree file (if necessary).

GRAMPA also creates a directory within the output directory called groups_dir. This just stores the gene tree groupings for each MUL-tree (in pickled format) so GRAMPA doesn't eat up a lot of RAM during reconciliations. This can be ignored/deleted

    -
  1. grampa_out.txt

    -

    This is the main output file and contains some log info for the current run and it gives the total reconciliation score for each MUL-tree - considered. At the bottom of the file it will display the MUL-tree with the minimum reconciliation score.

    - -
    # Tree #    H1 node H2 node Tree string Total score
    -ST          (((a,(x,(y,z)<1>)<2>)<3>,b)<4>,(c,d)<5>)<6> 115
    -MT-1    <2> a   ((((a,(x*,(y*,z*)<1>)<2>)<3>,(x+,(y+,z+)<4>)<5>)<6>,b)<7>,(c,d)<8>)<9>  119
    -MT-2    <2> c   (((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,((c,(x*,(y*,z*)<5>)<6>)<7>,d)<8>)<9>  96
    -MT-3    <2> b   (((a,(x+,(y+,z+)<1>)<2>)<3>,(b,(x*,(y*,z*)<4>)<5>)<6>)<7>,(c,d)<8>)<9>  130
    -MT-4    <2> d   (((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,(d,(x*,(y*,z*)<5>)<6>)<7>)<8>)<9>  76
    -MT-5    <2> <3> ((((a,(x+,(y+,z+)<1>)<2>)<3>,(x*,(y*,z*)<4>)<5>)<6>,b)<7>,(c,d)<8>)<9>  119
    -MT-6    <2> <2> (((a,((x+,(y+,z+)<1>)<2>,(x*,(y*,z*)<3>)<4>)<5>)<6>,b)<7>,(c,d)<8>)<9>  145
    -MT-7    <2> <6> ((((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,d)<5>)<6>,(x*,(y*,z*)<7>)<8>)<9>  132
    -MT-8    <2> <4> ((((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(x*,(y*,z*)<5>)<6>)<7>,(c,d)<8>)<9>  118
    -MT-9    <2> <5> (((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,((c,d)<5>,(x*,(y*,z*)<6>)<7>)<8>)<9>  95
    -# ---------
    -The MUL-tree with the minimum parsimony score is MT-4:  (((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,(d,(x*,(y*,z*)<5>)<6>)<7>)<8>)<9>
    -Score = 76
    +					
  2. grampa-scores.txt

    +

    This is the main output file and contains the total reconciliation score for each MUL-tree considered, sorted in ascending order.

    + + -
      -
    • The first line has the headers for the columns of the output table. Note that H1 node and H2 node are always nodes in the singly - labeled tree.
    • -
    • The total reconciliation score in the last column is the sum of all reconciliation scores for all gene trees for that MUL-tree.
    • -
    -
  3. +

    + The first line of this file contains the headers, defined as follows for each subsequent row: +

    + +
    + | mul.tree | h1.node | h2.node | score | mul.tree |
    + | ------ | ------ | ------ | ------ | ------ |
    + | The ID of the MUL-tree | The H1 node in the species tree for the current MUL-tree | The H2 node in the species tree for the current MUL-tree | The total parsimony score for the current MUL-tree | The Newick formatted tree string for the MUL-tree, with hybrid clades indicated with * |
    +
    -
  4. grampa_det.txt

    +

    + Please note that the input singly-labeled species tree always has the ID 0. +
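
A minimal sketch of reading the scores table described above. It assumes the file is tab-delimited and that IDs are written as plain strings such as `0` and `4`; neither detail is stated in this diff, and the sample rows are hypothetical, with scores borrowed from the old example output. Because the header documents two columns named `mul.tree`, the sketch indexes fields by position rather than by name.

```python
import csv
import io

# Hypothetical excerpt of grampa-scores.txt. The tab delimiter and the ID
# strings are assumptions; the columns follow the documented order.
SAMPLE_SCORES = (
    "mul.tree\th1.node\th2.node\tscore\tmul.tree\n"
    "4\t<2>\td\t76\t(((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,(d,(x*,(y*,z*)<5>)<6>)<7>)<8>)<9>\n"
    "2\t<2>\tc\t96\t(((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,((c,(x*,(y*,z*)<5>)<6>)<7>,d)<8>)<9>\n"
    "0\t\t\t115\t(((a,(x,(y,z)<1>)<2>)<3>,b)<4>,(c,d)<5>)<6>\n"
)


def read_scores(text):
    """Return one dict per data row of a scores table."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header; it repeats the name mul.tree
    rows = []
    for fields in reader:
        if not fields:
            continue  # tolerate blank lines
        tree_id, h1_node, h2_node, score, newick = fields
        rows.append({
            "id": tree_id,
            "h1": h1_node,
            "h2": h2_node,
            "score": int(score),  # assumes integer parsimony scores
            "newick": newick,
        })
    return rows


rows = read_scores(SAMPLE_SCORES)
best = min(rows, key=lambda row: row["score"])
species_tree = next(row for row in rows if row["id"] == "0")
print(best["id"], best["score"], species_tree["score"])
```

Comparing the lowest score against the row with ID 0 shows whether any MUL-tree fits the gene trees better than the singly-labeled tree does.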

    + +
  5. grampa-detailed.txt

    The secondary output file contains detailed output describing the reconciliation scores from each gene tree to the lowest scoring MUL-tree.

    -
    # MT-4:(((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,(d,(x*,(y*,z*)<5>)<6>)<7>)<8>)<9>   H1 Node:<2> H2 Node:d
    -# GT/MT combo   # dups  # losses    Total score
    -GT-1 to MT-4    1   1   2
    -GT-2 to MT-4    1   2   3
    -GT-3 to MT-4    1   3   4
    -GT-4 to MT-4    1   2   3
    -.
    -.
    -.
    -GT-25 to MT-4   0   1   1
    -# Gene trees with multiple maps:    1
    -# Total parsimony score for MT-4: 76
    +
     
    -					
  6. grampa_checknums.txt

    +

    + The first line of this file contains the headers, defined as follows for each subsequent row: +

    + +
    + | mul.tree | gene.tree | dups | losses | total score |
    + | ------ | ------ | ------ | ------ | ------ |
    + | The ID of the MUL-tree | The ID of the gene tree being reconciled to the MUL-tree | The number of duplications on this gene tree given this MUL-tree | The number of losses on this gene tree given this MUL-tree | The sum of dups and losses for this gene tree and MUL-tree |
    +
    + +

    + Note that the lowest score for some GT/MT combos can have multiple maps. In these cases, we report all possible scores. +
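
As a rough illustration of the detailed table, the sketch below sums duplications, losses, and total score per MUL-tree and checks that each row's total equals dups plus losses, as the column description states. The tab delimiter and the ID strings are assumptions, and the sample rows are hypothetical (the counts are taken from the old example output).

```python
from collections import defaultdict

# Hypothetical rows in the documented column order:
# mul.tree, gene.tree, dups, losses, total score (tab-delimited by assumption).
SAMPLE_DETAILED = (
    "mul.tree\tgene.tree\tdups\tlosses\ttotal score\n"
    "4\t1\t1\t1\t2\n"
    "4\t2\t1\t2\t3\n"
    "4\t3\t1\t3\t4\n"
    "4\t4\t1\t2\t3\n"
)


def summarize_detailed(text):
    """Sum dups, losses and scores over all gene trees for each MUL-tree."""
    totals = defaultdict(lambda: {"dups": 0, "losses": 0, "score": 0})
    for line in text.strip().splitlines()[1:]:  # [1:] skips the header
        tree_id, gene_tree, dups, losses, score = line.split("\t")
        dups, losses, score = int(dups), int(losses), int(score)
        if dups + losses != score:
            raise ValueError(f"inconsistent row for gene tree {gene_tree}")
        entry = totals[tree_id]
        entry["dups"] += dups
        entry["losses"] += losses
        entry["score"] += score
    return dict(totals)


summary = summarize_detailed(SAMPLE_DETAILED)
print(summary["4"])
```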

    + +
  7. grampa-dup-counts.txt

    +

    + For the 6 lowest scoring MUL-trees, GRAMPA counts the number of duplications along each branch in the MUL-tree summed + over all gene trees. +

    + +

    + The first line of this file contains the headers, defined as follows for each subsequent row: +

    + +
    + | mul.tree | node | dups |
    + | ------ | ------ | ------ |
    + | The ID of the MUL-tree | The node ID for the current MUL-tree | The total number of duplications over all gene trees along the branch above the node in the MUL-tree |
    +
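
One way to use the duplication counts is to find, for each MUL-tree, the branch carrying the most duplications. The sketch below assumes a tab-delimited file; the node labels and counts are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: mul.tree, node, dups (tab-delimited by assumption).
SAMPLE_DUP_COUNTS = (
    "mul.tree\tnode\tdups\n"
    "4\t<1>\t0\n"
    "4\t<7>\t12\n"
    "4\t<9>\t3\n"
    "2\t<6>\t5\n"
    "2\t<9>\t1\n"
)


def busiest_branches(text):
    """Map each MUL-tree ID to the node whose branch has the most dups."""
    per_tree = defaultdict(list)
    for line in text.strip().splitlines()[1:]:  # [1:] skips the header
        tree_id, node, dups = line.split("\t")
        per_tree[tree_id].append((int(dups), node))
    # max() compares the dup count first; ties fall back to the node label.
    return {tree_id: max(pairs)[1] for tree_id, pairs in per_tree.items()}


print(busiest_branches(SAMPLE_DUP_COUNTS))
```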
    + +
  8. grampa-checknums.txt

    GRAMPA must calculate how many combinations of maps there are for each gene-tree/MUL-tree pair and filter out those that are over the group cap in any combo before any reconciliations can be done. This filtering ensures that all MUL-trees are reconciled to the same set of gene-trees. The number of groups for each gene-tree/MUL-tree is recorded in this file.

    -
    # GT/MT combo   # Groups    # Fixed # Combinations
    -# MT-1:((((a,(x*,(y*,z*)<1>)<2>)<3>,(x+,(y+,z+)<4>)<5>)<6>,b)<7>,(c,d)<8>)<9>   H1 Node:<2> H2 Node:a
    -GT-1 to MT-1    2   0   4
    -GT-2 to MT-1    1   1   2
    -GT-3 to MT-1    1   1   2
    -GT-4 to MT-1    1   1   2
    -.
    -.
    -.
    -GT-25 to MT-9   0   2   1
    +
     
    -						
      -
    • The first line in the file contains the table headers. For each MUL-tree there is also a line giving som info about it.
    • -
    • If a gene tree is over the cap specified with -c X then the message Number of - groups over group cap (-c set to X) -- Filtering. will also be displayed on the line.
    • -
    -
  9. +

    + The first line of this file contains the headers, defined as follows for each subsequent row: +

    + +
    + | mul.tree | gene.tree | groups | fixed | combinations | over.cap.filtered |
    + | ------ | ------ | ------ | ------ | ------ | ------ |
    + | The ID of the MUL-tree | The ID of the gene tree to be reconciled to the MUL-tree | The number of distinct hybrid clades in the gene tree | The number of hybrid clades in the gene tree that also group with a sister species from the singly-labeled tree | The total number of mappings to try for the gene tree with this MUL-tree | Either Y or N to indicate whether the number of groups exceeds the number set with -c |
    +
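
The filtering rule described for this file (a gene tree that is over the cap for any MUL-tree is dropped for all of them) can be sketched as follows. The tab delimiter and the sample rows are assumptions made for illustration; the last hypothetical gene tree has 9 groups, which exceeds the default cap of 8.

```python
# Hypothetical rows in the documented column order:
# mul.tree, gene.tree, groups, fixed, combinations, over.cap.filtered.
SAMPLE_CHECKNUMS = (
    "mul.tree\tgene.tree\tgroups\tfixed\tcombinations\tover.cap.filtered\n"
    "1\t1\t2\t0\t4\tN\n"
    "1\t2\t1\t1\t2\tN\n"
    "1\t3\t9\t0\t512\tY\n"
    "2\t1\t1\t1\t2\tN\n"
    "2\t2\t0\t2\t1\tN\n"
    "2\t3\t3\t0\t8\tN\n"
)


def split_gene_trees(text):
    """Return (kept, dropped) gene tree IDs.

    A gene tree flagged Y for any MUL-tree is dropped everywhere, so every
    MUL-tree ends up reconciled against the same set of gene trees.
    """
    seen, over_cap = set(), set()
    for line in text.strip().splitlines()[1:]:  # [1:] skips the header
        fields = line.split("\t")
        gene_tree, flag = fields[1], fields[5]
        seen.add(gene_tree)
        if flag == "Y":
            over_cap.add(gene_tree)
    return seen - over_cap, over_cap


kept, dropped = split_gene_trees(SAMPLE_CHECKNUMS)
print(sorted(kept), sorted(dropped))
```

Here gene tree 3 is within the cap for MUL-tree 2 but is still dropped, because it is over the cap for MUL-tree 1.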
    + +
  10. grampa-trees-filtered.txt

    +

    + A text file with the gene trees used for this GRAMPA run, after filtering by the group cap. One tree per line. +

    + +
  11. grampa.log

    +

    + A log file containing run time information and a summary of the lowest scoring MUL-tree. +

@@ -240,10 +347,6 @@

Options Table

A space separated list of nodes to search as possible parental lineages for all nodes specified with -h1. If nothing is entered all possible nodes for the current h1 will be considered. - - -d - An option to specify whether to do reconciliations to MUL-trees only (0), the singly-labeled tree only (1), or both (2). Default: 2 - -c The maximum number of initial groups to consider for any gene tree. Default: 8, Max value: 18 @@ -254,11 +357,11 @@

Options Table

-f - By default, all output files created by GRAMPA will have the prefix 'grampa_'. You can specify a different prefix with this option. + By default, all output files created by GRAMPA will have the prefix 'grampa-'. You can specify a different prefix with this option. -v - Control the amount of output printed to the screen. Print all output (1) or just some log info (0). Default: 1 + Control the amount of output printed to the screen. 0: print nothing. 1: print only some info at the start. 2: print all log info to screen. 3 (default): print all log info to the screen as well as progress updates for certain steps. -p @@ -278,12 +381,20 @@

Options Table

--buildmultrees - Build MUL-trees given -s, -h1, and -h2. + Build MUL-trees given -s, -h1, and -h2 and write them to the log file. --checknums If this flag is entered, the program will just calculate the number of groups per gene tree and exit. No reconciliations will be done. + + --st-only + Only do reconciliations to the input singly-labeled species tree. + + + --no-st + Skip doing reconciliations to the input singly-labeled species tree. + --maps Output the node maps for each reconciliation in addition to the scores. The maps will be placed in the detailed output file. @@ -359,6 +470,7 @@

If -h1 and -h2 are not speci of H1 and H2!

+

-c : The group cap

    @@ -426,6 +539,14 @@

    If -h1 and -h2 are not speci
    (((a,(x,(y,z)<1>)<2>)<3>,b)<4>,(c,d)<5>)<6>

+

--numtrees : Counting MUL-trees

+
    +

    + This option quickly calculates how many MUL-trees would be built for a given set of H1 and H2 nodes. If neither H1 nor H2 is set, + it displays the total number of MUL-trees possible for the input species tree. This information is printed to the screen. +

    +
+

--buildmultrees : Building MUL-trees

    This option can be used with -s, -h1, and -h2 to build @@ -440,6 +561,10 @@

    If -h1 and -h2 are not speci

    Will yield the following output in the main output file (multree_ex/grampa_out.txt):

    ((((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(x*,(y*,z*)<5>)<6>)<7>,(c,d)<8>)<9>
    + +

    + The MUL-trees are written to the log file. +

--checknums : Group counting

@@ -449,30 +574,48 @@

If -h1 and -h2 are not speci best setting for -c.

+

--st-only : Singly-labeled tree only

+
    +

    + By default, GRAMPA reconciles the gene trees to all specified MUL-trees as well as the singly-labeled input species tree. Set this option to ONLY + do reconciliations to the singly-labeled input species tree. +

    +
+ +

--no-st : Skip the singly-labeled tree

+
    +

    + By default, GRAMPA reconciles the gene trees to all specified MUL-trees as well as the singly-labeled input species tree. Set this option to SKIP + reconciliations to the singly-labeled input species tree. +
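
For scripts that still pass the removed `-d` option, the mapping to the new flags can be sketched as below. The equivalence is inferred from the option descriptions in this diff (0: MUL-trees only, 1: singly-labeled tree only, 2: both) and is not a documented migration path; `flags_for_legacy_d` is a hypothetical helper, not part of GRAMPA.

```python
def flags_for_legacy_d(value):
    """Translate an old -d value into the flags that replace it.

    Hypothetical helper: the mapping is inferred from the old and new option
    descriptions, not taken from the GRAMPA source.
    """
    mapping = {
        0: ["--no-st"],    # MUL-trees only
        1: ["--st-only"],  # singly-labeled tree only
        2: [],             # both, which is now the default behavior
    }
    if value not in mapping:
        raise ValueError(f"-d expected 0, 1, or 2, got {value!r}")
    return list(mapping[value])


for old_value in (0, 1, 2):
    print(old_value, flags_for_legacy_d(old_value))
```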

    +
+

--maps : Output node mappings

    -

    Set this option to output the LCA node mappings along with the reconciliation scores to the detailed output file. This adds a column - to the _det.txt output file with that gene tree with the nodes re-labeled to include the maps, dups, and losses along that branch.

    - -
    # GT/MT combo	# dups	# losses	Total score	Maps
    -GT-1 to MT-4	1	1	2	(((1_a[a+-0],((1_x[x+-0],(1_y[y+-0],1_z[z+-0])<1>[<1>+-0])<2>[<2>+-0],(2_x[x+-0],(2_y[y+-0],2_z[z+-0])<3>[<1>+-0])<4>[<2>+-0])<5>[<2>+-1])<6>[<3>+-0],1_b[b+-0])<7>[<4>+-0],(1_c[c+-0],1_d[d+-0])<8>[<8>+-0])<9>[<9>+-0]
    -GT-2 to MT-4	1	2	3	((((1_a[a+-0],(1_x[x+-0],(1_y[y+-0],1_z[z+-0])<1>[<1>+-0])<2>[<2>+-0])<3>[<3>+-0],(2_x[x+-0],(2_y[y+-0],2_z[z+-0])<4>[<1>+-0])<5>[<2>+-0])<6>[<3>+-1],1_b[b+-0])<7>[<4>+-0],(1_c[c+-0],1_d[d+-0])<8>[<8>+-0])<9>[<9>+-0]
    -GT-3 to MT-4	1	3	4	((((1_a[a+-0],(1_x[x+-0],(1_y[y+-0],1_z[z+-0])<1>[<1>+-0])<2>[<2>+-0])<3>[<3>+-0],1_b[b+-0])<4>[<4>+-0],(2_x[x+-0],(2_y[y+-0],2_z[z+-0])<5>[<1>+-0])<6>[<2>+-0])<7>[<4>+-1],(1_c[c+-0],1_d[d+-0])<8>[<8>+-0])<9>[<9>+-0]
    -GT-4 to MT-4	1	2	3	((((1_a[a+-0],(2_x[x+-0],(2_y[y+-0],2_z[z+-0])<1>[<1>+-0])<2>[<2>+-0])<3>[<3>+-0],(1_x[x+-0],(1_y[y+-0],1_z[z+-0])<4>[<1>+-0])<5>[<2>+-0])<6>[<3>+-1],1_b[b+-0])<7>[<4>+-0],(1_c[c+-0],1_d[d+-0])<8>[<8>+-0])<9>[<9>+-0]
    -
    -.
    -.
    -.
    -GT-25 to MT-4	0	1	1	(((1_a[a+-0],1_x[x+-0])<1>[<3>+-0],1_b[b+-0])<2>[<4>+-0],(1_c[c+-0],(1_d[d+-0],(2_x[x*-0],(1_y[y*-0],1_z[z*-0])<3>[<5>+-0])<4>[<6>+-0])<5>[<7>+-0])<6>[<8>+-0])<7>[<9>+-0]
    -# Gene trees with multiple maps:    1
    -# Total parsimony score for MT-4: 76
    +							

    + This option adds a column to grampa-detailed.txt with the LCA node mappings for each gene tree and MUL-tree combo. The column + contains a Newick formatted version of the gene tree with nodes labeled as follows: +

    + +
    Node[Map-Dups]
    + +

    + Where Map indicates the node in the MUL-tree that this gene tree node maps to and Dups the number of duplications this mapping incurs. These trees + can be rendered with a tree viewer such as SeaView + or FigTree. +
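
A minimal sketch of pulling the `Node[Map-Dups]` labels out of such a tree string with a regular expression, for cases where a tree viewer is not needed. The sample string is hypothetical, and the exact characters GRAMPA writes in node and map names are an assumption; the pattern splits on the last hyphen inside the brackets.

```python
import re

# Matches Node[Map-Dups]: a node name, then a bracketed map name and dup count.
LABEL = re.compile(r"([^\[\](),]+)\[([^\[\]]+)-(\d+)\]")

# Hypothetical labeled gene tree, written in the Node[Map-Dups] convention.
SAMPLE_MAP = "((1_a[a-0],1_x[x*-0])<1>[<3>-1],1_b[b-0])<2>[<4>-0]"


def parse_map_labels(tree_string):
    """Return (gene_tree_node, mul_tree_node, dups) for every labeled node."""
    return [
        (node, mapped, int(dups))
        for node, mapped, dups in LABEL.findall(tree_string)
    ]


labels = parse_map_labels(SAMPLE_MAP)
total_dups = sum(dups for _node, _mapped, dups in labels)
print(labels)
print(total_dups)
```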

    +
@@ -484,7 +627,7 @@

If -h1 and -h2 are not speci
diff --git a/docs/scripts/generators/readme_generator.py b/docs/scripts/generators/readme_generator.py index b9f82b7..54649e6 100644 --- a/docs/scripts/generators/readme_generator.py +++ b/docs/scripts/generators/readme_generator.py @@ -4,8 +4,8 @@ ############################################################ import sys, os -sys.path.append('..') -import lib.read_chunks as RC +sys.path.append(os.path.abspath('../lib/')) +import read_chunks as RC ###################### # HTML template @@ -73,84 +73,191 @@

Output

All output files will be placed in the directory specified with -o

-

GRAMPA creates three main output files and a filtered tree file (if necessary).

+

GRAMPA creates four output files, a log file, and a filtered tree file (if necessary).

GRAMPA also creates a directory within the output directory called groups_dir. This just stores the gene tree groupings for each MUL-tree (in pickled format) so GRAMPA doesn't eat up a lot of RAM during reconciliations. This can be ignored/deleted

    -
  1. grampa_out.txt

    -

    This is the main output file and contains some log info for the current run and it gives the total reconciliation score for each MUL-tree - considered. At the bottom of the file it will display the MUL-tree with the minimum reconciliation score.

    - -
    # Tree #    H1 node H2 node Tree string Total score
    -ST          (((a,(x,(y,z)<1>)<2>)<3>,b)<4>,(c,d)<5>)<6> 115
    -MT-1    <2> a   ((((a,(x*,(y*,z*)<1>)<2>)<3>,(x+,(y+,z+)<4>)<5>)<6>,b)<7>,(c,d)<8>)<9>  119
    -MT-2    <2> c   (((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,((c,(x*,(y*,z*)<5>)<6>)<7>,d)<8>)<9>  96
    -MT-3    <2> b   (((a,(x+,(y+,z+)<1>)<2>)<3>,(b,(x*,(y*,z*)<4>)<5>)<6>)<7>,(c,d)<8>)<9>  130
    -MT-4    <2> d   (((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,(d,(x*,(y*,z*)<5>)<6>)<7>)<8>)<9>  76
    -MT-5    <2> <3> ((((a,(x+,(y+,z+)<1>)<2>)<3>,(x*,(y*,z*)<4>)<5>)<6>,b)<7>,(c,d)<8>)<9>  119
    -MT-6    <2> <2> (((a,((x+,(y+,z+)<1>)<2>,(x*,(y*,z*)<3>)<4>)<5>)<6>,b)<7>,(c,d)<8>)<9>  145
    -MT-7    <2> <6> ((((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,d)<5>)<6>,(x*,(y*,z*)<7>)<8>)<9>  132
    -MT-8    <2> <4> ((((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(x*,(y*,z*)<5>)<6>)<7>,(c,d)<8>)<9>  118
    -MT-9    <2> <5> (((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,((c,d)<5>,(x*,(y*,z*)<6>)<7>)<8>)<9>  95
    -# ---------
    -The MUL-tree with the minimum parsimony score is MT-4:  (((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,(d,(x*,(y*,z*)<5>)<6>)<7>)<8>)<9>
    -Score = 76
    +					
  2. grampa-scores.txt

    +

    This is the main output file and contains the total reconciliation score for each MUL-tree considered, sorted in ascending order.

    + + -
      -
    • The first line has the headers for the columns of the output table. Note that H1 node and H2 node are always nodes in the singly - labeled tree.
    • -
    • The total reconciliation score in the last column is the sum of all reconciliation scores for all gene trees for that MUL-tree.
    • -
    -
  3. +

    + The first line of this file contains the headers, defined as follows for each subsequent row: +

    + +
    + | mul.tree | h1.node | h2.node | score | mul.tree |
    + | ------ | ------ | ------ | ------ | ------ |
    + | The ID of the MUL-tree | The H1 node in the species tree for the current MUL-tree | The H2 node in the species tree for the current MUL-tree | The total parsimony score for the current MUL-tree | The Newick formatted tree string for the MUL-tree, with hybrid clades indicated with * |
    +
    + +

    + Please note that the input singly-labeled species tree always has the ID 0. +

    -
  4. grampa_det.txt

    +
  5. grampa-detailed.txt

    The secondary output file contains detailed output describing the reconciliation scores from each gene tree to the lowest scoring MUL-tree.

    -
    # MT-4:(((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(c,(d,(x*,(y*,z*)<5>)<6>)<7>)<8>)<9>   H1 Node:<2> H2 Node:d
    -# GT/MT combo   # dups  # losses    Total score
    -GT-1 to MT-4    1   1   2
    -GT-2 to MT-4    1   2   3
    -GT-3 to MT-4    1   3   4
    -GT-4 to MT-4    1   2   3
    -.
    -.
    -.
    -GT-25 to MT-4   0   1   1
    -# Gene trees with multiple maps:    1
    -# Total parsimony score for MT-4: 76
    +
     
    -					
  6. grampa_checknums.txt

    +

    + The first line of this file contains the headers, defined as follows for each subsequent row: +

    + +
    + | mul.tree | gene.tree | dups | losses | total score |
    + | ------ | ------ | ------ | ------ | ------ |
    + | The ID of the MUL-tree | The ID of the gene tree being reconciled to the MUL-tree | The number of duplications on this gene tree given this MUL-tree | The number of losses on this gene tree given this MUL-tree | The sum of dups and losses for this gene tree and MUL-tree |
    +
    + +

    + Note that the lowest score for some GT/MT combos can have multiple maps. In these cases, we report all possible scores. +

    + +
  7. grampa-dup-counts.txt

    +

    + For the 6 lowest scoring MUL-trees, GRAMPA counts the number of duplications along each branch in the MUL-tree summed + over all gene trees. +

    + +

    + The first line of this file contains the headers, defined as follows for each subsequent row: +

    + +
    + | mul.tree | node | dups |
    + | ------ | ------ | ------ |
    + | The ID of the MUL-tree | The node ID for the current MUL-tree | The total number of duplications over all gene trees along the branch above the node in the MUL-tree |
    +
    + +
  8. grampa-checknums.txt

    GRAMPA must calculate how many combinations of maps there are for each gene-tree/MUL-tree pair and filter out those that are over the group cap in any combo before any reconciliations can be done. This filtering ensures that all MUL-trees are reconciled to the same set of gene-trees. The number of groups for each gene-tree/MUL-tree is recorded in this file.

    -
    # GT/MT combo   # Groups    # Fixed # Combinations
    -# MT-1:((((a,(x*,(y*,z*)<1>)<2>)<3>,(x+,(y+,z+)<4>)<5>)<6>,b)<7>,(c,d)<8>)<9>   H1 Node:<2> H2 Node:a
    -GT-1 to MT-1    2   0   4
    -GT-2 to MT-1    1   1   2
    -GT-3 to MT-1    1   1   2
    -GT-4 to MT-1    1   1   2
    -.
    -.
    -.
    -GT-25 to MT-9   0   2   1
    +
     
    -						
      -
    • The first line in the file contains the table headers. For each MUL-tree there is also a line giving som info about it.
    • -
    • If a gene tree is over the cap specified with -c X then the message Number of - groups over group cap (-c set to X) -- Filtering. will also be displayed on the line.
    • -
    -
  9. +

    + The first line of this file contains the headers, defined as follows for each subsequent row: +

    + +
    + | mul.tree | gene.tree | groups | fixed | combinations | over.cap.filtered |
    + | ------ | ------ | ------ | ------ | ------ | ------ |
    + | The ID of the MUL-tree | The ID of the gene tree to be reconciled to the MUL-tree | The number of distinct hybrid clades in the gene tree | The number of hybrid clades in the gene tree that also group with a sister species from the singly-labeled tree | The total number of mappings to try for the gene tree with this MUL-tree | Either Y or N to indicate whether the number of groups exceeds the number set with -c |
    +
    + +
  10. grampa-trees-filtered.txt

    +

    + A text file with the gene trees used for this GRAMPA run, after filtering by the group cap. One tree per line. +

    + +
  11. grampa.log

    +

    + A log file containing run time information and a summary of the lowest scoring MUL-tree. +

@@ -189,10 +296,6 @@ A space separated list of nodes to search as possible parental lineages for all nodes specified with -h1. If nothing is entered all possible nodes for the current h1 will be considered. - - -d - An option to specify whether to do reconciliations to MUL-trees only (0), the singly-labeled tree only (1), or both (2). Default: 2 - -c The maximum number of initial groups to consider for any gene tree. Default: 8, Max value: 18 @@ -203,11 +306,11 @@ -f - By default, all output files created by GRAMPA will have the prefix 'grampa_'. You can specify a different prefix with this option. + By default, all output files created by GRAMPA will have the prefix 'grampa-'. You can specify a different prefix with this option. -v - Control the amount of output printed to the screen. Print all output (1) or just some log info (0). Default: 1 + Control the amount of output printed to the screen. 0: print nothing. 1: print only some info at the start. 2: print all log info to screen. 3 (default): print all log info to the screen as well as progress updates for certain steps. -p @@ -227,12 +330,20 @@ --buildmultrees - Build MUL-trees given -s, -h1, and -h2. + Build MUL-trees given -s, -h1, and -h2 and write them to the log file. --checknums If this flag is entered, the program will just calculate the number of groups per gene tree and exit. No reconciliations will be done. + + --st-only + Only do reconciliations to the input singly-labeled species tree. + + + --no-st + Skip doing reconciliations to the input singly-labeled species tree. + --maps Output the node maps for each reconciliation in addition to the scores. The maps will be placed in the detailed output file. @@ -308,6 +419,7 @@ of H1 and H2! +

-c : The group cap

    @@ -375,6 +488,14 @@
    (((a,(x,(y,z)<1>)<2>)<3>,b)<4>,(c,d)<5>)<6>
+

--numtrees : Counting MUL-trees

+
    +

    + This option quickly calculates how many MUL-trees would be built for a given set of H1 and H2 nodes. If neither H1 nor H2 is set, + it displays the total number of MUL-trees possible for the input species tree. This information is printed to the screen. +

    +
+

--buildmultrees : Building MUL-trees

    This option can be used with -s, -h1, and -h2 to build @@ -389,6 +510,10 @@

    Will yield the following output in the main output file (multree_ex/grampa_out.txt):

    ((((a,(x+,(y+,z+)<1>)<2>)<3>,b)<4>,(x*,(y*,z*)<5>)<6>)<7>,(c,d)<8>)<9>
    + +

    + The MUL-trees are written to the log file. +

--checknums : Group counting

@@ -398,30 +523,48 @@ best setting for -c.

+

--st-only : Singly-labeled tree only

+
    +

    + By default, GRAMPA reconciles the gene trees to all specified MUL-trees as well as the singly-labeled input species tree. Set this option to ONLY + do reconciliations to the singly-labeled input species tree. +

    +
+ +

--no-st : Skip the singly-labeled tree

+
    +

    + By default, GRAMPA reconciles the gene trees to all specified MUL-trees as well as the singly-labeled input species tree. Set this option to SKIP + reconciliations to the singly-labeled input species tree. +

    +
+

--maps : Output node mappings

    -

    Set this option to output the LCA node mappings along with the reconciliation scores to the detailed output file. This adds a column - to the _det.txt output file with that gene tree with the nodes re-labeled to include the maps, dups, and losses along that branch.

    - -
    # GT/MT combo	# dups	# losses	Total score	Maps
    -GT-1 to MT-4	1	1	2	(((1_a[a+-0],((1_x[x+-0],(1_y[y+-0],1_z[z+-0])<1>[<1>+-0])<2>[<2>+-0],(2_x[x+-0],(2_y[y+-0],2_z[z+-0])<3>[<1>+-0])<4>[<2>+-0])<5>[<2>+-1])<6>[<3>+-0],1_b[b+-0])<7>[<4>+-0],(1_c[c+-0],1_d[d+-0])<8>[<8>+-0])<9>[<9>+-0]
    -GT-2 to MT-4	1	2	3	((((1_a[a+-0],(1_x[x+-0],(1_y[y+-0],1_z[z+-0])<1>[<1>+-0])<2>[<2>+-0])<3>[<3>+-0],(2_x[x+-0],(2_y[y+-0],2_z[z+-0])<4>[<1>+-0])<5>[<2>+-0])<6>[<3>+-1],1_b[b+-0])<7>[<4>+-0],(1_c[c+-0],1_d[d+-0])<8>[<8>+-0])<9>[<9>+-0]
    -GT-3 to MT-4	1	3	4	((((1_a[a+-0],(1_x[x+-0],(1_y[y+-0],1_z[z+-0])<1>[<1>+-0])<2>[<2>+-0])<3>[<3>+-0],1_b[b+-0])<4>[<4>+-0],(2_x[x+-0],(2_y[y+-0],2_z[z+-0])<5>[<1>+-0])<6>[<2>+-0])<7>[<4>+-1],(1_c[c+-0],1_d[d+-0])<8>[<8>+-0])<9>[<9>+-0]
    -GT-4 to MT-4	1	2	3	((((1_a[a+-0],(2_x[x+-0],(2_y[y+-0],2_z[z+-0])<1>[<1>+-0])<2>[<2>+-0])<3>[<3>+-0],(1_x[x+-0],(1_y[y+-0],1_z[z+-0])<4>[<1>+-0])<5>[<2>+-0])<6>[<3>+-1],1_b[b+-0])<7>[<4>+-0],(1_c[c+-0],1_d[d+-0])<8>[<8>+-0])<9>[<9>+-0]
    -
    -.
    -.
    -.
    -GT-25 to MT-4	0	1	1	(((1_a[a+-0],1_x[x+-0])<1>[<3>+-0],1_b[b+-0])<2>[<4>+-0],(1_c[c+-0],(1_d[d+-0],(2_x[x*-0],(1_y[y*-0],1_z[z*-0])<3>[<5>+-0])<4>[<6>+-0])<5>[<7>+-0])<6>[<8>+-0])<7>[<9>+-0]
    -# Gene trees with multiple maps:    1
    -# Total parsimony score for MT-4: 76
    +							

    + This option adds a column to grampa-detailed.txt with the LCA node mappings for each gene tree and MUL-tree combo. The column + contains a Newick formatted version of the gene tree with nodes labeled as follows: +

    + +
    Node[Map-Dups]
    + +

    + Where Map indicates the node in the MUL-tree that this gene tree node maps to and Dups the number of duplications this mapping incurs. These trees + can be rendered with a tree viewer such as SeaView + or FigTree. +

    +