Command line
Specify at least the directory containing the MSAs (-a), the output directory (-o), the number of cores (-c) and the type of the sequences (-d aa or -d nt).
In addition, you need to specify a model: either with model testing (-m), to automatically determine the best-fit model, or with the raxml global parameters option (-r or -R).
Example: python pargenes/pargenes.py -a msa_dir -o output_dir -c 32 -d nt -R "--model GTR"
Unless stated otherwise, we strongly recommend using absolute paths for any file or directory location.
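As an illustration of the absolute-path recommendation, the following is a minimal sketch that assembles the example command with the input and output directories converted to absolute paths. The helper name `build_pargenes_command` is hypothetical and not part of ParGenes. The script location `pargenes/pargenes.py` is taken from the example above and is assumed to be valid relative to the current directory.

```python
import os
import shlex


def build_pargenes_command(msa_dir, output_dir, cores, data_type, raxml_args,
                           script="pargenes/pargenes.py"):
    """Return the argument list for a ParGenes run (hypothetical helper)."""
    if data_type not in ("nt", "aa"):
        raise ValueError("data_type must be 'nt' or 'aa'")
    return [
        "python", script,
        "-a", os.path.abspath(msa_dir),     # directory containing the MSAs
        "-o", os.path.abspath(output_dir),  # output directory
        "-c", str(cores),                   # number of cores
        "-d", data_type,                    # sequence type: nt or aa
        "-R", raxml_args,                   # raxml arguments as a single string
    ]


command = build_pargenes_command("msa_dir", "output_dir", 32, "nt", "--model GTR")
# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(command))
```

The list form can also be passed directly to `subprocess.run`, which avoids shell quoting issues with the `-R` string.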
Command | Meaning |
---|---|
-a, --alignments-dir | Directory containing the input MSA files (fasta or phylip). ParGenes will try to parse all the files in this directory. |
-o, --output-dir | Output directory. If the directory does not exist, ParGenes creates it. Otherwise, ParGenes aborts unless you are running from a checkpoint (see the --continue option). |
-c, --cores | Number of cores allocated for this job. Do not exceed the number of physical cores available. Should be at least 2. |
-d, --data-type | Alignment type: nucleotides or amino acids. Possible values: {nt, aa}. |
--dry-run | Special mode that parses the MSAs and computes some statistics without running the analysis. In particular, it outputs an estimate of the maximum number of cores that can be assigned to this job without losing parallel efficiency. See also this section. |
--continue | Restart the analysis from the last checkpoint. Apart from this argument and the number of cores (--cores), please avoid changing any of the program inputs. For instance, use this option when your previous run stopped because of a hardware failure or because it reached a wall-time limit. |
--scheduler | Defines the scheduling strategy. Possible values: {split, onecore, openmp}. Please read this section. |
Command | Meaning |
---|---|
-r, --raxml-global-parameters | Path to a file containing one single line with the arguments to pass to all the raxml runs. For instance, the file can contain: --model GTR --brlen unscaled |
-R, --raxml-global-parameters-string | Alternative to --raxml-global-parameters: a quoted string with the arguments to pass to all the raxml runs. For instance: -R "--model GTR --brlen unscaled". |
--per-msa-raxml-parameters | Path to a file containing per-MSA raxml parameters |
-s, --random-starting-trees | Number of random starting trees |
-p, --parsimony-starting-trees | Number of parsimony starting trees |
-b, --bs-trees | Number of bootstrap trees to compute |
Command | Meaning |
---|---|
-m, --use-modeltest | Autodetect the model with ModelTest-NG before running raxml |
--modeltest-global-parameters | Path to a file containing the parameters to pass to ModelTest-NG |
--per-msa-modeltest-parameters | Path to a file containing per-MSA modeltest parameters |
--modeltest-criteria {AICc,AIC,BIC} | The criterion to use for best-fit model selection |
--modeltest-perjob-cores | Number of cores to assign to each modeltest job (at least 4) |
Command | Meaning |
---|---|
--use-astral | Run astral at the end, to generate a species tree from all the gene trees inferred with ParGenes |
--astral-global-parameters | Path to a file containing arguments to pass to Astral |
Command | Meaning |
---|---|
--msa-filter | Path to a file with a list of filenames to process. The file should contain filenames only, not paths. ParGenes will only process MSAs that are present both in the list and in the initial input directory. |
--core-assignment {high,medium,low} | Policy to decide the per-job number of cores (low favors a low per-job number of cores) |
--percentage-jobs-double-cores | Percentage (expressed as a value between 0 and 1) of jobs that will receive twice as many cores |
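Because --msa-filter expects filenames rather than paths, a filter file can be generated from a list of MSA paths by keeping only the base name of each entry. The sketch below is illustrative only: the helper `write_msa_filter` is hypothetical, the example paths are made up, and the one-filename-per-line layout is an assumption, since the exact list format is not specified here.

```python
import os
import tempfile


def write_msa_filter(msa_paths, filter_path):
    """Write MSA file names (directory part removed) to a filter file.

    Assumption: one file name per line.
    """
    names = [os.path.basename(path) for path in msa_paths]
    with open(filter_path, "w") as handle:
        for name in names:
            handle.write(name + "\n")
    return names


# Demonstration in a temporary directory, with made-up MSA paths.
with tempfile.TemporaryDirectory() as tmp:
    filter_file = os.path.join(tmp, "msa_filter.txt")
    kept = write_msa_filter(
        ["/data/msas/gene1.fasta", "/data/msas/gene2.fasta"], filter_file
    )
    with open(filter_file) as handle:
        filter_content = handle.read()

print(filter_content, end="")
```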
This paragraph applies to raxml, modeltest and astral. We take raxml as the example.
To pass the same parameters to all raxml runs (for instance, to set the same model for all the MSAs), use --raxml-global-parameters <file>. The file should contain a single line with all the raxml arguments you want to add.
Example of the content of this file:
--model GTR --brlen unscaled
All raxml runs started by ParGenes will be called with these arguments.
The same applies to modeltest options (--modeltest-global-parameters) and astral options (--astral-global-parameters).
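A global parameters file of this kind can be produced with a few lines of Python. This is a minimal sketch: the helper `write_global_parameters` and the file name `raxml_global.txt` are hypothetical, and it assumes that a trailing newline after the single line is acceptable.

```python
import os
import tempfile


def write_global_parameters(path, arguments):
    """Write all arguments on one single line (hypothetical helper)."""
    line = " ".join(arguments)
    if "\n" in line or "\r" in line:
        raise ValueError("arguments must fit on a single line")
    with open(path, "w") as handle:
        handle.write(line + "\n")  # assumption: a trailing newline is fine
    return line


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    parameters_file = os.path.join(tmp, "raxml_global.txt")
    written = write_global_parameters(
        parameters_file, ["--model", "GTR", "--brlen", "unscaled"]
    )
    with open(parameters_file) as handle:
        global_content = handle.read()

print(global_content, end="")
```

The resulting file path would then be given to --raxml-global-parameters, preferably as an absolute path.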
This applies to raxml and modeltest. We take raxml as the example.
To apply different options to each MSA, use the option --per-msa-raxml-parameters <file>.
The file should contain one line for each MSA for which you want to add options. Each line starts with the MSA file name (without its path!) followed by the arguments.
For instance:
msa1.fasta --model partition1.part
msa2.fasta --model partition2.part
msa3.fasta --model partition3.part
The same applies to modeltest options (--per-msa-modeltest-parameters).
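When there are many MSAs, the per-MSA file is easier to generate than to write by hand. The sketch below builds it from a dictionary that maps each MSA to its arguments, stripping any directory part so that each line starts with the file name only. The helper `write_per_msa_parameters` and the file name `per_msa_raxml.txt` are hypothetical. The partition file names are the ones from the example above.

```python
import os
import tempfile


def write_per_msa_parameters(path, per_msa_arguments):
    """Write one '<msa file name> <arguments>' line per MSA."""
    lines = []
    for msa, arguments in per_msa_arguments.items():
        name = os.path.basename(msa)  # keep the file name, drop the path
        lines.append(f"{name} {arguments}")
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines


# Demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    per_msa_file = os.path.join(tmp, "per_msa_raxml.txt")
    per_msa_lines = write_per_msa_parameters(per_msa_file, {
        "msa1.fasta": "--model partition1.part",
        "/some/dir/msa2.fasta": "--model partition2.part",
        "msa3.fasta": "--model partition3.part",
    })
    with open(per_msa_file) as handle:
        per_msa_content = handle.read()

print(per_msa_content, end="")
```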