Merge pull request #190 from labgem/fix_doc_links

Fix doc links
labgem · Mar 12, 2024 · 69750f2
2 parents ed7bbfe + a68f089

Showing 7 changed files with 25 additions and 25 deletions.
diff --git a/docs/user/PangenomeAnalyses/pangenomeAnnotation.md b/docs/user/PangenomeAnalyses/pangenomeAnnotation.md
@@ -8,7 +8,7 @@ If you do so, the provided genomes will be annotated using the following tools:
 - [ARAGORN](http://www.ansikte.se/ARAGORN/) to annotate tRNAs
 - [Infernal](http://eddylab.org/infernal/) coupled with HMM of the bacterial and archaeal rRNAs downloaded from [RFAM](https://rfam.xfam.org/) to annotate rRNAs.
 
-To proceed with this stage of the pipeline, you need to create an **organisms.fasta.list** file. 
+To proceed with this stage of the pipeline, you need to create an **genomes.fasta.list** file. 
 This file should be tab-separated with each line depicting an individual genome and
 its pertinent information with the following organization (only the first two columns are mandatory):
 
@@ -17,12 +17,12 @@ its pertinent information with the following organization (only the first two co
 - The following columns contain Contig identifiers present in the associated FASTA file that should be analyzed as being circular.
 For the 'circular contig identifiers,' if you do not have access to this information, you can safely ignore this part as it does not have a big impact on the resulting pangenome.
 
-You can check [this example input file](https://github.com/labgem/PPanGGOLiN/blob/master/testingDataset/organisms.fasta.list).
+You can check [this example input file](https://github.com/labgem/PPanGGOLiN/blob/master/testingDataset/genomes.fasta.list).
 
 To run the annotation part, you can use this minimal command:
 
 ```
-ppanggolin annotate --fasta organisms.fasta.list
+ppanggolin annotate --fasta genomes.fasta.list
 ```
 
 #### Use a different genetic code in my annotation step
@@ -48,7 +48,7 @@ to specify Infernal's RNA annotation model.
 
 ### Use annotation files for your pangenome
 
-You can provide annotation files in either gff3 files or .gbk/.gbff files, or a mix of them. They should be provided through as a list in a tab-separated file that follows the same format as described for the fasta files. You can check [this example input file](https://github.com/labgem/PPanGGOLiN/blob/master/testingDataset/organisms.gbff.list).
+You can provide annotation files in either gff3 files or .gbk/.gbff files, or a mix of them. They should be provided through as a list in a tab-separated file that follows the same format as described for the fasta files. You can check [this example input file](https://github.com/labgem/PPanGGOLiN/blob/master/testingDataset/genomes.gbff.list).
 
 ```{note}
 Use your own annotation for your genome is highly recommended, particularly if you already
@@ -58,7 +58,7 @@ have functional annotations, as they can be added to the pangenome.
 You can provide them using the following command: 
 
 ```
-ppanggolin annotate --anno organisms.gbff.list
+ppanggolin annotate --anno genomes.gbff.list
 ```
 
 #### How to deal with annotation files without sequences
@@ -67,7 +67,7 @@ If your annotation files do not contain the genome sequence,
 you can use both options simultaneously to obtain the gene annotations and gene sequences, as follows: 
 
 ```
-ppanggolin annotate --anno organisms.gbff.list --fasta organisms.fasta.list
+ppanggolin annotate --anno genomes.gbff.list --fasta genomes.fasta.list
 ```
 
 #### Take the pseudogenes into account for pangenome analyses

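The hunks above describe the renamed **genomes.fasta.list** input: a tab-separated file with one genome per line, a unique genome name in the first column and the FASTA path in the second. A minimal Python sketch of generating such a file from a directory of FASTA files follows. The helper name `write_genome_list`, the accepted suffixes, and the choice to derive the genome name from the file name are assumptions made for illustration; they are not part of PPanGGOLiN. The optional circular-contig columns are left out.

```python
from pathlib import Path


def write_genome_list(fasta_dir, list_path, suffixes=(".fasta", ".fna", ".fa")):
    """Write one 'name<TAB>path' line per FASTA file found in fasta_dir.

    Hypothetical helper: the genome name is taken from the file name.
    """
    rows = []
    for fasta in sorted(Path(fasta_dir).iterdir()):
        # Files may be gzip-compressed: drop ".gz" before checking the suffix.
        base = fasta.with_suffix("") if fasta.suffix == ".gz" else fasta
        if fasta.is_file() and base.suffix in suffixes:
            rows.append((base.stem, str(fasta)))

    # The first column must hold a unique genome name.
    names = [name for name, _ in rows]
    if len(names) != len(set(names)):
        raise ValueError("genome names must be unique")

    Path(list_path).write_text("".join(f"{name}\t{path}\n" for name, path in rows))
    return rows
```

The resulting file can then be passed to the commands shown in the diff, for example `ppanggolin annotate --fasta genomes.fasta.list`.
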
diff --git a/docs/user/PangenomeAnalyses/pangenomeGraphOut.md b/docs/user/PangenomeAnalyses/pangenomeGraphOut.md
@@ -12,7 +12,7 @@ Using Gephi, the layout can be tuned as illustrated below:
 We advise the Gephi "Force Atlas 2" algorithm to compute the graph layout with "Stronger Gravity: on" and "scaling: 4000" but don't hesitate to tinker with the layout parameters.
 
 In the _light.gexf file : 
-The nodes will contain the number of genes belonging to the gene family, the most common gene name (if you provided annotations), the most common product name (if you provided annotations in your GFF or GBFF input files), the partitions it belongs to, its average and median size in nucleotides, and the number of organisms that have this gene family. If spots or modules are computed, it also indicates if a node belongs to them. Finally, this file also outputs the imported metadata regarding each gene family.
+The nodes will contain the number of genes belonging to the gene family, the most common gene name (if you provided annotations), the most common product name (if you provided annotations in your GFF or GBFF input files), the partitions it belongs to, its average and median size in nucleotides, and the number of genomes that have this gene family. If spots or modules are computed, it also indicates if a node belongs to them. Finally, this file also outputs the imported metadata regarding each gene family.
 
 The edges contain the number of times they are present in the pangenome.
 

diff --git a/docs/user/PangenomeAnalyses/pangenomeWorkflow.md b/docs/user/PangenomeAnalyses/pangenomeWorkflow.md
@@ -45,20 +45,20 @@ To use this command, you need to provide a tab-separated list of either annotati
 
 You can use the workflow with annotation files as such: 
 ```
-ppanggolin workflow --anno organism.gbff.list
+ppanggolin workflow --anno genomes.gbff.list
 ```
 
 For fasta files, you have to change for: 
 ```
-ppanggolin workflow --fasta organism.fasta.list
+ppanggolin workflow --fasta genomes.fasta.list
 ```
 
 Moreover, as detailed [in the section about providing your gene families](./pangenomeAnalyses.md#read-clustering), 
 if you wish to use different gene clustering methods than those provided by PPanGGOLiN,
 it is also possible to provide your own clustering results with the workflow command as such:
 
 ```
-ppanggolin workflow --anno organism.gbff.list --clusters clusters.tsv
+ppanggolin workflow --anno genomes.gbff.list --clusters clusters.tsv
 ```
 
 All the workflow parameters are obtained from the commands explained below, except for the `--no_flat_files` option, which solely pertains to it. This option prevents the automatic generation of the output files listed and described [in the pangenome output section](./pangenomeAnalyses.md#pangenome-outputs).

diff --git a/docs/user/QuickUsage/quickWorkflow.md b/docs/user/QuickUsage/quickWorkflow.md
@@ -63,24 +63,24 @@ The minimal subcommand only need your own annotations files (using `.gff` or `.g
 as long as they include the genomic dna sequences, such as the ones provided by Prokka or Bakta.
 
 ```bash
-ppanggolin all --anno organism.gbff.list
+ppanggolin all --anno genomes.gbff.list
 ```
 
 It uses parameters that we found to be generally the best when working with species pangenomes.
 
-The file **organism.gbff.list** is a tab-separated file with the following organisation :
+The file **genomes.gbff.list** is a tab-separated file with the following organisation :
 
 1. The first column contains a unique genome name
 2. The second column the path to the associated annotation file
 3. Each line represents a genome
 
-An example with 50 _Chlamydia trachomatis_ genomes can be found in the [testingDataset](https://github.com/labgem/PPanGGOLiN/blob/master/testingDataset/organisms.gbff.list) directory.
+An example with 50 _Chlamydia trachomatis_ genomes can be found in the [testingDataset](https://github.com/labgem/PPanGGOLiN/blob/master/testingDataset/genomes.gbff.list) directory.
 
 [//]: # (### PPanGGOLiN: Pangenome analyses from list of fasta files)
 You can also give PPanGGOLiN `.fasta` files, such as:
 
 ```
-ppanggolin all --fasta organism.fasta.list
+ppanggolin all --fasta genomes.fasta.list
 ```
 
 Again you must use a tab-separated file but this time with the following organisation:
@@ -90,7 +90,7 @@ Again you must use a tab-separated file but this time with the following organis
 3. Circular contig identifiers are indicated in the following columns
 4. Each line represents a genome
 
-Same, an example can be found in the [testingDataset](https://github.com/labgem/PPanGGOLiN/blob/master/testingDataset/organisms.fasta.list) directory.
+Same, an example can be found in the [testingDataset](https://github.com/labgem/PPanGGOLiN/blob/master/testingDataset/genomes.fasta.list) directory.
 
 ```{tip}
 Downloading genomes from NCBI refseq or genbank for a species of interest can be easily accomplished using CLI tools like [ncbi-genome-download](https://github.com/kblin/ncbi-genome-download) or the [genome updater](https://github.com/pirovc/genome_updater) script.

diff --git a/docs/user/RGP/rgpPrediction.md b/docs/user/RGP/rgpPrediction.md
@@ -59,12 +59,12 @@ graph LR
 
 You can use the `panrgp` with annotation (gff3 or gbff) files with `--anno` option, as such: 
 ```bash
-ppanggolin panrgp --anno organism.gbff.list
+ppanggolin panrgp --anno genomes.gbff.list
 ```
 
 For fasta files, you need to use the alternative `--fasta` option, as such:
 ```bash
-ppanggolin panrgp --fasta organism.fasta.list
+ppanggolin panrgp --fasta genomes.fasta.list
 ```
 
 Just like [workflow](../PangenomeAnalyses/pangenomeAnalyses.md#workflow), this command will deal with the [annotation](../PangenomeAnalyses/pangenomeAnalyses.md#annotation), [clustering](../PangenomeAnalyses/pangenomeAnalyses.md#compute-pangenome-gene-families), [graph](../PangenomeAnalyses/pangenomeAnalyses.md#graph) and [partition](../PangenomeAnalyses/pangenomeAnalyses.md#partition) commands by itself.

diff --git a/docs/user/practicalInformation.md b/docs/user/practicalInformation.md
@@ -85,12 +85,12 @@ ppanggolin utils --default_config panrgp
 
 ```yaml
 input_parameters:
-    # A tab-separated file listing the organism names, and the fasta filepath of its
-    # genomic sequence(s) (the fastas can be compressed with gzip). One line per organism.
+    # A tab-separated file listing the genome names, and the fasta filepath of its
+    # genomic sequence(s) (the fastas can be compressed with gzip). One line per genome.
   # fasta: <fasta file>
-    # A tab-separated file listing the organism names, and the gff/gbff filepath of
+    # A tab-separated file listing the genome names, and the gff/gbff filepath of
     # its annotations (the files can be compressed with gzip). One line
-    # per organism. If this is provided, those annotations will be used.
+    # per genome. If this is provided, those annotations will be used.
   # anno: <anno file>
 
 general_parameters:

diff --git a/docs/user/writeGenomes.md b/docs/user/writeGenomes.md
@@ -2,7 +2,7 @@
 
 The `write_genomes` command creates 'flat' files representing genomes with their pangenome annotations.
 
-To generate output for specific genomes, use the `--organisms` argument. This argument accepts a list of organism names, either directly entered in the command line (comma-separated) or referenced from a file where each line contains a single organism name.
+To generate output for specific genomes, use the `--genomes` argument. This argument accepts a list of genome names, either directly entered in the command line (comma-separated) or referenced from a file where each line contains a single genome name.
 
 
 ### Genes table with pangenome annotations
@@ -20,7 +20,7 @@ The following table outlines the columns present in the generated files:
 | stop                 | Stop position of the gene                                                  |
 | strand               | Gene location strand                                      |
 | family               | ID of the gene's associated family in the pangenome             |
-| nb_copy_in_org       | Number of copies of a family present in the organism; 1 indicates no close paralogs |
+| nb_copy_in_org       | Number of copies of a family present in the genome; 1 indicates no close paralogs |
 | partition            | Gene family partition in the pangenome                  |
 | persistent_neighbors | Number of neighbors classified as 'persistent' in the pangenome graph        |
 | shell_neighbors      | Number of neighbors classified as 'shell' in the pangenome graph             |
@@ -137,9 +137,9 @@ PPanGGOLiN allows the incorporation of fasta sequences into GFF files and prokse
 
 Since PPanGGOLiN does not retain genomic sequences, it is necessary to provide the original genomic files used to construct the pangenome through either the `--anno` or `--fasta` argument. These arguments mirror those used in workflow commands (`workflow`, `all`, `panrgp`, `panmodule`) and the `annotate` command.
 
-- `--anno`: This option requires a tab-separated file containing organism names and the corresponding GFF/GBFF file paths of their annotations. If `--anno` is utilized, GFF files should include fasta sequences.
+- `--anno`: This option requires a tab-separated file containing genome names and the corresponding GFF/GBFF file paths of their annotations. If `--anno` is utilized, GFF files should include fasta sequences.
 
-- `--fasta`: Use this option with a tab-separated file that lists organism names alongside the filepaths of their genomic sequences in fasta format.
+- `--fasta`: Use this option with a tab-separated file that lists genome names alongside the filepaths of their genomic sequences in fasta format.
 
 
 ### Incorporating Metadata into Tables, GFF, and Proksee Files
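
This pull request replaces the `organism.*.list` and `organisms.*.list` file names with `genomes.*.list` across the documentation. One way to check that no old name remains is a small scan of the Markdown files, sketched below. The function name `find_stale_names` and the regular expression are assumptions for illustration; the repository may use a different check, or none.

```python
import re
from pathlib import Path

# Old names replaced in this pull request:
# organism(s).fasta.list and organism(s).gbff.list.
STALE = re.compile(r"\borganisms?\.(?:fasta|gbff)\.list\b")


def find_stale_names(docs_dir):
    """Return (path, line number, line) for each line still using an old name."""
    hits = []
    for md in sorted(Path(docs_dir).rglob("*.md")):
        text = md.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if STALE.search(line):
                hits.append((str(md), lineno, line.strip()))
    return hits
```

Calling `find_stale_names("docs")` from the repository root returns an empty list when every reference has been updated.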