swarbred edited this page Apr 29, 2024 · 5 revisions

Minos commands

A) Quick Minos run without BUSCO

Commands

1. Minos configure command:

cd /home/train/Annotation_workshop/Minos/Chr3-1065466-1464870
source activate_minos
minos configure --mikado-container /opt/images/mikado.img -o My_Minos_run --external-metrics External_metrics/external_metrics_workshop.txt --external Config_mikado/external.yaml --genus-identifier ARATH3702 --annotation-version EIv1.0 --use-tpm-for-picking --use-diamond Input_list/list_A_models.txt Config_mikado/scoring_template_v1.7.yaml Inputs/Reference/Athaliana_447_TAIR10_Chr3_clean.fa > minos.configure.My_minos_run.log 2>&1

2. Minos run command (completes in under 3 minutes):

minos run --no_drmaa --scheduler NONE --mikado-container /opt/images/mikado.img -o My_Minos_run > minos.run.My_minos_run.log 2>&1 &

3. Track progress of the Minos job

tail -f minos.run.My_minos_run.log |grep "^rule\|done"

NOTE: press Control+C to exit
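
If you would rather take a one-off snapshot than follow the log, a small helper like the one below may do. It is only a sketch: it assumes, as the grep pattern above implies, that the log contains lines beginning with "rule" and progress lines containing "done". The function name minos_progress is made up for this example and is not part of Minos.

```shell
# Summarise progress from a Minos run log without following it.
# Assumption: the log has lines starting with "rule" and progress
# lines containing "done" (the same lines the grep above selects).
minos_progress() {
    log=$1
    # grep -c prints 0 but exits non-zero when nothing matches; keep the count either way
    started=$(grep -c '^rule' "$log" || true)
    # most recent line mentioning "done", empty if there is none yet
    last_done=$(grep 'done' "$log" | tail -n 1)
    printf 'rules started: %s\n' "$started"
    printf 'last progress line: %s\n' "${last_done:-none yet}"
}

# Usage: minos_progress minos.run.My_minos_run.log
```

Unlike tail -f, this returns immediately, so it can be rerun whenever you want an update.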

4. Review output files

cd My_Minos_run/results; ls -ltr
  • *release.gff3 = GFF3 file of selected genes
  • *release.gff3.biotype_conf.summary = summary of gene and transcript counts for each biotype/confidence classification
-------------------  ----------  ----  ----------
Biotype              Confidence  Gene  Transcript
protein_coding_gene  High        111   173
protein_coding_gene  Low         14    14
predicted_gene       Low         2     2
Total                            127   189
-------------------  ----------  ----  ----------
  • *release.gff3.mikado_stats.txt.summary = summary gene model stats
--------------------------------  -------
Number of genes                    127
Number of Transcripts              189
Transcripts per gene                 1.49
Number of monoexonic genes          27
Monoexonic transcripts              29
Transcript mean size cDNA (bp)    1790.83
Transcript median size cDNA (bp)  1740
Min cDNA                            93
Max cDNA                          5675
Total exons                       1250
Exons per transcript                 6.61
Exon mean size (bp)                270.77
CDS mean size (bp)                 201.13
Transcript mean size CDS (bp)     1247.24
Transcript median size CDS (bp)   1173
Min CDS                             93
Max CDS                           5112
Intron mean size (bp)              157.05
5'UTR mean size (bp)               237.27
3'UTR mean size (bp)               306.32
--------------------------------  -------
  • *release.collapsed_metrics.summary.tsv = summary of the overall evidence support for gene models in the gene set
  • *release.gff3.final_table.tsv = basic information for each gene model
  • *release.gff3.cds.fasta, *release.gff3.cdna.fasta, *release.gff3.pep.fasta = fasta files for CDS, cDNA and proteins
  • *release.metrics_oddities.tsv = summary of number of models with selected (odd) characteristics
  • *release.busco_final_table.tsv = BUSCO results for the input gene sets and the final selected models
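
As a quick sanity check on the biotype/confidence summary, the per-class counts can be added up and compared with the Total row. The sketch below assumes the file is laid out exactly like the listing above (four whitespace-separated fields per data row, with the gene and transcript counts in fields 3 and 4). The function name sum_biotype_counts is hypothetical.

```shell
# Sum the Gene and Transcript columns of a *.biotype_conf.summary file.
# Assumption: data rows have four whitespace-separated fields with
# numeric counts in fields 3 and 4; ruler, header and Total rows do not.
sum_biotype_counts() {
    awk 'NF == 4 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
             genes += $3; transcripts += $4
         }
         END { print genes + 0, transcripts + 0 }' "$1"
}

# Usage (from My_Minos_run/results):
#   sum_biotype_counts *release.gff3.biotype_conf.summary
```

For the summary shown above this prints 127 189, which agrees with the Total row.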

5. Run mikado compare to see how well the Minos models match the reference annotation

mikado compare -r ../../Inputs/Ref_Annotation/Athaliana_447_Araport11.gene_exons.regionA.gtf -p ARATH3702_EIv1.0.release.gff3 -o mikado_compare.minos
221 reference RNAs in 122 genes
189 predicted RNAs in  127 genes
--------------------------------- |   Sn |   Pr |   F1 |
                        Base level: 85.30  89.30  87.25
            Exon level (stringent): 54.61  63.24  58.61
              Exon level (lenient): 84.20  91.05  87.49
                 Splice site level: 93.12  94.19  93.65
                      Intron level: 95.29  94.63  94.96
                 Intron level (NR): 91.17  91.89  91.53
                Intron chain level: 58.33  68.75  63.11
           Intron chain level (NR): 57.89  68.75  62.86
      Transcript level (stringent): 0.90  1.06  0.98
  Transcript level (>=95% base F1): 31.22  35.98  33.43
  Transcript level (>=80% base F1): 55.66  64.02  59.55
         Gene level (100% base F1): 1.64  1.57  1.61
        Gene level (>=95% base F1): 45.90  44.09  44.98
        Gene level (>=80% base F1): 81.15  77.95  79.52

6. How do the Mikado compare results for Minos compare against the EVM-Mikado results?

7. Try rerunning the pipeline (i.e. steps 1 and 2) using an alternative set of input models

see files under

Minos/Chr3-1065466-1464870/Input_list

list_ALL_models.txt list_B_models.txt  list_C_models.txt

NOTE: change the output name passed to -o for both minos configure and minos run
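
One way to keep the output name, input list and log names in step is to generate both commands from a single run name. The helper below is a sketch that only prints the commands from steps 1 and 2 with the name and list substituted; it executes nothing. The function name minos_commands and the example run name My_Minos_run_B are made up for illustration.

```shell
# Print the configure and run commands for a given run name and input list.
# Nothing is executed: review the two lines, then paste them into the shell.
minos_commands() {
    run_name=$1      # e.g. My_Minos_run_B (hypothetical name)
    input_list=$2    # e.g. Input_list/list_B_models.txt
    container=/opt/images/mikado.img
    # Same options as step 1, with the run name and input list substituted
    printf '%s\n' "minos configure --mikado-container $container -o $run_name --external-metrics External_metrics/external_metrics_workshop.txt --external Config_mikado/external.yaml --genus-identifier ARATH3702 --annotation-version EIv1.0 --use-tpm-for-picking --use-diamond $input_list Config_mikado/scoring_template_v1.7.yaml Inputs/Reference/Athaliana_447_TAIR10_Chr3_clean.fa > minos.configure.$run_name.log 2>&1"
    # Same options as step 2, writing to a log named after the run
    printf '%s\n' "minos run --no_drmaa --scheduler NONE --mikado-container $container -o $run_name > minos.run.$run_name.log 2>&1 &"
}

# Usage: minos_commands My_Minos_run_B Input_list/list_B_models.txt
```

Printing rather than running keeps the helper safe to experiment with, and gives each run its own log files.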

Run mikado compare (see step 5). Do the "Transcript level (>=80% base F1)" Sn, Pr and F1 values improve?
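
To compare one row across several runs without reading each table by eye, the three values can be pulled out by label. This sketch assumes the statistics table from step 5 is available as a text file in the layout shown there. mikado compare is expected to write it next to the other outputs as a .stats file under the -o prefix, but check the file names in your results directory. The function name compare_row is hypothetical.

```shell
# Print the Sn, Pr and F1 values for one row of a mikado compare statistics table.
# $1 = text file holding the table, $2 = row label exactly as printed (without the colon)
compare_row() {
    awk -v label="$2:" '
        { pos = index($0, label) }          # literal match, so ( ) > = need no escaping
        pos > 0 {
            rest = substr($0, pos + length(label))
            n = split(rest, v, " ")         # split on runs of whitespace
            if (n >= 3) { print v[1], v[2], v[3]; exit }
        }' "$1"
}

# Usage:
#   compare_row mikado_compare.minos.stats 'Transcript level (>=80% base F1)'
```

Running it once per results directory gives one line per run, which makes changes in the metric easy to spot.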

8. Try rerunning the pipeline (i.e. steps 1 and 2) with an alternative external metrics file (--external-metrics)

see files under

Minos/Chr3-1065466-1464870/External_metrics

external_metrics_workshop_alternative_1.txt (increases the multiplier for protein evidence)
external_metrics_workshop_alternative_2.txt (also increases the not_fragmentary_min_value for protein alignments)

or copy the file and try your own adjustments

e.g. for the RNA-Seq reads try changing _xx to _rf (stranded data) and changing the multiplier from 0.2 to 1
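
If you copy the file, it helps to work on the copy only and to review exactly what changed. The sketch below copies a metrics file, rewrites _xx to _rf on the lines matching a pattern you supply, and prints the differences. The layout of the metrics file is not shown here, so the pattern that selects the RNA-Seq lines is left to you, and the multiplier is best edited by hand afterwards. The function name make_stranded_copy is made up for this example.

```shell
# Copy a metrics file, changing _xx to _rf only on lines that match a pattern.
# $1 = source file, $2 = new file, $3 = basic regex selecting the lines to change
# Note: the pattern is placed inside a sed address, so it must not contain "/".
make_stranded_copy() {
    src=$1; dst=$2; pattern=$3
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    sed "/$pattern/ s/_xx/_rf/g" "$src" > "$dst"
    # diff exits non-zero when the files differ, which is expected here
    diff "$src" "$dst" || true
}

# Usage (the pattern 'RNA' is a placeholder; pick one that fits your file):
#   make_stranded_copy External_metrics/external_metrics_workshop.txt \
#       External_metrics/my_metrics.txt 'RNA'
```

Pass the new file to minos configure with --external-metrics once you are happy with the diff.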

9. Try rerunning the pipeline (i.e. steps 1 and 2) with an alternative mikado configuration file (--external)

see files under

Minos/Chr3-1065466-1464870/Config_mikado/

external_alternative_1.yaml

10. Try rerunning just from the pick stage

Enter the Minos run directory

cd My_Minos_run

Edit the mikado scoring file (for an introduction to vim, see https://opensource.com/article/19/3/getting-started-vim)

vim minos_run.scoring.yaml

e.g. try making the requirements section more lenient

Return to the Minos/Chr3-1065466-1464870 directory

cd ../..

Run minos run with --rerun-from pick

minos run --no_drmaa --scheduler NONE --mikado-container /opt/images/mikado.img -o My_Minos_run --rerun-from pick > minos.run.My_minos_run.log 2>&1 &

This reruns only the pick stage and takes about 30 seconds. Under My_Minos_run you will now have an archives folder holding the original results. The new results are under My_Minos_run/results
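
To see what the edited scoring file changed, the archived and new biotype/confidence summaries can be printed one after the other. The internal layout of the archives folder is not described here, so this sketch simply searches the whole run directory for files with the summary suffix listed in step 4. The function name show_summaries is hypothetical.

```shell
# Print every biotype/confidence summary found below a Minos run directory,
# each preceded by its path, so archived and new results can be compared.
show_summaries() {
    run_dir=$1
    find "$run_dir" -type f -name '*release.gff3.biotype_conf.summary' | sort |
    while IFS= read -r f; do
        printf '== %s\n' "$f"
        cat "$f"
    done
}

# Usage: show_summaries My_Minos_run
```

Because the paths are sorted, files under archives are listed before those under results.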

B) Slow Minos run with BUSCO (completes in under 14 minutes)

Commands

Minos configure command:

minos configure -f --mikado-container /opt/images/mikado.img --busco-level p --busco-lineage /home/train/data/brassicales_odb10 --busco-genome-run Inputs/Busco -o My_Minos_run_with_busco --external-metrics External_metrics/external_metrics_workshop.txt --external Config_mikado/external.yaml --genus-identifier ARATH3702 --annotation-version EIv1.0 --use-tpm-for-picking --use-diamond Input_list/list_ALL_models.txt Config_mikado/scoring_template_v1.7.yaml Inputs/Reference/Athaliana_447_TAIR10_Chr3_clean.fa > minos.configure.My_Minos_run_with_busco.log 2>&1

Minos run command:

minos run --no_drmaa --scheduler NONE --mikado-container /opt/images/mikado.img -o My_Minos_run_with_busco > minos.run.My_Minos_run_with_busco.log 2>&1
