Phylogenetic Analysis

The following describes the steps used to compile accessions, generate multiple sequence alignments, concatenate them, build the tree, and visualize it.

Sequence Acquisition

A list of type sequences containing ITS, TUB, and TEF accessions for Pestalotiopsis, including an outgroup using the species Neopestalotiopsis saprophytica, was constructed from prior work [1].

accessions.csv was created with the compiled list of accessions. The list was curated by removing any repeat species and species outside of the target genus.
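The curation rule described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the layout of accessions.csv is not described here, so the column names (`species`, `ITS`, `TUB`, `TEF`), the function `curate`, and the example rows (including the placeholder accession values) are all assumptions.

```python
import csv
import io

TARGET_GENUS = "Pestalotiopsis"
OUTGROUP = "Neopestalotiopsis saprophytica"

def curate(rows):
    """Keep the first row per species, limited to the target genus plus the outgroup."""
    seen = set()
    kept = []
    for row in rows:
        species = row["species"].strip()
        genus = species.split()[0] if species else ""
        if genus != TARGET_GENUS and species != OUTGROUP:
            continue  # species outside the target genus
        if species in seen:
            continue  # repeat species
        seen.add(species)
        kept.append(row)
    return kept

# Made-up table: species epithets and accession values are placeholders.
example = io.StringIO(
    "species,ITS,TUB,TEF\n"
    "Pestalotiopsis exampleA,ITS_1,TUB_1,TEF_1\n"
    "Pestalotiopsis exampleA,ITS_2,TUB_2,TEF_2\n"
    "Othergenus exampleB,ITS_3,TUB_3,TEF_3\n"
    "Neopestalotiopsis saprophytica,ITS_4,TUB_4,TEF_4\n"
)
curated = curate(list(csv.DictReader(example)))
kept_species = [row["species"] for row in curated]
```

Keeping the first occurrence of each species is one possible tie-break; the actual choice between duplicate entries may have been made by hand.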

Retrieving Gene Sequences from Subject Genome

The ITS, TUB, and TEF regions of the subject Pestalotiopsis genome were extracted with seqkit v2.0.0 using the following command, supplied with primers.tsv, which contains primers targeting the regions of interest:

cat consensus.fasta | seqkit amplicon -j 16 -m 2 -p primers.tsv --bed
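In the command above, `-m 2` allows up to two mismatches when matching each primer. The idea behind primer-anchored extraction can be sketched as follows. This is a conceptual illustration only, not seqkit's implementation: it searches a single strand, ignores degenerate bases, and the function names and toy sequences are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_with_mismatches(seq, primer, max_mismatch):
    """Index of the first site matching primer with <= max_mismatch substitutions, else -1."""
    for i in range(len(seq) - len(primer) + 1):
        window = seq[i:i + len(primer)]
        mismatches = sum(1 for a, b in zip(window, primer) if a != b)
        if mismatches <= max_mismatch:
            return i
    return -1

def amplicon(seq, forward, reverse, max_mismatch=2):
    """Return the region spanning the forward primer site through the reverse primer site."""
    seq = seq.upper()
    forward = forward.upper()
    reverse_rc = revcomp(reverse)  # the reverse primer binds the opposite strand
    start = find_with_mismatches(seq, forward, max_mismatch)
    if start < 0:
        return None
    tail_offset = start + len(forward)
    hit = find_with_mismatches(seq[tail_offset:], reverse_rc, max_mismatch)
    if hit < 0:
        return None
    return seq[start:tail_offset + hit + len(reverse_rc)]

# Toy template: flank + forward site + insert + reverse site + flank
template = "GGGG" + "ACGTAC" + "TTTTTTTT" + "GATTCA" + "CCCC"
exact = amplicon(template, "ACGTAC", "TGAATC", max_mismatch=0)

# Same template with one substitution inside the forward primer site
mutated = "GGGG" + "ACGAAC" + "TTTTTTTT" + "GATTCA" + "CCCC"
tolerant = amplicon(mutated, "ACGTAC", "TGAATC", max_mismatch=1)
strict = amplicon(mutated, "ACGTAC", "TGAATC", max_mismatch=0)
```

With real primers of around 20 bases, a tolerance of two mismatches is far less likely to produce spurious hits than it would be with the six-base toy primers used here.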

The output for each region was saved as an individual FASTA file:

Sequence Compilation

A utility script was written to download the accessions listed in the source CSV from NCBI GenBank using Biopython and compile them into three separate locus-specific FASTA files.

The utility also appends the subject genome's locus-specific sequences to the corresponding set of accession sequences downloaded by the script.

The utility takes a single argument, an email address, which the Entrez API requires.

 python accession_downloader.py [email protected]
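The script itself is not reproduced here. As a rough sketch of how such a download could look with Biopython's `Bio.Entrez` module, consider the following. The CSV column names (`ITS`, `TUB`, `TEF`), both function names, and the example rows are assumptions and may not match accession_downloader.py.

```python
import csv
import io

LOCI = ("ITS", "TUB", "TEF")

def accessions_by_locus(csv_text):
    """Group accession IDs by locus, skipping empty cells."""
    grouped = {locus: [] for locus in LOCI}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for locus in LOCI:
            accession = (row.get(locus) or "").strip()
            if accession:
                grouped[locus].append(accession)
    return grouped

def fetch_fasta(accessions, email):
    """Fetch FASTA text for a list of accessions from the NCBI nucleotide database."""
    # Imported here so the grouping helper above works without Biopython installed.
    from Bio import Entrez
    Entrez.email = email  # NCBI asks every Entrez client to identify itself
    handle = Entrez.efetch(
        db="nucleotide", id=",".join(accessions), rettype="fasta", retmode="text"
    )
    try:
        return handle.read()
    finally:
        handle.close()

# Placeholder rows; these are not real GenBank accessions.
sample = (
    "species,ITS,TUB,TEF\n"
    "sp1,ACC_I1,ACC_B1,ACC_E1\n"
    "sp2,ACC_I2,,ACC_E2\n"
)
grouped = accessions_by_locus(sample)
```

Calling `fetch_fasta(grouped["ITS"], email)` would then return the ITS records as one FASTA string, to which the subject genome's ITS sequence could be appended before writing the combined file.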

The resulting files, which contain all of the accessions used in the phylogenetic analysis, including sequences from the outgroup and the subject genome, are as follows:

Sequence Alignment

The sequences for each of the three loci were aligned with MAFFT v7.453 using the --auto setting.

mafft --auto files/its_combined.fasta > its_aligned.fasta
mafft --auto files/tub_combined.fasta > tub_aligned.fasta
mafft --auto files/tef_combined.fasta > tef_aligned.fasta

The alignments were then trimmed and concatenated using MEGA 11.
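Trimming and concatenation were done interactively in MEGA, so there is no command to show for this step. To illustrate what concatenation produces, the sketch below joins per-locus alignments by taxon label and pads a missing locus with gaps. It assumes the FASTA headers have already been harmonized so that the same taxon carries the same label in every alignment, which is not the case for raw GenBank headers. The function names are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into {label: sequence}, using the first word of each header."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def concatenate(alignments):
    """Join aligned loci per taxon; alignments is a list of {taxon: aligned_seq} dicts."""
    taxa = sorted(set().union(*alignments))
    # Every sequence within one alignment has the same length, so read it from any entry.
    lengths = [len(next(iter(aln.values()))) for aln in alignments]
    combined = {}
    for taxon in taxa:
        parts = [aln.get(taxon, "-" * n) for aln, n in zip(alignments, lengths)]
        combined[taxon] = "".join(parts)
    return combined

# Toy alignments: taxon "b" lacks the second locus, taxon "c" lacks the first.
its = read_fasta(">a\nAC-G\n>b\nACTG\n")
tub = read_fasta(">a\nTT\n>c\nTA\n")
combined = concatenate([its, tub])
```

Padding with gaps keeps every row the same length, which tree-building programs require of a concatenated alignment.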

Tree Building

A Maximum Likelihood (ML) tree was constructed in MEGA 11 from the concatenated ITS+TUB+TEF alignment using default settings and 100 bootstrap replicates.
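Bootstrap support is obtained by resampling alignment columns with replacement, rebuilding a tree from each pseudo-replicate, and counting how often each clade recurs. The sketch below shows only the resampling step, for illustration; the tree inference itself was done in MEGA. The function name and the toy alignment are hypothetical.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }

alignment = {"a": "ACGT", "b": "ACGA", "c": "TCGA"}
rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_replicate(alignment, rng) for _ in range(100)]
```

Each replicate has the same taxa and length as the original alignment, but some columns appear more than once and others not at all.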

The final tree was saved in Newick format: final_tree.nwk

Visualization

Footnotes

  1. Maharachchikumbura SS, Hyde KD, Groenewald JZ, Xu J, Crous PW. Pestalotiopsis revisited. Stud Mycol. 2014 Sep;79:121-86. doi: 10.1016/j.simyco.2014.09.005. PMID: 25492988; PMCID: PMC4255583.