
Extra Multiple Sequence Alignment

Andrea Telatin edited this page Nov 19, 2021 · 11 revisions

Try to perform a small MSA

Multiple sequence alignment (MSA) identifies the homologous regions in a set of sequences, and the "edit paths" from one sequence to another.

The tools

There are different options for MSA, and the ideal choice depends on the nature of the dataset (sequence size, dataset size, nucleotide or protein...). A good summary of the available options is in this review.

During the workshop we will use:

The file formats: MSA

A common output format is "FASTA with gaps", where gaps are written as "-" so that every aligned sequence has the same length:

>seq1
--CAGTCGATCGGTAGCAGCTGACGTAGCAG--GAAGCT
>seq2
GGCAGTCGATC-GTAGCAGCTGACGTAGCAG--GAAGCT
>seq3
--CAGTCGATCGGTAGCAGCTGACGTAGCAG--CTAGC-
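
As a quick sanity check on a gapped FASTA file, one can verify that all rows have the same length and count a few simple column statistics. The following is a minimal Python sketch, not part of the workshop tools: the helper `read_fasta` and the variable names are hypothetical, and the alignment is the example above pasted in as a string rather than read from a file.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of name -> sequence (hypothetical helper)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep the first word after '>' as the sequence name
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            # Sequence line: append to the current record (handles wrapped lines)
            records[name] += line
    return records


# The example alignment from this page, embedded as a string for illustration
ALIGNMENT = """\
>seq1
--CAGTCGATCGGTAGCAGCTGACGTAGCAG--GAAGCT
>seq2
GGCAGTCGATC-GTAGCAGCTGACGTAGCAG--GAAGCT
>seq3
--CAGTCGATCGGTAGCAGCTGACGTAGCAG--CTAGC-
"""

records = read_fasta(ALIGNMENT)

# In a valid alignment every row has the same number of columns
lengths = {len(seq) for seq in records.values()}
is_aligned = len(lengths) == 1

# Transpose the rows into columns to inspect the alignment position by position
columns = list(zip(*records.values()))
gap_only = sum(1 for col in columns if set(col) == {"-"})
conserved = sum(1 for col in columns if len(set(col)) == 1 and "-" not in col)

print("sequences:", len(records))
print("same length:", is_aligned, "columns:", len(columns))
print("gap-only columns:", gap_only)
print("fully conserved columns:", conserved)
```

Columns that contain only gaps, as in this example, carry no information and are usually a sign that the file was trimmed or subsetted from a larger alignment.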