- Clone repository
- Install dependencies
- Download databases
- Optionally add pHMMs for domain annotation
1. Clone the repository:
git clone https://github.com/mcsimenc/PhyLTR.git
2. Install the following however you can, then add the program paths to the CONFIG file in the PhyLTR root directory.
Parts of PhyLTR require only certain dependencies. See README.md for an explanation of dependency requirements for each process.
- BEDtools
- MAFFT
- FastTree2
- trimAl
- jModelTest2
- GenomeTools
- GENECONV
- PAUP*
- Circos
- PATHd8
- EMBOSS
- PHYLIP
- HMMER3
- NCBI BLAST+
- MCL
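As a rough aid for filling in the CONFIG file, the sketch below looks a tool up on PATH and prints a CONFIG-style line for it. The function name `config_line` is hypothetical, and the executable names you pass in (for example `gt` for GenomeTools) are assumptions that depend on how each program was installed.

```shell
#!/bin/sh
# config_line KEY EXECUTABLE
# Print "KEY=/absolute/path" if EXECUTABLE is found on PATH.
# Otherwise write a note to stderr and return non-zero.
# (Hypothetical helper; not part of PhyLTR.)
config_line() {
    key=$1
    exe=$2
    if found=$(command -v "$exe" 2>/dev/null); then
        printf '%s=%s\n' "$key" "$found"
    else
        printf '# %s: %s not found on PATH\n' "$key" "$exe" >&2
        return 1
    fi
}
```

For instance, `config_line mafft mafft` prints a `mafft=` line when MAFFT is on PATH. Entries that expect a directory rather than a file still need to be written by hand.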
The CONFIG file has the format key=path, where key must be exactly as shown below and path points to either the program file itself or the directory containing the program, depending on the dependency.
bedtools=file # bedtools executable
mafft=file # mafft executable
fasttree=file # fasttree executable
trimal=file # trimal executable
jmodeltest2=file # jModelTest.jar
genometools=file # gt executable
geneconv=file # geneconv executable
paup=file # paup executable
rscript=file # Rscript executable
perl=file # perl executable
circos=file # circos executable
pathd8=file # PATHd8 executable
getorf=file # EMBOSS getorf executable
phylip=directory # the bin/ directory in the PHYLIP installation
hmmer=directory # the binaries/ directory in the HMMER3 installation
blast=directory # the bin/ directory in the BLAST+ installation
mcl=directory # the bin/ directory in the MCL installation
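Before running PhyLTR it can help to confirm that every path in CONFIG actually exists. The following is a minimal sketch of such a check, not something shipped with PhyLTR. The function names `trim` and `check_config` are hypothetical, and the sketch assumes one key=path entry per line with anything after a `#` treated as a comment.

```shell
#!/bin/sh
# Strip leading and trailing whitespace from stdin.
trim() {
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# check_config FILE
# Print one "missing: key -> path" line for each entry whose path does not
# exist, and return non-zero if any entry is missing.
# Assumption: text after '#' is a comment and is ignored.
check_config() {
    rc=0
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%%#*}            # drop trailing comment, if any
        case $line in
            *=*) ;;                 # looks like key=path
            *) continue ;;          # skip blank or malformed lines
        esac
        key=$(printf '%s\n' "${line%%=*}" | trim)
        target=$(printf '%s\n' "${line#*=}" | trim)
        if [ ! -e "$target" ]; then
            printf 'missing: %s -> %s\n' "$key" "$target"
            rc=1
        fi
    done < "$1"
    return "$rc"
}
```

Running `check_config PhyLTR/CONFIG` from the directory containing the clone would list any entries that still point to nonexistent files or directories.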
A. Download http://dfam.org/releases/Dfam_3.1/families/Dfam.hmm.gz and unpack it.
PhyLTR/RepeatDatabases/Dfam/Dfam_ERV_LTR.hmm
PhyLTR/RepeatDatabases/Dfam/Dfam_ERV_LTR.SF
PhyLTR/RepeatDatabases/Dfam/Dfam_ERV_LTR.list
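The Dfam download-and-unpack step above could be scripted roughly as follows. This is only a sketch: it assumes `curl` and `gzip` are installed, and the function names `download_dfam` and `unpack_gz` are hypothetical. It does not cover how the Dfam_ERV_LTR files listed above are derived from the unpacked library.

```shell
#!/bin/sh
# URL taken from the download step above.
DFAM_URL="http://dfam.org/releases/Dfam_3.1/families/Dfam.hmm.gz"

# download_dfam DIR
# Fetch the compressed Dfam library into DIR (requires network access).
download_dfam() {
    curl --fail --location --output "$1/Dfam.hmm.gz" "$DFAM_URL"
}

# unpack_gz FILE.gz
# Decompress next to the archive, keeping the original .gz file.
unpack_gz() {
    gzip -dc "$1" > "${1%.gz}"
}
```

A typical run would be `download_dfam . && unpack_gz ./Dfam.hmm.gz`, which leaves `Dfam.hmm` in the current directory.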
A. Get an account with GIRI (no longer free).
1. Go to http://www.girinst.org/repbase/update/browse.php
2. Select LTR Retrotransposon from the Repeat class drop-down list.
3. Select FASTA from the Output format drop-down list.
4. Click the Download button, sign in, and download the text page that opens.
5. Repeat steps 2-4, but select Endogenous Retrovirus from the Repeat class drop-down list.
6. Run:
cat <LTR.fa> <ERV.fa> >> Repbase_ERV_LTR.fasta
7. Move the new file from step 6 to:
PhyLTR/RepeatDatabases/Repbase/Repbase_ERV_LTR.fasta
B. Select IG from the Output format drop-down list, download the LTR Retrotransposon and Endogenous Retrovirus entries in IG format, then concatenate both files as in A.6 into Repbase.LTR-ERV-concatenated.IG
C. Run: PhyLTR/scripts/RepbaseIG2superfamilies.py < Repbase.LTR-ERV-concatenated.IG > Repbase_ERV_LTR.SF
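The concatenate-and-move steps that close part A can be combined into one small helper, sketched below. The function name `combine_repbase` and its argument order are hypothetical. Unlike the `cat ... >>` command in the instructions, this sketch uses `>` so that re-running it overwrites the output instead of appending duplicate records.

```shell
#!/bin/sh
# combine_repbase LTR_FASTA ERV_FASTA PHYLTR_ROOT
# Concatenate the two Repbase FASTA downloads and write the result to the
# location given in the instructions, relative to the PhyLTR root directory.
combine_repbase() {
    ltr=$1
    erv=$2
    root=$3
    dest_dir=$root/RepeatDatabases/Repbase
    mkdir -p "$dest_dir" &&
    cat "$ltr" "$erv" > "$dest_dir/Repbase_ERV_LTR.fasta"
}
```

For example, `combine_repbase LTR.fa ERV.fa PhyLTR` writes `PhyLTR/RepeatDatabases/Repbase/Repbase_ERV_LTR.fasta`. The same pattern would apply to the IG files in step B, with a different output name.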
PhyLTR/RepeatDatabases/Repbase/Repbase_ERV_LTR.fasta
PhyLTR/RepeatDatabases/Repbase/Repbase_ERV_LTR.SF
PhyLTR/RepeatDatabases/Repbase/Repbase_ERV_LTR.list
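If the Dfam and Repbase paths listed in this section are read as the files that should be present once setup is complete (an assumption on my part, since the text does not say so explicitly), a quick presence check might look like this. The function name `list_missing_dbs` is hypothetical.

```shell
#!/bin/sh
# list_missing_dbs PHYLTR_ROOT
# Print each database file from this section that is absent under
# PHYLTR_ROOT, and return non-zero if any file is missing.
list_missing_dbs() {
    root=$1
    missing=0
    for rel in \
        RepeatDatabases/Dfam/Dfam_ERV_LTR.hmm \
        RepeatDatabases/Dfam/Dfam_ERV_LTR.SF \
        RepeatDatabases/Dfam/Dfam_ERV_LTR.list \
        RepeatDatabases/Repbase/Repbase_ERV_LTR.fasta \
        RepeatDatabases/Repbase/Repbase_ERV_LTR.SF \
        RepeatDatabases/Repbase/Repbase_ERV_LTR.list
    do
        if [ ! -f "$root/$rel" ]; then
            printf 'missing: %s\n' "$rel"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `list_missing_dbs PhyLTR` prints nothing and returns 0 when all six files are in place.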
The version included in the repository contains pHMMs for TE-related domains from Pfam and gydb.org, downloaded in summer 2018.