doubletrouble


The main goal of doubletrouble is to identify duplicated genes from whole-genome protein sequences and classify them by mode of duplication. Duplicates can be classified using four classification schemes that increase in complexity and level of detail in a stepwise manner. The schemes and the duplication modes they distinguish are:

| Scheme   | Duplication modes           |
|----------|-----------------------------|
| binary   | SD, SSD                     |
| standard | SD, TD, PD, DD              |
| extended | SD, TD, PD, TRD, DD         |
| full     | SD, TD, PD, rTRD, dTRD, DD  |

Legend: SD, segmental duplication. SSD, small-scale duplication. TD, tandem duplication. PD, proximal duplication. TRD, transposon-derived duplication. rTRD, retrotransposon-derived duplication. dTRD, DNA transposon-derived duplication. DD, dispersed duplication.

Besides classifying gene pairs, users can also classify genes, so that each gene is assigned to a unique mode of duplication.
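To illustrate what assigning each gene to a unique mode can look like, here is a minimal base R sketch. It does not call doubletrouble; the column names (`dup1`, `dup2`, `type`), the toy gene IDs, and the priority order SD > TD > PD > DD are assumptions made for this example and may differ from what the package does.

```r
# Toy pair-level classification (hypothetical gene IDs and modes)
pairs <- data.frame(
    dup1 = c("g1", "g1", "g3", "g4"),
    dup2 = c("g2", "g3", "g5", "g5"),
    type = c("SD", "DD", "TD", "DD"),
    stringsAsFactors = FALSE
)

# Assumed priority order: modes listed first win when a gene has several
priority <- c("SD", "TD", "PD", "DD")

# Stack both members of each pair into a long gene-by-mode table
long <- data.frame(
    gene = c(pairs$dup1, pairs$dup2),
    type = factor(c(pairs$type, pairs$type), levels = priority),
    stringsAsFactors = FALSE
)

# Sort by priority and keep the first (highest-priority) mode per gene
long <- long[order(long$type), ]
genes <- long[!duplicated(long$gene), ]
genes <- genes[order(genes$gene), ]
rownames(genes) <- NULL

genes
```

In this toy input, gene `g1` appears in both an SD pair and a DD pair, so the sketch keeps only the higher-priority label for it.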

Users can also calculate substitution rates per site (i.e., $K_a$, $K_s$, and their ratio $\frac{K_a}{K_s}$) for duplicate pairs, find peaks in $K_s$ distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on $K_s$ peaks.
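The idea of finding $K_s$ peaks with a GMM and grouping pairs by peak can be sketched in base R as follows. This is a toy illustration, not the package's implementation: the $K_s$ values are simulated, the number of components is fixed at two, and `fit_gmm2()` is a hypothetical helper written for this example.

```r
set.seed(123)

# Simulated Ks values from two hypothetical duplication bursts
ks <- c(rnorm(300, mean = 0.5, sd = 0.1), rnorm(200, mean = 1.5, sd = 0.2))
ks <- ks[ks > 0]

# Minimal EM for a two-component univariate Gaussian mixture
# (hypothetical helper, not a doubletrouble function)
fit_gmm2 <- function(x, iter = 200) {
    mu <- unname(quantile(x, c(0.25, 0.75)))  # crude starting means
    sigma <- rep(sd(x), 2)
    lambda <- c(0.5, 0.5)
    for (i in seq_len(iter)) {
        # E-step: posterior probability of each component per observation
        d <- cbind(
            lambda[1] * dnorm(x, mu[1], sigma[1]),
            lambda[2] * dnorm(x, mu[2], sigma[2])
        )
        post <- d / rowSums(d)
        # M-step: update mixing weights, means, and standard deviations
        n_k <- colSums(post)
        lambda <- n_k / length(x)
        mu <- colSums(post * x) / n_k
        sigma <- sqrt(c(
            sum(post[, 1] * (x - mu[1])^2),
            sum(post[, 2] * (x - mu[2])^2)
        ) / n_k)
    }
    list(mean = mu, sd = sigma, lambda = lambda, posterior = post)
}

fit <- fit_gmm2(ks)

# Assign each pair to the peak (age group) with the highest posterior
age_group <- apply(fit$posterior, 1, which.max)

fit$mean
table(age_group)
```

Each fitted component mean plays the role of a $K_s$ peak, and the posterior probabilities give a simple rule for placing a gene pair in one age group. Real $K_s$ distributions are often skewed and saturate at high values, so a dedicated mixture-model implementation with model selection is a better fit for actual data.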

Installation instructions

Get the latest stable R release from CRAN. Then install doubletrouble from Bioconductor using the following code:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("doubletrouble")

To install the development version from GitHub, use:

BiocManager::install("almeidasilvaf/doubletrouble")

Citation

Below is the citation output from using citation('doubletrouble') in R. Please run this yourself to check for any updates on how to cite doubletrouble.

print(citation('doubletrouble'), bibtex = TRUE)
#> To cite package 'doubletrouble' in publications use:
#> 
#>   Almeida-Silva F, Van de Peer Y (2022). _doubletrouble: Identification
#>   and classification of duplicated genes_. R package version 1.3.0,
#>   <https://github.com/almeidasilvaf/doubletrouble>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {doubletrouble: Identification and classification of duplicated genes},
#>     author = {Fabrício Almeida-Silva and Yves {Van de Peer}},
#>     year = {2022},
#>     note = {R package version 1.3.0},
#>     url = {https://github.com/almeidasilvaf/doubletrouble},
#>   }

Please note that doubletrouble was only made possible thanks to many other R and bioinformatics software authors, who are cited in the vignettes and/or the paper(s) describing this package.

Code of Conduct

Please note that the doubletrouble project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Development tools

For more details, check the dev directory.

This package was developed using biocthis.