vignettes "Pairwise Sequence Alignments" #3

Open
haruosuz opened this issue May 7, 2021 · 0 comments

haruosuz commented May 7, 2021

I found the following issues in the vignette "Pairwise Sequence Alignments":
https://bioconductor.org/packages/release/bioc/vignettes/Biostrings/inst/doc/PairwiseAlignments.pdf

4 Pairwise Sequence Alignment Classes

"If scoreOnly is FALSE, the pairwise alignment with the maximum alignment score is returned."

  • In my environment, the pairwise alignment with the maximum alignment score ("precede", score -25.1171) was not returned; the first alignment ("succeed", score -33.99738) was returned instead:
pa1 <- pairwiseAlignment(pattern = c("succeed", "precede"), subject = "supersede")

> pa1
Global PairwiseAlignmentsSingleSubject (1 of 2)
pattern: succ--eed
subject: supersede
score: -33.99738 

writePairwiseAlignments(pa1)

# Length: 9
# Identity:       3/9 (33.3%)
# Similarity:    NA/9 (NA%)
# Gaps:           2/9 (22.2%)
# Score: -33.99738

# Length: 9
# Identity:       4/9 (44.4%)
# Similarity:    NA/9 (NA%)
# Gaps:           2/9 (22.2%)
# Score: -25.1171
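
As a rough sketch of a workaround, the best-scoring alignment can be picked out by hand. The printed object reports "(1 of 2)", so both alignments appear to be kept. The first part below uses only the two scores reported above; the second part assumes that score() and `[` work on this object as documented, and runs only if Biostrings is installed.

```r
# Scores as reported above for the two patterns against "supersede".
scores <- c(succeed = -33.99738, precede = -25.1171)

# Position and name of the best-scoring pattern.
best <- which.max(scores)
print(names(best))

# Same idea on the alignment object itself (assumption: score() returns one
# score per pattern and `[` subsets the alignments, as documented).
if (requireNamespace("Biostrings", quietly = TRUE)) {
  library(Biostrings)
  pa1 <- pairwiseAlignment(pattern = c("succeed", "precede"),
                           subject = "supersede")
  print(pa1[which.max(score(pa1))])
}
```

If this matches the intended behavior, the vignette sentence could say that one alignment per pattern is returned rather than only the maximum-scoring one.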

Are the definitions of "Identity" and "Similarity" given anywhere in the documentation?

"and the pairwiseAlignmentSummary function holds the results of a summarized pairwise sequence alignment."

  • Should pairwiseAlignmentSummary be summary? No such function is documented:
> ?pairwiseAlignmentSummary
No documentation for ‘pairwiseAlignmentSummary’ in specified packages and libraries:
you could try ‘??pairwiseAlignmentSummary’

> summary(pa1)
Global Single Subject Pairwise Alignments
Number of Alignments:  2

Scores:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -34.00  -31.78  -29.56  -29.56  -27.34  -25.12 

# the above scores are the same as the following scores:

> summary( c(-34.00, -25.12) )
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -34.00  -31.78  -29.56  -29.56  -27.34  -25.12 

  • patternQuality=PhredQuality(22L) and subjectQuality=PhredQuality(22L) are used by default (if substitutionMatrix=NULL). It would help if the vignette stated this.
> PhredQuality(22L)
PhredQuality object of length 1:
    width seq
[1]     1 7
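
For readers wondering why the quality prints as "7": a minimal base R sketch, assuming the usual Phred+33 encoding and the standard Phred definition of error probability.

```r
q <- 22L

# Phred+33 encoding: the quality character has ASCII code q + 33.
quality_char <- intToUtf8(q + 33L)
print(quality_char)

# Error probability implied by a Phred score: 10^(-q / 10).
p_error <- 10^(-q / 10)
print(p_error)
```

So the default corresponds to a per-base error probability of roughly 0.6%.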

example(matchPattern)
printed the following message:

Warning message:
In .local(x, ...) :
  as.matrix() on an XStringViews object 'x' has changed behavior: now the
  views in 'x' must be of equal width and each view is converted into a row of
  single characters. To achieve the old behavior, do 'as.matrix(ranges(x))'.
  To supress this warning, do 'suppressWarnings(as.matrix(x))'.
  This warning will be removed in BioC 2.12.

5 Pairwise Sequence Alignment Helper Functions

"Tables 1, 1 and 3 show functions"

should be

"Tables 1, 2 and 3 show functions"

"The score, nedit, nmatch, nmismatch, and nchar functions return numeric vectors containing information on the pairwise sequence alignment score, number of matches, number of mismatches, and number of aligned characters respectively."

The sentence lists five functions but only four descriptions; a description for nedit (e.g. "the Levenshtein edit distance of the alignments") should be added.
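
To illustrate what such a description could cover, here is a base R sketch that counts matches and edits directly from the aligned strings printed in section 4. It assumes that an edit is any aligned position where the two strings differ (mismatch or gap); whether nedit is defined exactly this way is something I have not confirmed.

```r
# Aligned strings as printed for the first alignment above.
pattern_aln <- "succ--eed"
subject_aln <- "supersede"

p <- strsplit(pattern_aln, "")[[1]]
s <- strsplit(subject_aln, "")[[1]]

n_match <- sum(p == s)               # identical positions
n_gap   <- sum(p == "-" | s == "-")  # gap positions
n_edit  <- sum(p != s)               # mismatches plus gaps (assumed definition)

print(c(match = n_match, gap = n_gap, edit = n_edit))
```

The match and gap counts agree with the "Identity: 3/9" and "Gaps: 2/9" lines reported by writePairwiseAlignments for this alignment.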

12 Exercise Answers

12.1 Exercise 1

"2. Do any of the alignments change if the gapExtension argument is set to -Inf? Yes, the overlap pairwise sequence alignment changes."

  • In my environment, the overlap pairwise sequence alignment did not change:
> pairwiseAlignment("zyzzyx", "syzygy", type = "overlap")
Overlap PairwiseAlignmentsSingleSubject (1 of 1)
pattern: [1] 
subject: [7] 
score: 0 
> pairwiseAlignment("zyzzyx", "syzygy", type = "overlap", gapExtension = Inf)
Overlap PairwiseAlignmentsSingleSubject (1 of 1)
pattern: [1] 
subject: [7] 
score: 0 
  • My environment is as follows:
> sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
[1] Biostrings_2.58.0   XVector_0.30.0      IRanges_2.24.1      S4Vectors_0.28.1   
[5] BiocGenerics_0.36.1

loaded via a namespace (and not attached):
[1] zlibbioc_1.36.0 compiler_4.0.5  tools_4.0.5     rstudioapi_0.13 crayon_1.4.1   

@hpages hpages transferred this issue from Bioconductor/Biostrings Mar 23, 2024