TL;DR: Would it be possible to have pairwiseAlignment return the traceback matrix?
I am interested in doing global-local alignments between pattern string A and a set of subject strings B, C, D, etc. In this case, the subject strings are longer than the pattern string. I hope to find within the subject strings those that have significantly better alignment scores than we would expect by chance.
Furthermore, I want to adjust for how far from the start of the subject string a good alignment is found. This is biologically important in my use case. Because my pattern strings tend to be short, some hits are expected by chance: if we search for a 6-mer in a set of 100-mers, we are likely to find a few. However, a 6-mer found right at the start of a particular 100-mer would indicate something biologically significant in our problem.
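To put rough numbers on the 6-mer example, here is a back-of-the-envelope sketch in Python. It assumes exact matches only and uniformly random, independent DNA letters (alphabet size 4). Neither assumption comes from my real data, so treat the figures as illustrative only.

```python
# Assumptions (illustrative only): exact matching, i.i.d. uniform letters,
# DNA alphabet of size 4. Real sequences and scored alignments will differ.
alphabet_size = 4
k = 6      # pattern length (the 6-mer)
n = 100    # subject length (the 100-mer)

# Probability that the pattern matches at one fixed position,
# e.g. right at the start of the subject.
p_fixed_start = alphabet_size ** -k

# Number of positions where a k-mer can start inside an n-mer.
n_starts = n - k + 1

# Expected number of exact matches anywhere in one subject.
expected_hits = n_starts * p_fixed_start

print(f"P(match at position 1)      = {p_fixed_start:.6f}")
print(f"possible start positions    = {n_starts}")
print(f"expected matches per 100mer = {expected_hits:.4f}")
```

Under these assumptions a hit somewhere in the subject is about 95 times more likely than a hit at position 1, which is why the position of the hit carries information.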
To control for this, I iteratively run global-local alignments with pairwiseAlignment between pattern string A and every substring of each subject string that starts at the beginning [for instance, from substring(B, 1, nchar(A)) to substring(B, 1, nchar(B))]. We then use those alignment scores, which are controlled for distance from the starting point, to assess significance. A major problem with this approach is that we have to rerun the underlying alignment algorithm once per substring, which means many runs for each subject string.
However, if we had access to the traceback matrix currTraceMatrix in align_pairwiseAlignment.c from the pairwiseAlignment call, that would save me a tremendous amount of time.
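To illustrate why a single pass could replace the repeated calls, here is a minimal Python sketch of a global-local dynamic program. It is not the code in align_pairwiseAlignment.c: it uses a simple linear gap penalty and made-up scoring values, and the function names are hypothetical. It also reads the scores from the last row of the score matrix rather than from the traceback matrix itself, which I assume would be enough for the score-only part of my problem.

```python
def last_row_scores(pattern, subject, match=1, mismatch=-1, gap=-2):
    """Fill the DP matrix once and return its last row.

    Entry j is the best score for aligning ALL of `pattern` against some
    substring of `subject` that ends exactly at position j.
    Scoring values are illustrative, with a linear gap penalty.
    """
    n = len(subject)
    # Row 0 is all zeros: the alignment may start anywhere in the subject.
    prev = [0] * (n + 1)
    for i in range(1, len(pattern) + 1):
        # Column 0: a pattern prefix aligned against nothing costs i gaps.
        cur = [i * gap] + [0] * n
        for j in range(1, n + 1):
            s = match if pattern[i - 1] == subject[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + s,   # (mis)match
                         prev[j] + gap,     # pattern letter against a gap
                         cur[j - 1] + gap)  # subject letter against a gap
        prev = cur
    return prev


def prefix_scores(pattern, subject, **scoring):
    """Global-local score for every prefix subject[:k], k = 0..len(subject),
    obtained as a running maximum over the last row of ONE matrix fill."""
    scores, best = [], float("-inf")
    for value in last_row_scores(pattern, subject, **scoring):
        best = max(best, value)
        scores.append(best)
    return scores


def brute_force(pattern, subject, **scoring):
    """The current approach: realign against each prefix separately."""
    return [max(last_row_scores(pattern, subject[:k], **scoring))
            for k in range(len(subject) + 1)]


if __name__ == "__main__":
    a, b = "ACGT", "TTACGTAA"
    print(prefix_scores(a, b))
    print(prefix_scores(a, b) == brute_force(a, b))
```

The two functions agree because column j of the matrix depends only on columns up to j, so the matrix for a prefix of the subject is just the left part of the matrix for the full subject. With affine gaps the same argument should hold, but there would be more than one matrix to track.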
Would this be possible? I would imagine some folks in my field would be interested in having this feature.
Thank you in advance.
Best,
Peter
hpages transferred this issue from Bioconductor/Biostrings on Apr 4, 2024
I just moved this issue to the pwalign repository. Please note that starting with Bioconductor 3.19 (to be released in about a month), pairwiseAlignment() and all related functionalities currently found in Biostrings will be in the new pwalign package.
To answer your question: we have limited resources at the moment to implement the feature that you are requesting, but we would welcome a PR.