Is there a way to obtain the trace back matrix in align_pairwiseAlignment.c? #8

Open
peteryzheng opened this issue Apr 4, 2024 · 1 comment

Comments

@peteryzheng

Hi,

TL;DR: Would it be possible to have pairwiseAlignment return the traceback matrix?

I am interested in doing global-local alignments between a pattern string A and a set of subject strings B, C, D, etc., where the subject strings are longer than the pattern string. I want to identify the subject strings whose alignment scores are significantly better than we would expect by chance.

Furthermore, I want to account for how far from the start of the subject string a good alignment is found, which is biologically important in my use case. Because my pattern strings tend to be short, some hits are expected by chance: if we search for a 6-mer in a set of 100-mers, we will likely find a few. However, a 6-mer found right at the start of a 100-mer would indicate something biologically significant in our problem.
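To put rough numbers on the chance-hit argument, here is a back-of-the-envelope sketch in Python. It rests on assumptions that are not stated in this thread: a 4-letter DNA alphabet, independent and uniformly distributed bases, and exact matches only. Alignments that tolerate mismatches or gaps would produce more chance hits, so treat the values as an order-of-magnitude illustration. The function name `expected_exact_hits` is hypothetical.

```python
def expected_exact_hits(k, n, alphabet_size=4):
    """Expected number of exact occurrences of a fixed k-mer in a random n-mer.

    Assumes i.i.d. uniform letters (an illustrative assumption, not a claim
    about real sequence data). There are n - k + 1 possible start positions,
    each matching with probability alphabet_size ** -k.
    """
    return (n - k + 1) * alphabet_size ** (-k)


# Expected hits anywhere in one 100-mer (95 start positions).
anywhere = expected_exact_hits(6, 100)

# Expected hits at one specific position, e.g. the very start.
at_start = expected_exact_hits(6, 6)

print(f"anywhere: {anywhere:.5f}, at the start: {at_start:.6f}")
print(f"ratio: {anywhere / at_start:.0f}x")
```

Under these assumptions a chance hit somewhere in a 100-mer is 95 times more likely than a chance hit at the first position, which is why a score that ignores position overstates the significance of late hits.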

To control for this, I am iteratively running global-local alignments with pairwiseAlignment between pattern string A and every substring of each subject string that starts at its first position [for instance, from substring(B, 1, nchar(A)) to substring(B, 1, nchar(B))]. We then use these distance-controlled alignment scores to assess significance. A major problem with this approach is that the underlying alignment algorithm has to be run once per substring, which means many runs for every subject string.

However, if we had access to the traceback matrix currTraceMatrix in align_pairwiseAlignment.c from the pairwiseAlignment call, that would save me a tremendous amount of time.
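To illustrate why one dynamic-programming pass could replace the per-prefix runs, here is a minimal sketch in plain Python. It is not the code in align_pairwiseAlignment.c and does not use the pwalign API: the function name `prefix_scores`, the linear gap penalty, and the match/mismatch values are all assumptions made for illustration (pairwiseAlignment uses a substitution matrix and separate gap opening/extension penalties). The idea is that the last row of the score matrix already holds the best score for an alignment ending at each subject position, so a running maximum over that row gives the global-local score for every prefix.

```python
def prefix_scores(pattern, subject, match=1, mismatch=-1, gap=-2):
    """Global-local DP: the whole pattern against any substring of subject.

    Returns (last_row, best), where
      last_row[j] = best score of the full pattern aligned to a substring
                    of subject that ends exactly at position j
      best[j]     = best global-local score of the pattern against the
                    prefix subject[:j]
    Scoring values and the linear gap model are illustrative assumptions.
    """
    n = len(subject)
    # Row 0: an empty pattern may start anywhere in the subject for free.
    prev = [0] * (n + 1)
    for i in range(1, len(pattern) + 1):
        # Column 0: pattern[:i] aligned entirely against gaps.
        cur = [i * gap] + [0] * n
        for j in range(1, n + 1):
            s = match if pattern[i - 1] == subject[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + s,   # align the two letters
                         prev[j] + gap,     # pattern letter against a gap
                         cur[j - 1] + gap)  # subject letter against a gap
        prev = cur
    # A prefix may end its alignment at any position up to its own length,
    # so its score is the running maximum of the last row.
    best, running = [], float("-inf")
    for value in prev:
        running = max(running, value)
        best.append(running)
    return prev, best


last_row, best = prefix_scores("ACG", "TTACGTT")
print(best)
```

With this layout, `best[j]` for j from nchar(A) to nchar(B) corresponds to the scores that the loop over substring(B, 1, j) produces one alignment at a time, at the cost of a single pass. Recovering where each alignment starts would additionally need the traceback information, which this sketch does not keep.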

Would this be possible? I would imagine some folks in my field would be interested in having this feature.

Thank you in advance.

Best,
Peter

@hpages hpages transferred this issue from Bioconductor/Biostrings Apr 4, 2024
@hpages
Contributor

hpages commented Apr 4, 2024

Hi Peter,

I just moved this issue to the pwalign repository. Please note that starting with Bioconductor 3.19 (to be released in about a month), pairwiseAlignment() and all related functionalities currently found in Biostrings will be in the new pwalign package.

To answer your question: we have limited resources at the moment to implement the feature that you are requesting but we would welcome a PR.

Best,
H.
