Dev #61

Open
wants to merge 8 commits into
base: dev

lizhencmb

Hi Arthur,

I am going through the wgd code as a way to learn a bit more Python. It is fun actually :-) I have made some changes so that V2 produces output files similar to those of V1. I've seen that you've put the weighting in the visualization part, but I think it would still make sense to include the weights in the ksd Ks table, so that others can draw the distributions on their own (if they want to).
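To illustrate what storing weights in the Ks table could look like, here is a minimal sketch. It assumes the weighting meant here is node weighting, where each pairwise Ks estimate is weighted by one over the number of pairs that share the same duplication node in the same gene family. The function name `add_node_weights` and the keys `family`, `node`, `ks` and `weight` are hypothetical and are not taken from the wgd code.

```python
from collections import Counter


def add_node_weights(rows):
    """Return copies of the rows with an extra 'weight' entry.

    Each row is a dict describing one gene pair. The weight is
    1 / (number of pairs sharing the same family and node), so that
    every duplication node contributes a total weight of one.
    The key names are illustrative assumptions, not wgd column names.
    """
    counts = Counter((row["family"], row["node"]) for row in rows)
    return [
        dict(row, weight=1.0 / counts[(row["family"], row["node"])])
        for row in rows
    ]


# Hypothetical example table: two pairs share node 1, one pair has node 2.
ks_table = [
    {"family": "GF1", "node": 1, "ks": 0.5},
    {"family": "GF1", "node": 1, "ks": 0.6},
    {"family": "GF1", "node": 2, "ks": 1.2},
]
weighted = add_node_weights(ks_table)
```

With the weights stored next to the Ks values, a weighted histogram can be drawn with any plotting tool without rerunning the analysis.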

You can see I've also added a function to strip the alignment, with a parameter to leave some gaps. I was thinking that codeml could deal with some gaps in its pairwise mode with cleandata=0. However, after some tests, that does not really seem to be the case, so the function is currently only used to remove all the gaps.
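A minimal sketch of such a stripping function, for illustration only. This is not the function from this pull request: the name `strip_gap_columns` and the parameters `max_gap_fraction` and `unit` are assumptions. Setting `unit=3` treats the alignment as codons, so whole triplets are kept or dropped together.

```python
def strip_gap_columns(alignment, max_gap_fraction=0.0, unit=1):
    """Drop alignment blocks that contain too many gaps.

    alignment: dict mapping sequence name to aligned sequence (equal lengths).
    max_gap_fraction: a block is kept if the fraction of sequences with a
        gap ('-') in that block is at most this value; 0.0 removes every
        block that has any gap.
    unit: block width in characters (1 for columns, 3 for codons).
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all sequences must have the same aligned length")
    if length % unit != 0:
        raise ValueError("alignment length must be a multiple of unit")

    kept_starts = []
    for start in range(0, length, unit):
        # Count the sequences that have at least one gap in this block.
        gapped = sum("-" in seq[start:start + unit] for seq in seqs)
        if gapped / len(seqs) <= max_gap_fraction:
            kept_starts.append(start)

    return {
        name: "".join(seq[s:s + unit] for s in kept_starts)
        for name, seq in zip(names, seqs)
    }


# Hypothetical codon alignment: the middle codon is gapped in two sequences.
aln = {"a": "ATG---AAA", "b": "ATGCCCAAA", "c": "ATGCC-AAA"}
no_gaps = strip_gap_columns(aln, max_gap_fraction=0.0, unit=3)
some_gaps = strip_gap_columns(aln, max_gap_fraction=0.7, unit=3)
```

In this example `no_gaps` keeps only the first and last codon of each sequence, while `some_gaps` keeps the alignment unchanged because two gapped sequences out of three is below the 0.7 threshold.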

Best,
Zhen

@arzwa
Owner

arzwa commented Apr 19, 2021

Thanks Zhen, nice to see someone helping out! Concerning the cleandata thing in PAML, see this reported bug. I had indeed changed some output file formats, but I agree it may be better to keep them compatible with earlier versions.

It seems the tests are failing because

  1. the alignment length has changed due to the gap trimming you introduced, so I think we should just update the tests to cover both cases, with and without trimming.
  2. something related to your last commit concerning the diamond output, which I don't see immediately.

So if we update the tests, I can merge this in.

@lizhencmb
Author

Hi Arthur, the failed tests were due to replacing gene ids in the multi-species diamond search and to the alignment trimming. I see that we trim sequence alignments twice (a bit redundant) and that the tests only consider the first trimming. I did not change the tests for now, but will modify them a little bit later, e.g. maybe for gene tree inference we can tolerate some gaps.

For the commit about diamond, I just added a diamond output file in the output folder (in wgd_dmd by default).

2 participants