Enhance the FORMAT column of merge subcommand. #58

Open
xiucz opened this issue Sep 21, 2022 · 2 comments

xiucz commented Sep 21, 2022

Hi,
I think people often look for SV merge tools. After testing many of them, such as SURVIVOR, svimmer, SVanalyzer, svtools, truvari, and bcftools, I find SVDB to be the best tool for merging SV VCF files from different SV callers! It uses a set and priority strategy to combine SV events, similar to CombineVariants (which is suited to small variants).

When I merge three VCF files from manta, lumpy and svaba, I find that the FORMAT column of the resulting VCF contains only two fields:

#svdb merged file
chr13	32913545	MantaDEL:1:279:280:0:0:0:manta|5:lumpy	T	<DEL>	.	MaxDepth	END=32918007;SVTYPE=DEL;SVLEN=-4462;CIPOS=0,2;CIEND=0,2;HOMLEN=2;HOMSEQ=CA;STRANDS=+-:436;CIPOS95=0,0;CIEND95=0,0;SU=436;PE=0;SR=436;VARID=5:lumpy;set=filterInmanta-lumpy;FOUNDBY=2;manta_CHROM=MantaDEL_1_279_280_0_0_0|chr13;lumpy_CHROM=5|chr13;manta_POS=MantaDEL_1_279_280_0_0_0|32913545;lumpy_POS=5|32913547;manta_QUAL=MantaDEL_1_279_280_0_0_0|.;lumpy_QUAL=5|0.00;manta_FILTERS=MantaDEL_1_279_280_0_0_0|MaxDepth;lumpy_FILTERS=5|.;manta_SAMPLE=MantaDEL_1_279_280_0_0_0|QYQ_zuzhi|PR:0:0|SR:0:0;lumpy_SAMPLE=5|QYQ_zuzhi|GT:./.|SU:436|PE:0|SR:436|GQ:.|SQ:.|GL:.|DP:0|RO:0|AO:0|QR:0|QA:0|RS:0|AS:0|ASC:0|RP:0|AP:0|AB:.;manta_INFO=MantaDEL_1_279_280_0_0_0|END:32918007|SVTYPE:DEL|SVLEN:-4462|CIPOS:0:2|CIEND:0:2|HOMLEN:2|HOMSEQ:CA;lumpy_INFO=5|SVTYPE:DEL|SVLEN:-4463|END:32918010|CIPOS:0:0|CIEND:0:0|CIPOS95:0:0|CIEND95:0:0|SU:436|PE:0|SR:436;svdb_origin=manta|lumpy	PR:SR	.,.:.,.	0,0:0,0

#lumpy raw file
chr13	32913547	5	N	<DEL>	0.00	.	SVTYPE=DEL;SVLEN=-4463;END=32918010;STRANDS=+-:436;CIPOS=0,0;CIEND=0,0;CIPOS95=0,0;CIEND95=0,0;SU=436;PE=0;SR=436	GT:SU:PE:SR:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB	./.:436:0:436:.:.:.:0:0:0:0:0:0:0:0:0:0:.

#manta raw file
chr13	32913545	MantaDEL:1:279:280:0:0:0	T	<DEL>	.	MaxDepth	END=32918007;SVTYPE=DEL;SVLEN=-4462;CIPOS=0,2;CIEND=0,2;HOMLEN=2;HOMSEQ=CA	PR:SR	0,0:0,0

SVDB merged the FORMAT field, but the result is trimmed for some reason:

PR:SR	.,.:.,.	0,0:0,0

I think

  1. the keys of the FORMAT field should include the tags from all callers;
  2. the number of sample columns should equal the number of callers (here, 3 callers, so 3 columns).
    If an SV event is called by 3 callers, the FORMAT column should contain all the information from the 3 callers, as the INFO field does.
    An example:
GT:SU:PE:SR:GQ:SQ:GL:DP:RO:AO:QR:QA:RS:AS:ASC:RP:AP:AB:PR:SR ./.:436:0:436:.:.:.:0:0:0:0:0:0:0:0:0:0:.:.:. .:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:0,0:0,0 .:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.:.,.:.,.
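
The requested behaviour can be sketched in Python. This is only an illustration of the idea, not SVDB's code: `merge_format` is a hypothetical helper, and it assumes each caller contributes one FORMAT string and one sample string, with `None` standing for a caller that did not report the event. Unlike the example above, which lists SR twice, this sketch emits a key shared by two callers only once; that is a design choice of the sketch.

```python
def merge_format(per_caller):
    """Merge FORMAT data from several callers (hypothetical helper).

    per_caller: list with one entry per caller, in output column order.
    Each entry is a (format_keys, sample_values) tuple of colon-separated
    strings, or None if that caller did not report the event.
    Returns (merged_format_keys, list_of_sample_columns).
    """
    # Union of all keys, preserving first-seen order.
    merged_keys = []
    for entry in per_caller:
        if entry is None:
            continue
        keys, _ = entry
        for key in keys.split(":"):
            if key not in merged_keys:
                merged_keys.append(key)

    # One column per caller; keys a caller lacks are filled with ".".
    columns = []
    for entry in per_caller:
        lookup = {}
        if entry is not None:
            keys, values = entry
            lookup = dict(zip(keys.split(":"), values.split(":")))
        columns.append(":".join(lookup.get(key, ".") for key in merged_keys))

    return ":".join(merged_keys), columns


# Shortened lumpy and manta FORMAT data, plus a caller with no call.
fmt, cols = merge_format([
    ("GT:SU", "./.:436"),
    ("PR:SR", "0,0:0,0"),
    None,
])
print(fmt, *cols, sep="\t")
```

A key that two callers define differently would probably need a caller prefix in a real implementation; the sketch does not attempt that.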

Here is the command line:

#version SVDB-2.6.4
~/bin/svdb --merge --vcf tumorSV.vcf:manta lumpy.gt.vcf:lumpy svaba.unfiltered.sv.vcf:svaba  --priority svaba,manta,lumpy > svdb.3.vcf
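
Judging from the merged record above, the per-caller sample values still seem to be present in the INFO column, in the `manta_SAMPLE` and `lumpy_SAMPLE` entries. A minimal Python sketch for reading them back follows. The layout it assumes (`variant ID|sample name|KEY:value|...`) is inferred from that single record, not from SVDB documentation, and `parse_caller_samples` is a hypothetical helper name.

```python
def parse_caller_samples(info):
    """Extract <caller>_SAMPLE entries from a merged INFO string.

    Assumed layout (inferred from one record): each entry looks like
    <caller>_SAMPLE=<variant id>|<sample name>|KEY:value|KEY:value...
    Multi-value entries appear with ':' instead of ',' (PR:0:0 for 0,0).
    """
    result = {}
    for item in info.split(";"):
        name, sep, value = item.partition("=")
        if not sep or not name.endswith("_SAMPLE"):
            continue
        parts = value.split("|")
        if len(parts) < 2:
            continue  # does not match the assumed layout
        variant_id, sample = parts[0], parts[1]
        # Split each pair on the first ':' only, so "PR:0:0" maps PR to "0:0".
        fields = dict(pair.split(":", 1) for pair in parts[2:] if ":" in pair)
        caller = name[: -len("_SAMPLE")]
        result[caller] = {"id": variant_id, "sample": sample, "format": fields}
    return result


info = (
    "SVTYPE=DEL;"
    "manta_SAMPLE=MantaDEL_1_279_280_0_0_0|QYQ_zuzhi|PR:0:0|SR:0:0;"
    "lumpy_SAMPLE=5|QYQ_zuzhi|GT:./.|SU:436|PE:0|SR:436"
)
for caller, data in parse_caller_samples(info).items():
    print(caller, data["sample"], data["format"])
```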

Best,
xiucz

J35P312 (Owner) commented Sep 22, 2022

Hello!
Thanks, I'm happy to hear that!

I agree, everything should be transferred into the FORMAT column! It might be a bug in SVDB, triggered because the manta calls lack the GT field (SVDB is GT "centric" in some way).

I will have a look!

Best regards
Jesper

aksenia commented Feb 26, 2024

Hi @J35P312 ,

We also love your package! Any progress on this issue? I agree that it would be very useful to retain the FORMAT field values from all the VCF files being merged. Is this hard to do?
