[query] Set QUAL to missing when it's "." (#14697)
When we switched from using htsjdk to do most of our VCF parsing to
using our own parser, we kept the behavior of setting a missing QUAL (".")
to a sentinel value, -10.0. This isn't necessary (or correct) for Hail to
do. The type of QUAL is optional float64, so we should preserve the
missing value.

Technically speaking, this is a backwards-incompatible change.
However, I think the old behavior was a bug that came from my past
attempt to keep bug-for-bug compatibility with htsjdk rather than
parsing the value into what people would expect.
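
To make the behavior change concrete, here is a minimal plain-Python sketch of the QUAL parsing before and after this commit. It is an illustration only, not Hail code: the function names and `LEGACY_SENTINEL` are hypothetical, and `None` stands in for Hail's missing value.

```python
from typing import Optional

# Value the old parser emitted when QUAL was "." (taken from the diff).
LEGACY_SENTINEL = -10.0


def parse_qual_legacy(qstr: str) -> float:
    """Old behavior: a "." QUAL becomes the sentinel -10.0."""
    return LEGACY_SENTINEL if qstr == "." else float(qstr)


def parse_qual(qstr: str) -> Optional[float]:
    """New behavior: a "." QUAL becomes missing (None in this sketch)."""
    return None if qstr == "." else float(qstr)


def with_legacy_sentinel(qual: Optional[float]) -> float:
    """Hypothetical shim for downstream code that still expects -10.0."""
    return LEGACY_SENTINEL if qual is None else qual
```

Pipelines that filtered on `qual == -10.0` would need to test for missingness instead. In Hail itself, something like `mt.annotate_rows(qual=hl.coalesce(mt.qual, -10.0))` should restore the old values, though that line is a suggestion and is not part of this commit.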
chrisvittal authored Sep 20, 2024
1 parent 9b8e322 commit 13a9aee
Showing 52 changed files with 36 additions and 12 deletions.
14 changes: 7 additions & 7 deletions hail/python/hail/methods/impex.py
@@ -2883,13 +2883,13 @@ def import_vcf(
 >>> ds = hl.import_vcf('data/missing-values-in-array-fields.vcf', array_elements_required=False)
 >>> ds.show(n_rows=1, n_cols=1, include_row_fields=True)
-+---------------+------------+------+-----------+----------+--------------+
-| locus | alleles | rsid | qual | filters | info.A |
-+---------------+------------+------+-----------+----------+--------------+
-| locus<GRCh37> | array<str> | str | float64 | set<str> | array<int32> |
-+---------------+------------+------+-----------+----------+--------------+
-| 1:123456 | ["A","C"] | NA | -1.00e+01 | NA | [1,NA] |
-+---------------+------------+------+-----------+----------+--------------+
++---------------+------------+------+---------+----------+--------------+
+| locus | alleles | rsid | qual | filters | info.A |
++---------------+------------+------+---------+----------+--------------+
+| locus<GRCh37> | array<str> | str | float64 | set<str> | array<int32> |
++---------------+------------+------+---------+----------+--------------+
+| 1:123456 | ["A","C"] | NA | NA | NA | [1,NA] |
++---------------+------------+------+---------+----------+--------------+
 <BLANKLINE>
 +------------------+----------------+----------------+--------------+
 | info.B | info.C | info.D | 'SAMPLE1'.GT |
2 changes: 1 addition & 1 deletion hail/src/main/scala/is/hail/io/vcf/LoadVCF.scala
@@ -1618,7 +1618,7 @@ object LoadVCF {
     if (c.hasQual) {
       val qstr = l.parseString()
       if (qstr == ".")
-        rvb.addDouble(-10.0)
+        rvb.setMissing()
       else
         rvb.addDouble(qstr.toDouble)
     } else
4 changes: 2 additions & 2 deletions hail/src/test/resources/ex.vcf.mt/README.txt
@@ -1,3 +1,3 @@
 This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
-Written with version 0.2.12-4a66ad88153e
-Created at 2019/04/11 16:23:26
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:36:43
3 changes: 3 additions & 0 deletions hail/src/test/resources/ex.vcf.mt/cols/README.txt
@@ -0,0 +1,3 @@
+This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:36:43
Binary file modified hail/src/test/resources/ex.vcf.mt/cols/metadata.json.gz
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/cols/rows/metadata.json.gz
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/cols/rows/parts/part-0
Binary file not shown.
3 changes: 3 additions & 0 deletions hail/src/test/resources/ex.vcf.mt/entries/README.txt
@@ -0,0 +1,3 @@
+This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:36:44
Binary file modified hail/src/test/resources/ex.vcf.mt/entries/metadata.json.gz
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/entries/rows/metadata.json.gz
Binary file not shown.
Binary file not shown.
Binary file not shown.
3 changes: 3 additions & 0 deletions hail/src/test/resources/ex.vcf.mt/globals/README.txt
@@ -0,0 +1,3 @@
+This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:36:43
Binary file modified hail/src/test/resources/ex.vcf.mt/globals/globals/metadata.json.gz
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/globals/globals/parts/part-0
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/globals/metadata.json.gz
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/globals/rows/metadata.json.gz
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/globals/rows/parts/part-0
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/metadata.json.gz
Binary file not shown.
3 changes: 3 additions & 0 deletions hail/src/test/resources/ex.vcf.mt/rows/README.txt
@@ -0,0 +1,3 @@
+This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:36:44
Binary file modified hail/src/test/resources/ex.vcf.mt/rows/metadata.json.gz
Binary file not shown.
Binary file modified hail/src/test/resources/ex.vcf.mt/rows/rows/metadata.json.gz
Binary file not shown.
Binary file not shown.
Binary file not shown.
4 changes: 2 additions & 2 deletions hail/src/test/resources/gvcfs/HG00096.g.vcf.gz.mt/README.txt
@@ -1,3 +1,3 @@
 This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
-Written with version 0.2.12-4a66ad88153e
-Created at 2019/04/11 13:36:42
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:37:08
@@ -0,0 +1,3 @@
+This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:37:08
Binary file not shown.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,3 @@
+This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:37:08
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,3 @@
+This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:37:08
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,3 @@
+This folder comprises a Hail (www.hail.is) native Table or MatrixTable.
+Written with version 0.2.132-8889c53db85c
+Created at 2024/09/20 14:37:08
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
