New --ligate-force and --ligate-warn options
for finer control of the `-l, --ligate` behavior in
imperfect overlaps. The new default is to throw an error
when sites present in one chunk but absent in the other
are encountered. To drop such sites and proceed, use
the new `--ligate-warn` option (previously this was the
default). To keep such sites, use the new `--ligate-force`
option.

Resolves #1567
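
As a usage illustration of the three behaviors described in the message above, the following Python sketch builds the corresponding `bcftools concat` command lines. The helper name `ligate_cmd`, the mode labels (`strict`, `warn`, `force`) and the file names are hypothetical and not part of this commit; only `-l`, `--ligate-warn` and `--ligate-force` come from the commit itself, and `-o` is assumed to be the usual output-file option of `bcftools concat`.

```python
import shlex

# Hypothetical mode labels mapped to the extra flag each one needs.
# "strict" stands for the new default: plain -l, which is described
# in the commit message as throwing an error on imperfect overlaps.
LIGATE_FLAGS = {
    "strict": [],
    "warn": ["--ligate-warn"],    # drop the mismatching sites and proceed
    "force": ["--ligate-force"],  # keep the mismatching sites
}


def ligate_cmd(inputs, output, mode="strict"):
    """Return the argv list for a `bcftools concat -l` run in the given mode."""
    if mode not in LIGATE_FLAGS:
        raise ValueError(f"unknown ligate mode: {mode!r}")
    return ["bcftools", "concat", "-l", *LIGATE_FLAGS[mode], "-o", output, *inputs]


if __name__ == "__main__":
    chunks = ["chunk1.vcf.gz", "chunk2.vcf.gz"]  # placeholder file names
    for mode in LIGATE_FLAGS:
        # Print a copy-pasteable shell command for each mode.
        print(shlex.join(ligate_cmd(chunks, "ligated.vcf", mode)))
```

The returned list can be passed to `subprocess.run(..., check=True)`; with the new default, a failing exit status would then surface as an exception rather than as silently dropped sites.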
pd3 committed Aug 28, 2021
1 parent e3ba077 commit ef5cf0a
Showing 9 changed files with 163 additions and 51 deletions.
14 changes: 14 additions & 0 deletions NEWS
@@ -33,6 +33,20 @@
 information carried in the missing and half-missing genotypes
 (e.g. ".", "./." or "./1")
 
+* bcftools concat:
+
+- new `--ligate-force` and `--ligate-warn` options for finer control
+of `-l, --ligate` behavior in imperfect overlaps. The new default is
+to throw an error when sites present in one chunk but absent in the
+other are encountered. To drop such sites and proceed, use the new
+`--ligate-warn` option (previously this was the default). To keep such
+sites, use the new `--ligate-force` option (#1567).
+
+* bcftools +contrast:
+
+- support for chunking within map/reduce framework allowing to collect
+NASSOC counts even for empty case/control sample sets (#1566)
+
 * bcftools csq:
 
 - bug fix, compound indels were not recognised in some cases (#1536)
8 changes: 7 additions & 1 deletion doc/bcftools.txt
@@ -787,7 +787,13 @@ are concatenated without being recompressed, which is very fast..
 *-l, --ligate*::
 Ligate phased VCFs by matching phase at overlapping haplotypes.
 Note that the option is intended for VCFs with perfect overlap, sites
-in overlapping regions present in one but missing in other are dropped.
+in overlapping regions present in one but missing in the other are dropped.
+
+*--ligate-force*::
+Keep all sites and ligate even non-overlapping chunks and chunks with imperfect overlap
+
+*--ligate-warn*::
+Drop sites in imperfect overlaps
 
 *--no-version*::
 see *<<common_options,Common Options>>*
10 changes: 10 additions & 0 deletions test/concat.5.1.out
@@ -0,0 +1,10 @@
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=chr1>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=PQ,Number=1,Type=Integer,Description="Phasing Quality (bigger is better)">
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase Set">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr1 1 . A C . . . GT:PS 0|1:1
chr1 3 . G T . . . GT:PS 0|1:1
chr1 4 . T A . . . GT:PS 1|0:1
11 changes: 11 additions & 0 deletions test/concat.5.2.out
@@ -0,0 +1,11 @@
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=chr1>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=PQ,Number=1,Type=Integer,Description="Phasing Quality (bigger is better)">
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase Set">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr1 1 . A C . . . GT:PS 0|1:1
chr1 2 . C G . . . GT:PS 1|0:1
chr1 3 . G T . . . GT:PQ:PS 0|1:99:1
chr1 4 . T A . . . GT:PS 1|0:1
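
To make the difference between the two expected outputs explicit, here is a small Python sketch that compares the sites in the record lines of `test/concat.5.1.out` (`--ligate-warn`) and `test/concat.5.2.out` (`--ligate-force`), copied from the listings above. The helper name `sites` is illustrative only, and the parsing is deliberately minimal: it splits on any whitespace, so it works both on real tab-delimited VCF lines and on the space-separated rendering shown here.

```python
# Record lines copied from the two expected outputs above (headers omitted).
WARN_OUT = """\
chr1 1 . A C . . . GT:PS 0|1:1
chr1 3 . G T . . . GT:PS 0|1:1
chr1 4 . T A . . . GT:PS 1|0:1
"""

FORCE_OUT = """\
chr1 1 . A C . . . GT:PS 0|1:1
chr1 2 . C G . . . GT:PS 1|0:1
chr1 3 . G T . . . GT:PQ:PS 0|1:99:1
chr1 4 . T A . . . GT:PS 1|0:1
"""


def sites(vcf_body):
    """Return the set of (CHROM, POS) pairs found in VCF record lines."""
    found = set()
    for line in vcf_body.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos = line.split()[:2]
        found.add((chrom, int(pos)))
    return found


# Sites kept by --ligate-force but dropped by --ligate-warn.
only_in_force = sorted(sites(FORCE_OUT) - sites(WARN_OUT))
print(only_in_force)  # [('chr1', 2)]
```

Position 2 appears only in `test/concat.5.b.vcf` among the three inputs, which is why it is absent from the `--ligate-warn` output and present in the `--ligate-force` output.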
5 changes: 5 additions & 0 deletions test/concat.5.a.vcf
@@ -0,0 +1,5 @@
##fileformat=VCFv4.2
##contig=<ID=chr1>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr1 1 . A C . . . GT 0|1
6 changes: 6 additions & 0 deletions test/concat.5.b.vcf
@@ -0,0 +1,6 @@
##fileformat=VCFv4.2
##contig=<ID=chr1>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr1 2 . C G . . . GT 1|0
chr1 3 . G T . . . GT 0|1
6 changes: 6 additions & 0 deletions test/concat.5.c.vcf
@@ -0,0 +1,6 @@
##fileformat=VCFv4.2
##contig=<ID=chr1>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chr1 3 . G T . . . GT 1|0
chr1 4 . T A . . . GT 0|1
7 changes: 5 additions & 2 deletions test/test.pl
@@ -601,11 +601,14 @@
 test_vcf_concat($opts,in=>['concat.2.a','concat.2.b'],out=>'concat.2.bcf.out',do_bcf=>1,args=>'-a');
 test_vcf_concat($opts,in=>['concat.2.a','concat.2.b'],out=>'concat.4.vcf.out',do_bcf=>0,args=>'-aD');
 test_vcf_concat($opts,in=>['concat.2.a','concat.2.b'],out=>'concat.4.bcf.out',do_bcf=>1,args=>'-aD');
-test_vcf_concat($opts,in=>['concat.3.a','concat.3.b','concat.3.0','concat.3.c','concat.3.d','concat.3.e','concat.3.f'],out=>'concat.3.vcf.out',do_bcf=>0,args=>'-l');
-test_vcf_concat($opts,in=>['concat.3.a','concat.3.b','concat.3.0','concat.3.c','concat.3.d','concat.3.e','concat.3.f'],out=>'concat.3.bcf.out',do_bcf=>1,args=>'-l');
+test_vcf_concat($opts,in=>['concat.3.a','concat.3.b','concat.3.0','concat.3.c','concat.3.d','concat.3.e','concat.3.f'],out=>'concat.3.vcf.out',do_bcf=>0,args=>'-l --ligate-warn');
+test_vcf_concat($opts,in=>['concat.3.a','concat.3.b','concat.3.0','concat.3.c','concat.3.d','concat.3.e','concat.3.f'],out=>'concat.3.bcf.out',do_bcf=>1,args=>'-l --ligate-warn');
 test_naive_concat($opts,name=>'naive_concat',max_hdr_lines=>10000,max_body_lines=>10000,nfiles=>10);
 test_vcf_concat($opts,in=>['concat.4.a','concat.4.b'],out=>'concat.5.out',do_bcf=>0,args=>'-l');
 test_vcf_concat($opts,in=>['concat.4.a','concat.4.b'],out=>'concat.5.out',do_bcf=>1,args=>'-l');
+test_vcf_concat($opts,in=>['concat.5.a','concat.5.b','concat.5.c'],out=>'concat.5.1.out',do_bcf=>0,args=>'-l --ligate-warn');
+test_vcf_concat($opts,in=>['concat.5.a','concat.5.b','concat.5.c'],out=>'concat.5.1.out',do_bcf=>1,args=>'-l --ligate-warn');
+test_vcf_concat($opts,in=>['concat.5.a','concat.5.b','concat.5.c'],out=>'concat.5.2.out',do_bcf=>1,args=>'-l --ligate-force');
 test_vcf_reheader($opts,in=>'reheader',out=>'reheader.1.out',header=>'reheader.hdr');
 test_vcf_reheader($opts,in=>'reheader',out=>'reheader.2.out',samples=>'reheader.samples');
 test_vcf_reheader($opts,in=>'reheader',out=>'reheader.2.out',samples=>'reheader.samples2');