inconsistent results from repeated calls to union!(s::BitSet, r::UnitRange{Int})
#45574
Comments
I can narrow it down a bit further with this script, which uses DataFrames:
using DataFrames
tup(bs::BitSet) = (; blen = length(bs.bits), offset = bs.offset, slen = length(bs))
ranges = [46644306:46644488, 46644343:46644488, 46648318:46648619, 46648458:46648538,]
function summarize(rng)
    bs = BitSet()
    result = [tup(bs)]
    for r in rng
        union!(bs, r)
        push!(result, tup(bs))
    end
    DataFrame(result)
end
summarize(ranges)
summarize(ranges)
If you run it, the two calls to summarize(ranges) return different results:
julia> summarize(ranges)
5×3 DataFrame
Row │ blen offset slen
│ Int64 Int64 Int64
─────┼────────────────────────────────────
1 │ 0 -1152921504606846976 0
2 │ 4 728817 183
3 │ 4 728817 183
4 │ 68 728817 1222
5 │ 68 728817 1222
julia> summarize(ranges)
5×3 DataFrame
Row │ blen offset slen
│ Int64 Int64 Int64
─────┼────────────────────────────────────
1 │ 0 -1152921504606846976 0
2 │ 4 728817 183
3 │ 4 728817 183
4 │ 68 728817 1170
5 │ 68 728817 1170
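For comparison, the expected final `slen` can be computed without `BitSet` at all by sorting the ranges and merging the ones that overlap. The sketch below is only a cross-check; `union_length` is a hypothetical helper name, not part of Base.

```julia
# Hypothetical helper (not in Base): count the integers covered by a
# collection of unit ranges by sorting them and merging overlaps.
function union_length(ranges)
    total = 0
    have = false                  # whether a running interval exists yet
    lo, hi = 0, 0                 # bounds of the running interval
    for r in sort(collect(ranges); by = first)
        isempty(r) && continue
        if !have
            lo, hi = first(r), last(r)
            have = true
        elseif first(r) <= hi + 1 # overlaps or touches the running interval
            hi = max(hi, last(r))
        else                      # disjoint: close the running interval
            total += hi - lo + 1
            lo, hi = first(r), last(r)
        end
    end
    have && (total += hi - lo + 1)
    return total
end

ranges = [46644306:46644488, 46644343:46644488, 46648318:46648619, 46648458:46648538]
expected = union_length(ranges)
```

The second and fourth ranges lie inside the first and third, so `expected` is 183 + 302 = 485. Neither 1222 nor 1170 matches that.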
I seem to take a while to get around to a minimal example, but this seems clear enough:
julia> length(union!(BitSet(46644306:46644488), 46648318:46648619))
1994
julia> length(union!(BitSet(46644306:46644488), 46648318:46648619))
532
julia> length(union!(BitSet(46644306:46644488), 46648318:46648619))
516
julia> length(union!(BitSet(46644306:46644488), 46648318:46648619))
550
The union! of two BitSets leaves the gap between them zeroed:
julia> union!(BitSet(0:255), BitSet(512:599)).bits
10-element Vector{UInt64}:
0xffffffffffffffff
0xffffffffffffffff
0xffffffffffffffff
0xffffffffffffffff
0x0000000000000000
0x0000000000000000
0x0000000000000000
0x0000000000000000
0xffffffffffffffff
0x0000000000ffffff
But a UnitRange argument leaves what looks like uninitialized memory in the gap:
julia> union!(BitSet(0:255), 512:599).bits
10-element Vector{UInt64}:
0xffffffffffffffff
0xffffffffffffffff
0xffffffffffffffff
0xffffffffffffffff
0x00007f8b0f95c850
0x00007f8b0f95c880
0x00007f8b0f95c8b0
0x00007f8b0f95c8e0
0xffffffffffffffff
0x0000000000ffffff
As part of a larger task I need to count the number of elements in the union of a set of intervals. A sample set of intervals from genomic sequences and a function to evaluate the overlap size are in bugreport.txt.
I get inconsistent results.
I suspect that somewhere in the `union!(s::BitSet, r::AbstractUnitRange{<:Integer})` method, starting at line 126 of bitset.jl, the `bits` vector of `s` is being extended without zeroing the extended part.
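To illustrate the suspicion: `resize!` on a `Vector{UInt64}` does not initialize the elements it adds, so any chunk that the new range does not overwrite keeps whatever happened to be in memory. The sketch below shows the zero-then-set pattern on a plain vector. `grow_zeroed!` is a hypothetical helper for illustration, not the code in bitset.jl, and the chunk layout is assumed to match the `BitSet(0:255)` / `512:599` example in the comments.

```julia
# Hypothetical helper, not the implementation in bitset.jl:
# grow a chunk vector and explicitly zero the newly added tail.
function grow_zeroed!(bits::Vector{UInt64}, newlen::Integer)
    oldlen = length(bits)
    newlen <= oldlen && return bits
    resize!(bits, newlen)                        # added slots are uninitialized
    fill!(view(bits, oldlen+1:newlen), zero(UInt64))
    return bits
end

bits = fill(typemax(UInt64), 4)   # four full chunks, like BitSet(0:255).bits
grow_zeroed!(bits, 10)

# Set only the chunks covered by 512:599: chunk 9 fully, 24 bits of chunk 10.
bits[9]  = typemax(UInt64)
bits[10] = 0x0000000000ffffff

count_total = sum(count_ones, bits)   # 256 + 88 set bits
```

With the tail zeroed first, chunks 5 through 8 stay at zero and the population count is 344 on every run. Without the `fill!`, those four chunks would hold arbitrary values, which would explain a length that changes between calls.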