Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define nextind/prevind for ChainedVectorIndex #75

Merged
merged 3 commits into from
Sep 20, 2022
Merged

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented Sep 18, 2022

The BitVector(::AbstractVector{Bool}) method uses the Base.nextind method to increment the inds from eachindex that I didn't know existed. I'm not exactly sure why they use that instead of just iterating eachindex but it turns out we can define nextind to be pretty efficient for ChainedVectorIndex since we basically already have that logic in eachindex(::ChainedVector).

Fixes the issue reported in #74.

The `BitVector(::AbstractVector{Bool})` method uses the `Base.nextind`
method to increment the inds from `eachindex` that I didn't know existed.
I'm not exactly sure why they use that instead of just iterating `eachindex`
but it turns out we can define `nextind` to be pretty efficient for
`ChainedVectorIndex` since we basically already have that logic in
`eachindex(::ChainedVector)`.

Fixes the issue reported in #74.
@codecov
Copy link

codecov bot commented Sep 18, 2022

Codecov Report

Base: 95.42% // Head: 95.38% // Decreases project coverage by -0.04% ⚠️

Coverage data is based on head (037ef0c) compared to base (ce4f3eb).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #75      +/-   ##
==========================================
- Coverage   95.42%   95.38%   -0.05%     
==========================================
  Files           4        4              
  Lines         897      996      +99     
==========================================
+ Hits          856      950      +94     
- Misses         41       46       +5     
Impacted Files Coverage Δ
src/chainedvector.jl 96.63% <100.00%> (+0.11%) ⬆️
src/SentinelArrays.jl 91.91% <0.00%> (-0.64%) ⬇️
src/missingvector.jl 97.14% <0.00%> (+0.17%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

chunk_i = 1
i += 1
end
return ChainedVectorIndex(chunkidx, chunk, chunk_i, i)
Copy link
Member

@bkamins bkamins Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions:

  1. first is something we discussed some time ago, but can you please confirm that chunks cannot be empty?
  2. the function returns the same value when last element of the array is visited - is this intended? Other definitions behave as follows:
julia> nextind([1, 2], 0)
1

julia> nextind([1, 2], 1)
2

julia> nextind([1, 2], 2)
3

julia> nextind([1, 2], 3)
4

julia> nextind([1,2,3], CartesianIndex(0))
CartesianIndex(1,)

julia> nextind([1,2,3], CartesianIndex(1))
CartesianIndex(2,)

julia> nextind([1,2,3], CartesianIndex(2))
CartesianIndex(3,)

julia> nextind([1,2,3], CartesianIndex(3))
CartesianIndex(4,)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Correct. We have a cleanup! function that is called before operations where we require all non-empty chunks (and it is called by eachindex)
  2. I assumed the intended behavior was that you eventually get an ind which will throw a BoundsError, and so it doesn't matter if you keep incrementing the ind after that since you'll always get the same BoundsError behavior.

Copy link
Member

@bkamins bkamins Sep 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ad 2.

Yes, but your condition is:

if chunk_i < length(chunk)
    ...
elseif chunkidx < length(A.arrays)
    ...
end

so if chunk_i == length(chunk) && chunkidx == length(A.arrays) (so essentially it is a last element of an array) you do not move it to BoundsError, but leave it as is. This was my issue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah shoot; good catch. I'll come up with a fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, pushed a fix

@@ -212,11 +213,15 @@ Base.hash(x::ChainedVectorIndex, h::UInt) = hash(x.i, h)

@inline Base.getindex(x::ChainedVectorIndex) = @inbounds x.array[x.array_i]

@inline function Base.getindex(A::ChainedVector, x::ChainedVectorIndex)
Base.checkbounds(::Type{Bool}, A::ChainedVector, ind::ChainedVectorIndex) = 1 <= ind.array_i <= length(ind.array)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkbounds probably should also check i?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah - maybe it is ignored everywhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't rely on i at all except for "display"-like purposes.

@@ -196,6 +196,7 @@ end

# custom index type used in eachindex
struct ChainedVectorIndex{A} <: Integer
arrays_i::Int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we check somewhere that A type uses 1-based indexing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we do

Copy link
Member

@bkamins bkamins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - comments are mostly for me to understand the design better :).

chunk_i = 1
i += 1
else
chunk_i += 1 # make sure this goes out of bounds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

increment of i is missing. Maybe it should just be moved out of the if part and just i = x.i + 1 be written. Same for prevind.

@quinnj quinnj merged commit d3bd9d3 into main Sep 20, 2022
@quinnj quinnj deleted the jq/nextprevind branch September 20, 2022 02:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants