Fix float grouping #2791

Merged on Jun 28, 2021 (15 commits)
4 changes: 3 additions & 1 deletion NEWS.md
@@ -14,7 +14,9 @@

## Bug fixes

-* fix bug in how `groupby` handles grouping of float columns
+* fix bug in how `groupby` handles grouping of float columns;
+  now `-0.0` is treated as *not integer* when deciding on which
+  grouping algorithm should be used
([#2791](https://github.com/JuliaData/DataFrames.jl/pull/2791))
* fix bug in how `issorted` handles custom orderings and improve performance
of sorting when complex custom orderings are passed
5 changes: 5 additions & 0 deletions src/groupeddataframe/groupeddataframe.jl
@@ -79,6 +79,11 @@ an `AbstractDict` can be used to index into a grouped data frame where
the keys are column names of the data frame. The order of the keys does
not matter in this case.

+A column is considered to be an integer column when deciding on the grouping
+algorithm choice if its `eltype` is a subtype of `Union{Missing, Real}`, all its
+elements are either `missing` or pass `isinteger` test, and none of them is
+equal to `-0.0`.

# See also

[`combine`](@ref), [`select`](@ref), [`select!`](@ref), [`transform`](@ref), [`transform!`](@ref)
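
The rule stated in the added docstring paragraph can be sketched as a standalone predicate. This is a minimal illustration, not the package's internal code path: the names `looks_like_integer` and `is_integer_like_column` are hypothetical and exist only for this example. It relies on two Base behaviors: `isequal(0.0, -0.0)` is `false` even though `0.0 == -0.0` is `true`, and `true | missing` evaluates to `true`. Presumably `-0.0` is excluded because treating it as the integer `0` would merge two values that `isequal` keeps distinct.

```julia
# Hypothetical helper mirroring the predicate used in the diff.
# An element qualifies if it is missing or integer-valued, and is not -0.0.
# isinteger(missing) returns missing, but `true | missing` is true,
# so the non-short-circuiting `|` still yields a Bool here.
looks_like_integer(v) = (ismissing(v) | isinteger(v)) & !isequal(v, -0.0)

# Hypothetical column-level check: every element must qualify.
is_integer_like_column(x::AbstractVector{<:Union{Missing, Real}}) =
    all(looks_like_integer, x)

# Why -0.0 needs special handling: it passes isinteger,
# yet isequal distinguishes it from 0.0.
isinteger(-0.0)      # true
isequal(0.0, -0.0)   # false
0.0 == -0.0          # true
```

Under these assumptions, `is_integer_like_column([1.0, 2.0, missing])` holds, while a column containing `-0.0` or `1.5` does not qualify.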
16 changes: 5 additions & 11 deletions src/groupeddataframe/utils.jl
@@ -145,22 +145,16 @@ function refpool_and_array(x::AbstractArray)
         end
     elseif x isa AbstractArray{<:Union{Real, Missing}} && length(x) > 0
         if !(x isa AbstractArray{<:Union{Integer, Missing}})
-            if !all(v -> (ismissing(v) | isinteger(v)) & (!isequal(v, -0.0)), x)
+            if !all(v -> (ismissing(v) | isinteger(v)) & !isequal(v, -0.0), x)
                 return nothing, nothing
             end
         end
         if Missing <: eltype(x)
             smx = skipmissing(x)
-            y = iterate(smx)
-            y === nothing && return nothing, nothing
-            (v, s) = y
-            minval = maxval = v
-            while true
-                y = iterate(smx, s)
-                y === nothing && break
-                (v, s) = y
-                maxval = max(v, maxval)
-                minval = min(v, minval)
+            if isempty(smx)
+                return nothing, nothing
+            else
+                minval, maxval = extrema(smx)
             end
         else
             minval, maxval = extrema(x)
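
The second change in this hunk replaces a hand-written `iterate` loop with `isempty` plus `extrema` on the `skipmissing` wrapper. A minimal sketch of that pattern in isolation follows; `nonmissing_range` is a hypothetical name used only for illustration and returns a single `nothing` rather than the `nothing, nothing` pair used in the diff.

```julia
# Hypothetical helper: minimum and maximum of the non-missing elements,
# or `nothing` when every element is missing (or the vector is empty).
function nonmissing_range(x::AbstractVector)
    smx = skipmissing(x)
    # extrema throws on an empty iterator, so guard first.
    isempty(smx) && return nothing
    return extrema(smx)
end

nonmissing_range([3, missing, 1])     # (1, 3)
nonmissing_range([missing, missing])  # nothing
```

The guard matters because a column whose elements are all `missing` leaves nothing for `extrema` to reduce over.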