Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] work towards updating to DataFrames v0.11+ framework #1088

Closed
wants to merge 1 commit into from

Conversation

tlnagy
Copy link
Member

@tlnagy tlnagy commented Jan 11, 2018

i haven't been able to fix everything due to slight inconsistencies between PooledDataArrays and CategoricalArrays behavior. See JuliaData/CategoricalArrays.jl#117 for
details. Working towards a fix for #1065.

It would nice to have this soon as we're holding back the whole DataFrames ecosystem until we do. Maybe @alyst and @nalimilan could also provide us some feedback to get this working with v0.11

i haven't been able to fix everything due to slight inconsistencies
between PooledDataArrays and CategoricalArrays behavior. See
JuliaData/CategoricalArrays.jl#117 for
details. Working towards a fix for GiovineItalia#1065
@tlnagy tlnagy requested a review from bjarthur January 11, 2018 21:06
@bjarthur
Copy link
Member

thanks for tackling this. i don't use DataFrames much, so am not familiar with the API changes. please though make use of the regression testing infrastructure in Gadfly before you merge.

@tlnagy
Copy link
Member Author

tlnagy commented Jan 11, 2018

I definitely don't plan on merging this until it's thoroughly vetted since it's a lot of major changes.

@@ -1,13 +1,13 @@
#Is this usable data?
isconcrete{T<:Number}(x::T) = !isna(x) && isfinite(x)
isconcrete{T<:Number}(x::T) = !ismissing(x) && isfinite(x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it should be something like:

isconcrete(::Missing) = false
isconcrete(x::Number) = isfinite(x)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This should fix #1083 too, I think

@nalimilan
Copy link
Contributor

Cc: @andreasnoack

@andreasnoack
Copy link
Contributor

Doh. I've been through almost all the same changes earlier this week. A good reminder to open a WIP PR early to communicate the efforts and avoid wasted efforts.

I had a lot of issues with the wrapping behavior of CategoricalArrays so I actually tried to look into using https://github.com/JuliaArrays/IndirectArrays.jl internally in Gadfly while, of course, still allowing CategoricalArrays input. In particular, I think that Compose should remain agnostic to CategoricalArrays but it is was a bit challenging to avoid that Gadfly sends off CategoricalArrays or CategoricalValues to Compose.

@tlnagy
Copy link
Member Author

tlnagy commented Jan 12, 2018

I've been through almost all the same changes earlier this week.

It would be great if you could push your changes to this PR or make a new PR, which ever is easier.

In particular, I think that Compose should remain agnostic to CategoricalArrays

Agreed. And this was where I got hung up as well.

@Mattriks
Copy link
Member

Out of interest, JuliaArrays/IndirectArrays.jl#12

PooledDataArray(newdata, newpool, [false for _ in newdata])
end
cat_aes_var!{T}(xs::CategoricalArray{T}, ys::CategoricalArray{T}) = vcat(xs, ys)
cat_aes_var!{T, U}(xs::CategoricalArray{T}, ys::CategoricalArray{U}) = vcat(xs, ys)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, cat_aes_var{T}(...) is pre-0.6 syntax. Since DataFrames 0.11 requires 0.6, it's better write it as cat_aes_var(...) where {T}, although in this particular case it could be as simple as (replaces both definitions)

cat_aes_var!(xs::CategoricalArray, ys::CategoricalArray) = vcat(xs, ys)

if typeof(vals) <: PooledDataArray
setfield!(aes_grid[i, j], var,
make_pooled_data_array(typeof(vals), staging[i, j]))
if typeof(vals) <: CategoricalArray
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vals isa CategoricalArray or isa(vals, CategoricalArray) is shorter

@tlnagy
Copy link
Member Author

tlnagy commented Feb 1, 2018

Closed in favor of #1090

@tlnagy tlnagy closed this Feb 1, 2018
@tlnagy tlnagy deleted the tn/dataframes-update branch February 1, 2018 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants