WIP: collect without inference #135

Merged
merged 11 commits into from
Mar 19, 2018
3 changes: 2 additions & 1 deletion src/IndexedTables.jl
@@ -10,7 +10,7 @@ import Base:
permutedims, reducedim, serialize, deserialize, sort, sort!

export NDSparse, flush!, aggregate!, aggregate_vec, where, pairs, convertdim, columns, column, rows,
- itable, update!, aggregate, reducedim_vec, dimlabels
+ itable, update!, aggregate, reducedim_vec, dimlabels, collectcolumns

const Tup = Union{Tuple,NamedTuple}
const DimName = Union{Int,Symbol}
@@ -19,6 +19,7 @@ include("utils.jl")
include("columns.jl")
include("table.jl")
include("ndsparse.jl")
+ include("collect.jl")

#=
# Poor man's traits
34 changes: 34 additions & 0 deletions src/collect.jl
@@ -0,0 +1,34 @@
collectcolumns(itr) = collectcolumns(itr, Base.iteratorsize(itr))

function collectcolumns(itr, ::Union{Base.HasShape, Base.HasLength})
    st = start(itr)
    el, st = next(itr, st)
    dest = similar(arrayof(typeof(el)), length(itr))
    dest[1] = el
    collect_to_columns!(dest, itr, 2, st)
end

function collect_to_columns!(dest::Columns{T, U}, itr, offs, st) where {T, U}
    # collect to dest array, checking the type of each result. if a result does not
    # match, widen the result type and re-dispatch.
    i = offs
    while !done(itr, st)
        el, st = next(itr, st)
        S = typeof(el)
        if all((s <: t) for (s, t) in zip(S.parameters, T.parameters))
Member

Is this inferable if eltype(itr) is concrete or once T has become at least as wide as it? That's essential for performance in Base map.

Collaborator Author

I don't have a strong enough intuition to understand what the compiler can optimize. Here Base uses el isa T || typeof(el) === T. What exactly should I check to test inferability? Maybe making this a function, is_fieldwise_subtype(el::S, ::Type{T}) = all((s <: t) for (s, t) in zip(S.parameters, T.parameters)), would help the compiler, but I'm not sure.

Member

@code_warntype will tell you whether the return type is inferred. Using a helper function could indeed help (maybe after marking it as Base.@pure, but chances are I'm wrong).

Collaborator Author

I've checked with a constant eltype and now it is type stable. Somehow the way to achieve that was to use a generated function to check is_fieldwise_subtype, which we should do anyway for performance. It's not inferable in the NamedTuples case (the initialization step arrayof, which we already use in map, is not inferable), but I hope all the NamedTuples business will be much better in 0.7.

Collaborator Author

Still, should probably benchmark it to make sure it's doing OK compared with the inference based map...

Member

There's probably a way to achieve the same result without a generated function, but these things are always tricky...
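
Editor's note: the inference check discussed in this thread can be tried on a small example. The sketch below assumes a current Julia release (1.x) rather than the 0.6 series this PR targets, and the functions stable and unstable are hypothetical, made up purely for illustration. Test.@inferred throws when the inferred return type differs from the type actually returned, and @code_warntype prints the inferred types for manual inspection.

```julia
using Test               # provides @inferred and @test_throws
using InteractiveUtils   # provides @code_warntype outside the REPL

# Hypothetical functions, for illustration only.
stable(x::Int) = x + 1                       # always returns an Int
unstable(x::Int) = x > 0 ? x : "negative"    # returns an Int or a String

# Passes silently: the inferred return type matches the actual one.
@inferred stable(1)

# @inferred raises an error here because inference can only narrow the
# return type down to a Union of Int and String.
@test_throws ErrorException @inferred unstable(1)

# Prints the typed code; non-concrete types are highlighted in the output.
@code_warntype unstable(1)
```

Applied to this PR, the same calls would be pointed at collect_to_columns! with an iterator whose eltype is concrete.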

            @inbounds dest[i] = el::T
            i += 1
        else
            Rparams = map(typejoin, T.parameters, S.parameters)
            R = get_tuple_type_from_params(el, Rparams)
            new = similar(arrayof(R), length(itr))
            @inbounds for l in 1:i-1; new[l] = dest[l]; end
            @inbounds new[i] = el
            return collect_to_columns!(new, itr, i+1, st)
        end
    end
    return dest
end

get_tuple_type_from_params(el::Tuple, params) = Tuple{params...}
get_tuple_type_from_params(el::NamedTuple, params) = eval(:(NamedTuples.@NT($(keys(el)...)))){params...}
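
Editor's note: the widening strategy used by collect_to_columns! can be illustrated without Columns or NamedTuples. The following is a minimal sketch, assuming a current Julia release (1.x) and plain tuples stored as a tuple of vectors. The names collect_widening, fill_columns! and setrow! are hypothetical and not part of IndexedTables. Throwing on an empty iterator is a choice made for this sketch only.

```julia
# Store the tuple `el` as row `i` of the column vectors in `cols`.
setrow!(cols, i, el) = foreach((c, x) -> (c[i] = x), cols, el)

function collect_widening(itr)
    y = iterate(itr)
    y === nothing && throw(ArgumentError("iterator must not be empty"))
    el, st = y
    # One vector per tuple field, typed after the first element.
    cols = map(x -> Vector{typeof(x)}(undef, length(itr)), el)
    setrow!(cols, 1, el)
    return fill_columns!(cols, itr, 2, st)
end

function fill_columns!(cols, itr, i, st)
    while true
        y = iterate(itr, st)
        y === nothing && return cols
        el, st = y
        if all(map((c, x) -> x isa eltype(c), cols, el))
            # Every field fits its column: store the row and move on.
            setrow!(cols, i, el)
            i += 1
        else
            # At least one field does not fit: widen each column with
            # typejoin, copy the rows filled so far, then re-dispatch.
            filled = i - 1
            widened = map(cols, el) do c, x
                w = Vector{typejoin(eltype(c), typeof(x))}(undef, length(c))
                copyto!(w, 1, c, 1, filled)
                w
            end
            setrow!(widened, i, el)
            return fill_columns!(widened, itr, i + 1, st)
        end
    end
end

cols = collect_widening([(1, 2), (1.2, "3")])
```

As in the PR, the re-dispatch means the loop is recompiled for the widened column types, so the common case where no widening occurs runs on concretely typed vectors.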
1 change: 1 addition & 0 deletions test/runtests.jl
@@ -10,5 +10,6 @@ using Base.Test
include("test_core.jl")
include("test_utils.jl")
include("test_tabletraits.jl")
+ include("test_collect.jl")

end
27 changes: 27 additions & 0 deletions test/test_collect.jl
@@ -0,0 +1,27 @@
@testset "collectnamedtuples" begin
    v = [@NT(a = 1, b = 2), @NT(a = 1, b = 3)]
    @test collectcolumns(v) == Columns(@NT(a = Int[1, 1], b = Int[2, 3]))

    v = [@NT(a = 1, b = 2), @NT(a = 1.2, b = 3)]
    @test collectcolumns(v) == Columns(@NT(a = Real[1, 1.2], b = Int[2, 3]))

    v = [@NT(a = 1, b = 2), @NT(a = 1.2, b = "3")]
    @test collectcolumns(v) == Columns(@NT(a = Real[1, 1.2], b = Any[2, "3"]))

    v = [@NT(a = 1, b = 2), @NT(a = 1.2, b = 2), @NT(a = 1, b = "3")]
    @test collectcolumns(v) == Columns(@NT(a = Real[1, 1.2, 1], b = Any[2, 2, "3"]))
end

@testset "collecttuples" begin
    v = [(1, 2), (1, 3)]
    @test collectcolumns(v) == Columns((Int[1, 1], Int[2, 3]))

    v = [(1, 2), (1.2, 3)]
    @test collectcolumns(v) == Columns((Real[1, 1.2], Int[2, 3]))

    v = [(1, 2), (1.2, "3")]
    @test collectcolumns(v) == Columns((Real[1, 1.2], Any[2, "3"]))

    v = [(1, 2), (1.2, 2), (1, "3")]
    @test collectcolumns(v) == Columns((Real[1, 1.2, 1], Any[2, 2, "3"]))
end