
Type-stabilizing array concatenations #19387

Merged: 4 commits into JuliaLang:master from the pz/typestablecat branch, Dec 1, 2016

Conversation

@pabloferz (Contributor) commented Nov 22, 2016:

This is a rewrite of some of the array concatenation methods to make them type stable. The PR adds a type-stable version of cat (cat{n}(::Type{Val{n}}, X...)) while preserving the existing API (which cannot be made inferable, but should still be a bit faster than before).

Fixes #13665, #19038 and #19304

NOTE: I believe the first three commits could be backported to 0.5.
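
As a minimal sketch of the idea (written in current Julia syntax with illustrative names, not the PR's 0.6-era cat{n}(::Type{Val{n}}, X...) form): passing the concatenation dimension in the type domain lets inference see the rank of the result.

    using Test  # for @inferred

    A = [1 2; 3 4]

    # Dimension as a runtime value: the rank of the result depends on `dim`,
    # so inference generally cannot produce a concrete return type.
    cat_by_value(dim::Int, x, y) = cat(x, y; dims = dim)

    # Dimension in the type domain: Val(3) fixes the rank at compile time.
    cat_by_type(x, y) = cat(x, y; dims = Val(3))

    @inferred cat_by_type(A, A)        # should infer concretely (Array{Int,3} here)
    # @inferred cat_by_value(3, A, A)  # would typically fail to infer a concrete type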


function _cat(T::Type, shape, sifter, X...)
    N = length(shape)
    A = cat_similar(X[1], T, shape)
Member:

While you're at it, could you choose the return type based on all input types (#2326)? Or do you think it should go into another PR?

Contributor Author:

I'd like to handle that somehow, but I'd say it would have to be addressed elsewhere. For one, I have some ideas that would require the new subtyping algorithm to be in place.
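
For context, what #2326 asks for amounts to choosing the output element type by promoting across all inputs rather than taking it from the first argument; a hedged, purely illustrative helper (not Base's actual promote_eltype):

    # Illustrative only; not the Base implementation.
    promote_eltype_sketch(xs...) = promote_type(map(eltype, xs)...)

    promote_eltype_sketch([1, 2], [1.0, 2.0])   # Float64
    promote_eltype_sketch([1, 2], ["a"])        # Any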

@pabloferz force-pushed the pz/typestablecat branch 2 times, most recently from e72d1c8 to 318cf38, on November 23, 2016 15:10
@stevengj (Member):

Do we have benchmark coverage of these functions for dense and sparse matrices?

@stevengj (Member):

Needs tests?

    hcat(X...)
end
function vcat(Xin::_SparseConcatGroup...)
-   X = SparseMatrixCSC[issparse(x) ? x : sparse(x) for x in Xin]
+   X = map(x -> SparseMatrixCSC(issparse(x) ? x : sparse(x)), Xin)
Member:

Out of curiosity, why replace the comprehensions with maps? Generate Tuples rather than Vectors?

Would SparseMatrixCSC suffice in place of x -> SparseMatrixCSC(issparse(x) ? x : sparse(x))?

Best!

@pabloferz (Contributor Author), Nov 23, 2016:

> Out of curiosity, why replace the comprehensions with maps? Generate Tuples rather than Vectors?

Just to avoid the allocation from creating the array, although it shouldn't make much of a difference.

> Would SparseMatrixCSC suffice in place of x -> SparseMatrixCSC(issparse(x) ? x : sparse(x))?

SparseMatrixCSC([1]) is not the same as SparseMatrixCSC(sparse([1])). Actually, the first one fails.

Member:

I'm confused: don't both map and the comprehension allocate the same number of arrays?

Member:

map returns a tuple for tuple inputs, while a comprehension creates an array.
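
To illustrate the distinction, a small hedged sketch (on modern Julia the sparse functionality lives in the SparseArrays stdlib; it was in Base at the time):

    using SparseArrays

    Xin = ([1 2; 3 4], sparse([1.0 0.0; 0.0 1.0]))   # tuple of one dense, one sparse matrix

    via_map  = map(x -> issparse(x) ? x : sparse(x), Xin)   # Tuple of sparse matrices
    via_comp = [issparse(x) ? x : sparse(x) for x in Xin]   # Vector, i.e. one extra array allocation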

Contributor Author:

It's just to avoid the array that holds the arrays. But as I said, it shouldn't make much of a difference, and I can change it back if you think it's better.

Member:

Your modifications look great to me --- thank you for the explanations!

@pabloferz force-pushed the pz/typestablecat branch 2 times, most recently from a244ad6 to 7c5ea3e, on November 24, 2016 05:33
@pabloferz (Contributor Author) commented Nov 24, 2016:

I don't think there are benchmarks for concatenations of mixed dense and sparse matrices. Is it worth checking the performance anyway?
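
For a quick ad-hoc check, a hedged sketch along these lines could be used (BenchmarkTools assumed; the sizes and density are arbitrary, and the actual BaseBenchmarks suite is organized differently):

    using BenchmarkTools, SparseArrays

    A = rand(100, 100)
    S = sprand(100, 100, 0.05)

    @btime vcat($A, $S);   # mixed dense/sparse vertical concatenation
    @btime hcat($A, $S);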

@kshyatt added the 'needs tests' label (Unit tests are required for this change) on Nov 24, 2016
@pabloferz (Contributor Author):

I think we can remove the 'needs tests' label; there are already tests covering the issues mentioned above.

@pabloferz changed the title from "WIP: Type-stabilizing array concatenations" to "Type-stabilizing array concatenations" on Nov 25, 2016
@pabloferz (Contributor Author):

Unless there is an impact on performance or there are any more comments, this is ready on my side.

@stevengj (Member):

It needs @inferred tests for the type stability, to prevent regressions.
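
For illustration, a hedged sketch of the kind of @inferred regression test being asked for (the cases and values here are hypothetical, not the tests the PR actually added):

    using Test, SparseArrays

    A = rand(2, 2)
    S = sparse(A)

    # @inferred throws if the call's return type cannot be inferred concretely.
    @inferred hcat(A, S)
    @inferred vcat(S, A)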

@pabloferz (Contributor Author):

There are already tests in there, but I can add more if they don't seem enough.

@stevengj (Member) commented Nov 25, 2016:

@pabloferz, I just want to make sure that tests were added for any issues that you fixed, e.g. I didn't see a test for #19304. Oh, now I see it.

    catdims = dims2cat(dims)
    shape = cat_shape(catdims, (), map(cat_size, X)...)
    A = cat_similar(X[1], T, shape)
    if countnz(catdims) > 1 && T <: Number
Member:

Wouldn't putting T <: Number first be better to avoid computing the countnz part when possible?

Contributor Author:

Good call. Changed.
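
Sketched, the reordered check (the fill! body is assumed from the surrounding Base code, and countnz is the 0.6-era spelling of what is now count(!iszero, ...)):

    # Cheap T <: Number subtype test first; countnz(catdims) only runs when it can matter.
    if T <: Number && countnz(catdims) > 1
        fill!(A, zero(T))   # zero-fill the gaps when concatenating along more than one dimension
    end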

@stevengj stevengj removed the needs tests Unit tests are required for this change label Dec 1, 2016
@stevengj (Member) commented Dec 1, 2016:

LGTM.

@stevengj stevengj merged commit 684dc9c into JuliaLang:master Dec 1, 2016
@KristofferC KristofferC added the potential benchmark Could make a good benchmark in BaseBenchmarks label Dec 1, 2016
@tkelman (Contributor) commented Dec 1, 2016:

Why wasn't nanosoldier run here?

@stevengj (Member) commented Dec 1, 2016:

Forgot, sorry.

@pabloferz pabloferz deleted the pz/typestablecat branch December 2, 2016 17:11
@pabloferz (Contributor Author):

Fortunately, the only changes that came out of this were speed improvements. See https://github.com/JuliaCI/BaseBenchmarkReports/blob/a9c6ffef26b60d527d06b80ac3ea2fde79637a2a/daily_2016_12_2/report.md

@tkelman (Contributor) commented Dec 7, 2016:

This broke quite a few packages: https://htmlpreview.github.io/?https://github.com/JuliaCI/pkg.julialang.org/blob/ac017050e0c662e46feef496e77ac12baa85583c/pulse.html

I haven't bisected all of them, so this isn't necessarily at fault for every one, but it's at least the underlying cause for AverageShiftedHistograms and BlackBoxOptim.

@nalimilan (Member):

The new failure in CategoricalArrays tests is due to the fact that vcat(["a"], "az") now returns a Vector{Any}, when it previously returned a Vector{String} (both are ==). Of course this can (should?) be written as vcat(["a"], ["az"]), but I'm not sure whether Any is intended or not. For example, vcat([1], 2) still returns a Vector{Int}.

@pabloferz (Contributor Author):

@nalimilan That's because this now uses promote_eltype for type-stability reasons, but eltype(String) == Char. So we need special handling for scalars (in this case, non-AbstractArray subtypes) before doing the promote_eltype. Fortunately, that is pretty easy to fix.

@tkelman I'll look into the ones you mentioned above to see what the problem is, and if you get a bigger list I can look into that too.
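
To illustrate the eltype point above (a hedged sketch; the Vector{Any} result describes the behavior at the time this was reported, before the follow-up fix):

    eltype(String)                              # Char
    promote_type(eltype(["a"]), eltype("az"))   # promote_type(String, Char) == Any
    # Hence vcat(["a"], "az") produced a Vector{Any} at the time,
    # while vcat(["a"], ["az"]) produced a Vector{String}.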

@pabloferz (Contributor Author):

@tkelman I didn't check them all, but #19523 seems to fix the problems on a bunch of packages.
