Type-stabilizing array concatenations #19387
Conversation
function _cat(T::Type, shape, sifter, X...)
    N = length(shape)
    A = cat_similar(X[1], T, shape)
While you're at it, could you choose the return type based on all input types (#2326)? Or do you think it should go into another PR?
I'd like to handle that somehow, but I'd say it would have to be addressed elsewhere. For one, I have some ideas that would require the new subtyping algorithm to be in place.
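For reference, a minimal sketch of what "choosing the element type from all inputs" could look like; this is not the PR's code, and the helper name cat_eltype is made up for illustration:

```julia
# Hypothetical helper (not from this PR): compute a common element type
# from every input array before allocating the destination.
cat_eltype(X...) = promote_type(map(eltype, X)...)

cat_eltype([1, 2], [1.0, 2.0])        # Float64
cat_eltype([1 2; 3 4], [true false])  # Int (Bool promotes to Int)
```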
Force-pushed from e72d1c8 to 318cf38.
Do we have benchmark coverage of these functions for dense and sparse matrices?
Needs tests?
    hcat(X...)
end
function vcat(Xin::_SparseConcatGroup...)
-   X = SparseMatrixCSC[issparse(x) ? x : sparse(x) for x in Xin]
+   X = map(x -> SparseMatrixCSC(issparse(x) ? x : sparse(x)), Xin)
Out of curiosity, why replace the comprehensions with maps? Generate Tuples rather than Vectors? Would SparseMatrixCSC suffice in place of x -> SparseMatrixCSC(issparse(x) ? x : sparse(x))? Best!
> Out of curiosity, why replace the comprehensions with maps? Generate Tuples rather than Vectors?

Just to avoid the allocation from creating the array, although it shouldn't make much of a difference.

> Would SparseMatrixCSC suffice in place of x -> SparseMatrixCSC(issparse(x) ? x : sparse(x))?

SparseMatrixCSC([1]) is not the same as SparseMatrixCSC(sparse([1])). Actually, the first one fails.
I'm confused, don't both map and the comprehension allocate the same number of arrays?
map returns a tuple for tuple inputs, while a comprehension creates an array.
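To illustrate the point (this snippet is not from the PR):

```julia
# map over a tuple returns a tuple, so no Vector is allocated to hold the
# results; a comprehension over the same tuple always builds a Vector.
xs = (1, 2, 3)

t = map(x -> x + 1, xs)      # (2, 3, 4) :: Tuple
v = [x + 1 for x in xs]      # [2, 3, 4] :: Vector{Int}
```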
It's just to avoid the array that holds the arrays. But as I said, it shouldn't make much of a difference, and I can change it back if you think it's better.
Your modifications look great to me --- thank you for the explanations!
Force-pushed from a244ad6 to 7c5ea3e.
I don't think there are benchmarks for concatenations of mixed dense and sparse matrices. Is it worth checking the performance anyway?
Force-pushed from 7c5ea3e to 369c194.
I think we can remove the 'needs tests' label; there are already tests covering the reported issues.
Unless there is an impact on performance or there are any more comments, this is ready on my side.
It needs
There are already tests in there, but I can add more if they don't seem enough.
@pabloferz, I just want to make sure that tests were added for any issues that you fixed,
catdims = dims2cat(dims)
shape = cat_shape(catdims, (), map(cat_size, X)...)
A = cat_similar(X[1], T, shape)
if countnz(catdims) > 1 && T <: Number
Wouldn't putting T <: Number first be better, to avoid computing the countnz part when possible?
Good call. Changed.
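For context, a small self-contained sketch of why the order matters (the names and the expensive_count helper are illustrative, not the PR's):

```julia
# && short-circuits: the right operand is evaluated only when the left one
# is true, so putting the cheap subtype test first skips the counting work
# whenever T is not a Number.
expensive_count(dims) = (println("counting..."); count(identity, dims))

T = String
catdims = (true, true, false)
if T <: Number && expensive_count(catdims) > 1
    println("numeric concatenation along several dimensions")
end   # expensive_count is never called here, since T <: Number is false
```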
Force-pushed from 369c194 to 524cb5f.
LGTM.
Why wasn't nanosoldier run here?
Forgot, sorry.
Fortunately, there were only speed improvements that came from this. See https://github.com/JuliaCI/BaseBenchmarkReports/blob/a9c6ffef26b60d527d06b80ac3ea2fde79637a2a/daily_2016_12_2/report.md
This broke quite a few packages: https://htmlpreview.github.io/?https://github.com/JuliaCI/pkg.julialang.org/blob/ac017050e0c662e46feef496e77ac12baa85583c/pulse.html I haven't bisected all of them, so this isn't necessarily at fault for every one, but it's at least the underlying cause for AverageShiftedHistograms and BlackBoxOptim.
The new failure in CategoricalArrays tests is due to the fact that
@nalimilan That's because of what this uses now. @tkelman I'll look into the ones you mentioned above to see what the problem is, and if you get a bigger list I can look into that too.
This is a rewrite of some of the array concatenation methods to make them type stable. The PR adds a type-stable version of cat (cat{n}(::Type{Val{n}}, X...)) while preserving the existing API (which cannot be made inferable, but should be just a bit faster than before anyway). Fixes #13665, #19038 and #19304.
NOTE: I believe the first three commits could be backported to 0.5.
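Based on the signature given above, usage would look roughly like this (a hedged sketch; the exact call is inferred from the description, not copied from the PR):

```julia
A = [1 2; 3 4]
B = [5 6; 7 8]

# Passing the concatenation dimension as Val{n} puts it in the type domain,
# so the result's dimensionality is visible to inference; cat(2, A, B)
# takes the dimension as a runtime value instead.
C = cat(Val{2}, A, B)   # concatenate along dimension 2, equivalent to hcat(A, B)
# C == [1 2 5 6; 3 4 7 8]
```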