Scalars and array allocation convenience function #27675

hamzaelsaawy · 2018-06-20T01:05:29Z

I often see zero-dim arrays after reducing along and squeezing all dims in an array

Frequent Array{T}(dims) is awkward to type, and using zeros/ones feels inappropriate since array contents will be overwritten.

Alias for zero dimensional array, for edge cases in squeezes and reduces. `array` (a possible poor choice of name) aliases Array, mostly to avoid the braces.

andyferris · 2018-06-20T01:25:43Z

See zero-dim arrays after reducing along and squeezing all dims in an array

Sorry... is this a reference to a discussion somewhere?

hamzaelsaawy · 2018-06-20T01:32:14Z

Sorry, no, its just bad grammar :/
"I often see"

andyferris · 2018-06-20T01:58:57Z

OK, thanks! I edited your opening post to reflect this. A few thoughts:

This PR introduces two new concepts, an AbstractScalar{T} type alias for AbstractArray{T, 0} and an array constructor convenience constructor to avoid needing undef. It would typically be ideal to split separate ideas into separate PRs. (Personally, I also love to see the opening comment of a PR explain the changes and motivation and especially any context like other highly related github issues such as #18379).

Regarding AbstractScalar and Scalar, I've implemented something like this in StaticArrays,jl and generally find the idea useful, especially for making broadcast treat some array elements as scalars, e.g. translating a vector of 3d points (vectors) by a 3d vector. However, see also #27608 for an alternative take on the broadcasting issue / creating containers with one element. I think we've had some discussion in the past that "scalar" is a somewhat unusual choice of terminology because it usually indicates a value not inside a container - e.g. Float64 is a scalar, but an array (of any dimension) with one Float64 inside is still an array. While Scalar{T} = Array{T,0} makes some sense for consistency with Vector and Matrix being Arrays, in this case it seems possible that a new mutable struct could be more efficient (Array has double-indireciton and a header that might be larger than the element you are storing, so not exactly ideal).

Regarding array, I think we've gotten rid of some older (lower case) convenience constructors in the past in favor of using the types, so not sure if that is going full circle. The requirement to type Array{T}(undef, dims) was I believe deliberate to make it annoying to create (unsafe, if you're not careful enough) uninitialized memory.

In my mind the real issue here is that we have no syntax for creating a 0-d array. If e.g. [1] made Array{T,0} and [1,] made Array{T,1}, or something else so simple existed, then a lot of this pain would simply disappear!

andyferris · 2018-06-20T02:00:03Z

Oh I should point out that fill(1) seems to be the easiest way to create an Array{Int, 0} with a 1 inside, in case you find that useful.

JeffBezanson · 2018-06-20T02:47:06Z

Welcome, and thank you for the suggestion!

It's difficult to introduce an AbstractScalar type, since the idea of "scalar" should also include numbers, but most people feel numbers and arrays are disjoint. It's also difficult to define precisely; e.g. is a string a scalar?

I feel that having all of zeros(dims...), array(dims...), Array{T}(dims...), array(T, dims...), etc. is too many ways to do the same thing. We're better off sticking to more general syntax that's easier to guess once you know the pattern.

hamzaelsaawy · 2018-06-20T03:15:06Z

I agree, Scalar isnt exactly the best name, but I see them often enough as a consequence (not as a purposeful creation) that I thought it needed a name to save key strokes.
Creating them (via fill or squeeze or Scalar) isn't so much the point (and I definitely agree, storing a scalar as 0d array is inefficient) as much as I think we should make their existence explicit so its both either to type and make them more known as edge cases for reduces

In my mind the real issue here is that we have no syntax for creating a 0-d array. If e.g. [1] made Array{T,0} and [1,] made Array{T,1}, or something else so simple existed, then a lot of this pain would simply disappear!

Unfortunately, this PR does conflate two issues, but Id like to push back against this and fight for easy syntax for creating (safely) allocated uninitialized arrays.
fill seems the likely candidate for more general syntax to create arrays, but im not sure its an ideal fit for this.

oscardssmith · 2018-06-20T12:44:12Z

I think the distinction between arrays and scalars issue nearly as clean as many people seem to want it to be. consider the following 4 types: int, complex, a point in R3, and a 2x2 matrix. on the one hand, it's intuitive to see the first 2 (or 3) as scalars and the other(s) as containers, but each of them is an element of a group making them scalarish, and each can be considered a container of bits, floats, floats, and floats respectively. I think the solution to this is to provide useful guesses, but to allow users to override with 0 cost wrappers in either direction

mbauman · 2018-06-20T17:53:47Z

I'm curious what APIs are returning these zero-dimensional arrays back to you so frequently. Our reductions preserve dimensions — and if you reduce over an entire array, then you get an unwrapped scalar. Is it entirely from squeeze?

Would it make sense to return an unwrapped scalar from squeeze if no dimensions are left?

hamzaelsaawy · 2018-06-20T20:30:57Z

I use squeeze after marginalizing factors/conditional probability distributions, so that is what can return the 0d array.

I think having squeeze return a number instead of a 0d array breaks its purity. There is a conceptual consistency to having a 0d array, and it definitely makes sense in my case
I am just looking to give them a name/alias that makes sense.

hamzaelsaawy · 2018-06-20T20:45:43Z

Also, as a follow up to @JeffBezanson, why don't we shift more functionality to the Array constructor?
As in, have it implement fill, ones, zeros, etc

Array([T, ]dims::Dims) = Array{T, length(dims)}(dims)
Array([T, ]dims::Dims, el::T) = fill(convert(T, el), inds)

Wouldn't this simplify and unify everything, give a "general syntax that's easier to guess once you know the pattern", and avoid having to type out the braces ?

JeffBezanson · 2018-06-20T21:23:00Z

We used to have Array(T, dims) and deprecated it a while ago.

Some of those could maybe be unified, but at least they accept different arguments and do different things, while array(::Type{T}, dims...) where {T} = Array{T}(dims...) is just a rearrangement of the exact same arguments and functionality.

JeffBezanson · 2018-06-20T21:30:41Z

Also one has to be careful adding too many behaviors to the same function. For example replacing fill with Array(dims, el) means that the second argument to Array is now an element to fill the array with, and probably can't be used for anything else. However we have started adding some constructors of the form Array{T}(initializer, dims).

hamzaelsaawy · 2018-06-20T21:34:17Z

So I guess my question boils down to this:
Is there no preference between non-parameterized and parameterized constructors, ie we dont think users should be provided with A(T, V) if only to avoid writing A{T, V}()

JeffBezanson · 2018-06-20T21:43:55Z

The answer is that we should avoid spurious syntax variations. If A{T}() works and makes sense for that type, then A(T) should not be added just to save two characters. It wastes precious positional argument slots, and you have to learn when A{T}() and A(T) are equivalent and when they're not.

hamzaelsaawy · 2018-06-20T22:05:30Z

And you also prefer to not alias zero dimensional array?

mbauman · 2018-06-20T22:24:17Z

Alright, here's a bit of dirty laundry: I have needed this myself in the past in developing Julia's core array infrastructure — or, at least, I thought I did! I actually introduced a non-exported alias a few years ago:

julia/base/subarray.jl

Line 171 in d555a9a

AbstractZeroDimArray{T} = AbstractArray{T, 0}

But embarrassingly, that is purely a relic of a previous refactoring. It wasn't even used in the commit that introduced it! It was simply missed and forgotten, and nobody picked it up and used it since. Not even a package has discovered it. If I remember correctly, I introduced that before we had the convenient shorthand AbstractArray{<:Any, 0}. Now I don't even think about reaching for a different name for this concept; the type parameters describe it well. I think you're the first person who has requested a name for this.

There's a tradeoff with names — there's definitely a power behind giving something a name, but if you give too many things names, you have too much to learn and can't generalize your knowledge.

andyferris · 2018-06-20T23:46:51Z

why don't we shift more functionality to the Array constructor?

FYI there has been discussions along this line, where we use the constructor a lot more and more directly. There are various issues/PRs (open or closed) around collect, zeros, ones, etc. My favorite so far has been a = Array{T}(x, dims) behaves like a = Array{T}(undef, dims); a .= x. I believe the status is that we need to wait for some deprecations to settle before we add functionality along those lines.

Added scalar alias & convenience function for array alloc

f5afd1e

Alias for zero dimensional array, for edge cases in squeezes and reduces. `array` (a possible poor choice of name) aliases Array, mostly to avoid the braces.

hamzaelsaawy closed this Jun 20, 2018

hamzaelsaawy deleted the scalars_and_empty branch June 21, 2018 22:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scalars and array allocation convenience function #27675

Scalars and array allocation convenience function #27675

hamzaelsaawy commented Jun 20, 2018 •

edited by andyferris

Loading

andyferris commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018

andyferris commented Jun 20, 2018 •

edited

Loading

andyferris commented Jun 20, 2018

JeffBezanson commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018

oscardssmith commented Jun 20, 2018

mbauman commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018 •

edited

Loading

hamzaelsaawy commented Jun 20, 2018 •

edited

Loading

JeffBezanson commented Jun 20, 2018

JeffBezanson commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018 •

edited

Loading

JeffBezanson commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018

mbauman commented Jun 20, 2018

andyferris commented Jun 20, 2018

Scalars and array allocation convenience function #27675

Scalars and array allocation convenience function #27675

Conversation

hamzaelsaawy commented Jun 20, 2018 • edited by andyferris Loading

andyferris commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018

andyferris commented Jun 20, 2018 • edited Loading

andyferris commented Jun 20, 2018

JeffBezanson commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018

oscardssmith commented Jun 20, 2018

mbauman commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018 • edited Loading

hamzaelsaawy commented Jun 20, 2018 • edited Loading

JeffBezanson commented Jun 20, 2018

JeffBezanson commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018 • edited Loading

JeffBezanson commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018

mbauman commented Jun 20, 2018

andyferris commented Jun 20, 2018

hamzaelsaawy commented Jun 20, 2018 •

edited by andyferris

Loading

andyferris commented Jun 20, 2018 •

edited

Loading

hamzaelsaawy commented Jun 20, 2018 •

edited

Loading

hamzaelsaawy commented Jun 20, 2018 •

edited

Loading

hamzaelsaawy commented Jun 20, 2018 •

edited

Loading