Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change mean_axis to use FromPrimitive #518

Merged
merged 2 commits into from
Dec 4, 2018

Conversation

jturner314
Copy link
Member

The documentation for the Zero and One traits says only that they are the additive and multiplicative identities; it doesn't say anything about converting an integer to a float by adding One::one() to Zero::zero() repeatedly. Additionally, it's nice to convert the length to A directly instead of having to use a loop.

The documentation for the `Zero` and `One` traits says only that they
are the additive and multiplicative identities; it doesn't say
anything about converting an integer to a float by adding `One::one()`
to `Zero::zero()` repeatedly. Additionally, it's nice to convert the
length to `A` directly instead of having to use a loop.
@bluss
Copy link
Member

bluss commented Oct 27, 2018

I'd love to be constructive, but :( Isn't there a better way to convert the integer?

We could limit this method to only float and complex numbers (do we have nice traits for that? https://github.com/bluss/complexfloat is a draft). Ideally we support a wider range of scalar types than complex numbers, though.

@jturner314
Copy link
Member Author

The other trait from the num-traits crate that would work for this is NumCast, which to some extent reveals less about the internals of mean_axis. It is similar in complexity to implement on custom types, though, because it requires ToPrimitive.

The standard library implements some conversions from integers with From, but not enough to be useful (e.g. From<i32> is implemented for f64 but not f32 because it's only lossless when converting to f64). There's the unstable TryFrom trait, which may work in the future, but it's unclear how it will handle converting to floating-point numbers with a mantissa with fewer bits than the integer (will it just not be implemented, return an error, or round).

Would an A: NumCast bound be better?

@bluss
Copy link
Member

bluss commented Oct 28, 2018

FromPrimitive does the right thing for floats at least. Num's traits feel imperfect (that we specify exactly Zero + Add + Div, we should do this at a bit higher level). Here we'd like an approxmating conversion that does not fail.

On one hand I'd like to restrict this to complex numbers, since we know that we don't handle bignums graciously (we probably clone them too much inside sum_axis). But the other case is custom Copy-able scalars, and I guess it could make sense to panic in the usize to scalar conversion there.

@jturner314
Copy link
Member Author

On one hand I'd like to restrict this to complex numbers, since we know that we don't handle bignums graciously (we probably clone them too much inside sum_axis).

Why aren't we using Add<&'a A, Output=A> instead of Add<A, Output=A>?

Here we'd like an approxmating conversion that does not fail.

That's what we've been thinking about, but I think the best thing would be if we didn't have to convert the length to A at all, but rather if we could use A: PromotingDiv<usize, Output = A> and directly perform the division. Unfortunately, we can't use Div for this because the Div trait implementations in the standard library are fairly limited.

  • For floating-point values, this PromotingDiv trait would be implemented by converting the usize to the floating-point type (possibly rounding) and performing the division. In some cases, it may make sense to use a higher-precision floating point type for the division. (For example, ::std::u128::MAX as f32 is inf, while ::std::u128::MAX as f64 is a finite value.)
  • For integer values, this trait would be implemented either by performing the division with however many bits are necessary for full precision and then converting the result back to the Output type, or by observing that the result is always zero if the divisor is greater than the maximum possible value of the dividend type. (For why this behavior would be nice, consider the current implementation or this PR if A is i8: we can't calculate the mean of more than 127 i8 values, even if the sum fits within i8.)
  • For complex numbers, just operate elementwise on the real and imaginary parts.
  • For bignums, convert the usize to a bignum and perform the division.

This also got me thinking about the sum, too. It would be nice if the sum could exceed the maximum of the element type as long as the mean doesn't. For example, consider array![::std::i8::MAX, ::std::i8::MAX]; it would be nice if mean_axis could return ::std::i8::MAX even though the sum of the elements is greater than ::std::i8::MAX. How to do this isn't obvious, though. One option would be to determine the sum type based on the number of elements being added:

// for A = i8
if n <= 1 {
    // use i8
} else if n <= 2 {
    // use i16
}  else if n <= 4 {
    // use i32
} else if n <= 8 {
    // use i64
} else {
    // use i128
}

In practice, this would mean that A would need to provide a mean function because its behavior depends on both the precise type of A and the number of elements. I don't like this, and I'm not sure it's worth doing.

With the options we have available today, I think FromPrimitive or NumCast make the most sense.

@bluss
Copy link
Member

bluss commented Oct 28, 2018

Why aren't we using Add<&'a A, Output=A> instead of Add<A, Output=A>?

I think we need even more to do it in a way that saves copies for bignums, for exampe preferring to use AddAssign with &A on the right hand side, so we don't have to create a new number for every operation. Best to implement and look at this whole picture, if we'd want to do it. We used the current way to keep it simple and build a good enough library first :)

Promoting sounds interesting but not important enough. Aren't for example accurate summation algorithms more interesting for sums and means?

And I don't need bignums, I don't want us to spend significant time and complexity on things that ndarray probably isn't going to be used for. (This is why we lean heavily towards floats, isn't it?)

With the options we have available today, I think FromPrimitive or NumCast make the most sense.

Yes. There are options of rewriting the numeric traits. I'd still like to do that, but of course I can't commit to doing that, don't have the time.

@jturner314
Copy link
Member Author

I think we need even more to do it in a way that saves copies for bignums, for exampe preferring to use AddAssign with &A on the right hand side, so we don't have to create a new number for every operation.

I don't understand. With A: Add<&'a A, Output = A>, the signature of the add method is fn add(A, other: &A) -> A. The implementation takes ownership of the first argument, so it can modify it in-place and return it. That's how it's actually implemented for BigUint, for example. No copies are necessary (unless you count moves, which are hopefully optimized out by the compiler, and the data in BigUint is a Vec anyway, so unoptimized moves are still cheap).

Promoting sounds interesting but not important enough. Aren't for example accurate summation algorithms more interesting for sums and means?

Yes, for floating-point numbers. I was thinking about adding a sum_kbn method for A: Float.

And I don't need bignums, I don't want us to spend significant time and complexity on things that ndarray probably isn't going to be used for. (This is why we lean heavily towards floats, isn't it?)

I agree. All I really care about for my use-cases is f64.

For the purpose of this PR, should we merge the current version in the next breaking release?

@bluss
Copy link
Member

bluss commented Oct 28, 2018

Good point, that makes sense. Sure, let's merge this when we can.

@bluss
Copy link
Member

bluss commented Nov 24, 2018

I think my vision for our numerical methods is still a more narrow (less generic) view of primitive numeric types and floats, and possibly using custom numerical traits instead of just the ones from num

Add to that a 'static bound for the scalar so that we can use ad-hoc type based dispatch, just like we do for dot. At this point I don't see much alternatives if we want to dispatch to for example specific SIMD implementations for f32, f64 and complex specifically.

@bluss
Copy link
Member

bluss commented Dec 4, 2018

Thanks! I meant to merge this after resolving the conflict. Apparently forgot.

@bluss bluss merged commit 9daa777 into rust-ndarray:master Dec 4, 2018
@jturner314 jturner314 deleted the mean-fromprimitive branch December 5, 2018 01:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants