Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inplace operations (sum_axis, ...) #425

Closed
davenza opened this issue Mar 16, 2018 · 7 comments
Closed

Inplace operations (sum_axis, ...) #425

davenza opened this issue Mar 16, 2018 · 7 comments

Comments

@davenza
Copy link

davenza commented Mar 16, 2018

It would be really useful to have inplace versions for some operations.

In one of my projects I am using sum_axis(). However, I would prefer to make the operation inplace. I don't think is necessary to allocate more memory. Is there any plans to do such a thing?

I've tried to develop my own version of the sum_axis_inplace by wrapping the ArrayD inside a struct:

pub struct TabularValues<V>(pub ArrayD<V>);

However, making inplace changes in the Array involves a bunch of unsafe code (in my implementation I used multiple ArrayViews of the data in the deleted axis with ArrayView::from_shape_ptr()). Still, my implementation is not complete because I don't know how to modify the shape and strides. In fact, I would also like to call shrink_to_fit in the inner Vec, but I don't know how to do that.

Having easier access to the elements of the array (Vec, shape, strides) would be useful.

@jturner314
Copy link
Member

jturner314 commented Mar 17, 2018

This is one way to get the raw Vec, shrink it, and create a new array with the desired shape and strides (without allocating a new array):

  1. At some point, you need to compute the sums and store them in the correct elements. (You could do this later (e.g. after step 3) or as the first step, whatever is easiest.)
  2. Get the shape/stride information you need with the .dim()/.raw_dim()/.shape()/.strides() methods.
  3. Use .into_raw_vec() to extract the underlying Vec from the Array.
  4. Now that you have the underlying Vec, you can remove the extra elements and call .shrink_to_fit().
  5. Create an array with the desired shape and strides from the shrunken Vec by using Array::from_shape_vec() or ::from_shape_vec_unchecked(). (See the example in the docs for how to pass the shape and strides.)

I'm curious how your implementation works, though. It seems to me that in general it's a difficult problem to be able to shrink the underlying Vec after computing the sum, when the array can have arbitrary strides. Let's say that you have an array

[[a, b, c],
 [d, e, f]]

then in row-major memory layout, this would be

a, b, c, d, e, f

in memory, and for column-major memory layout it would be

a, d, b, e, c, f

In general, you can have arrays with weird custom memory layouts with arbitrary strides.

Let's say you're trying to perform a sum_axis() on axis 0 in-place, so you want the result to be:

[a + d, b + e, c + f]

If you want to be able to shrink the underlying Vec, then you need the result to be stored in the first three elements of the Vec. For the row-major layout, this is nice. You can compute the sums and save the results to the first row, so you'd get this in memory:

a + d, b + e, c + f, d, e, f

You can then shrink the Vec to the first three elements, and you're done. For the column-major layout, this doesn't work quite as nicely. If you were to compute the sums and store them in the first row, you'd get

a + d, d, b + e, e, c + f, f

in memory. However, you need the results in the first three elements in memory, not scattered throughout, in order to be able to shrink the Vec. At this point, it might be possible to move the results to the first three elements with careful swapping, but getting this correct could be challenging. (This is somewhat like an in-place transposition with N≠M. It's a little simpler because you don't care about overwriting most of the data, but it's also more complex because in general you have to deal with more that two dimensions and arbitrary strides.)

Another approach is to immediately place the results in the first three elements while being careful about the sequence you perform the in-place sums. You do have to be careful, though, because for example you don't want to overwrite d with b + e before you use it to compute a + d. If you're careful, you could get

a + d, b + e, c + f, e, c, f

which you can shrink to the first three elements. For general strides, this becomes a very complex problem to avoid overwriting elements with results before you need to read them to compute sums. There's probably some algorithm that would work by swapping elements around to keep them from being overwritten before they're needed, but I'm not really sure.

All this becomes much simpler if you don't care about shrinking the underlying Vec. You can perform the sums in-place (with the results possibly scattered throughout the array's memory depending on the axis you're summing over and the memory layout) and ignore the extra elements. It is possible to do this entirely using safe code (although in a slightly convoluted way):

#[macro_use]
extern crate ndarray;
extern crate num_traits;

use ndarray::prelude::*;
use ndarray::RemoveAxis;
use num_traits::Zero;
use std::ops::{Add, AddAssign};

/// Sums along the axis without allocating a new array.
///
/// **Panics** if `axis` is out-of-bounds or if `arr.len_of(axis) == 0`.
fn sum_axis_inplace<A, D>(mut arr: Array<A, D>, axis: Axis) -> Array<A, D::Smaller>
where
    A: Clone + Add<Output = A> + AddAssign + Zero,
    D: Dimension + RemoveAxis,
{
    {
        // Get mutable views of the first subview along `axis` and the rest of the array.
        let (first, rest) = arr.view_mut().split_at(axis, 1);
        // Remove the extra axis to simplify things a bit.
        let mut first = first.remove_axis(axis);
        // Zip the pieces together and perform the in-place sums, saving the results to `first`.
        azip!(mut first, ref rest (rest.lanes(axis)) in {
            *first += rest.scalar_sum()
        });
    }
    // Select the first subview (where the results were stored).
    arr.into_subview(axis, 0)
}

fn main() {
    let arr = array![[0, 1, 2], [3, 4, 5]];
    assert_eq!(sum_axis_inplace(arr, Axis(0)), array![3, 5, 7]);
    let arr = array![[0, 1, 2], [3, 4, 5]];
    assert_eq!(sum_axis_inplace(arr, Axis(1)), array![3, 12]);

    let arr = array![[[0, 1], [2, 3]], [[4, 5], [6, 7]]];
    assert_eq!(
        sum_axis_inplace(arr.clone(), Axis(0)),
        array![[4, 6], [8, 10]],
    );
    assert_eq!(
        sum_axis_inplace(arr.clone(), Axis(1)),
        array![[2, 4], [10, 12]],
    );
    assert_eq!(
        sum_axis_inplace(arr.clone(), Axis(2)),
        array![[1, 5], [9, 13]],
    );
}

This uses ndarray = "0.11" and num-traits = "0.1". It works for both fixed-dimension (Array1, Array2, …) and dynamic-dimension (ArrayD) arrays. It needs some more testing, but I think it's correct.

Does my comment answer your question?

What would you suggest for "easier access to the elements of the array (Vec, shape, strides)"? With the various methods I talked about in the list at the beginning of the comment, I think it's possible to access everything. However, I do think that the .into_raw_vec() method is a little hidden since it's in the docs for Array instead of ArrayBase. It also may not be very obvious that you can specify strides with ::from_shape_vec(). Are the docs the primary issue, or is there something in the interface itself that would be helpful to change?

Edit:

"[Are] there any plans to [provide in-place axis operations]?" At the moment, I don't have any plans to do this myself, but maybe @bluss does. In most cases, I'd rather just create a new array with the correct number of elements (the behavior of the current .sum_axis() method) rather than trying to reuse the existing array and keep around the old elements too. (I suppose you could drop the unused elements in the Vec that are to the back of the last used one, but that only makes a significant difference in special cases.) I can see some value in an in-place .sum_axis(), but it doesn't seem useful enough to add to the public interface at this point, and it's possible for the user to implement themselves in terms of existing methods (e.g. the implementation I provided above). I would like to overhaul the axis operations at some point in other ways (e.g. be able to sum over multiple axes and provide more operations (argmin, argmax, stdev, variance)) but it's not a very high priority for me right now.

@davenza
Copy link
Author

davenza commented Mar 17, 2018

Thanks @jturner314 for you quick and detailed response. I think your solution is pretty good.

A few clarifications:

  • I am a newbie in Rust and ndarray. Read my comments with that in mind.

  • When I said: "Having easier access to the elements of the array (Vec, shape, strides) would be useful.", I was referring to the ability to access to the inner Vec (a &Vec or &mut Vec) without having to destroy the full Array. Also, the ability to modify the shape and strides (getting &mut [Ix]) would be useful. You could say that those functions can be dangerous if not used properly. Well, in that case, I think that the most useful thing to do is mark them as unsafe.

  • Your function solution takes ownership of the array. I was trying to do the sum_axis_inplace() by just taking a &mut. In fact, I found the .into_raw_vec(), but it takes ownership of the Array. The method .remove_axis() also takes ownership of the array. Why I was trying to do that with a &mut?

    • Because the "taking ownership"-way leads to code such as: let summed = arr.sum_axis_inplace(Axis(0)); instead of arr.sum_axis_inplace(Axis(0));
    • I thought: "It should be possible to do" and I just tried hard :).
  • The .shrink_to_fit() method is not a hard requisite, but I thought that it would be cool to free some unused space.

As you were curious about my implementation, I leave it here. Is only defined for dynamic arrays. It uses a bunch of unsafe code and is very, very ugly, trying to workaround the ownership rules that rustc enforces (bad idea, I know). This implementation does not work because it does not modify the shape and strides of the resulting array. However, it locates the results in the first positions of the Vec.

Probably, the worst Rust code ever written:

#[macro_use]
extern crate ndarray;
extern crate num_traits;
extern crate core;

use num_traits::Zero;
use std::ops::{Add, AddAssign};
use std::fmt::Debug;
use ndarray::{Ix, Ixs, ArrayView, ShapeBuilder, Axis, ArrayViewD, ArrayD};

/// Copied from ndarray's dimension/mod.rs
/// Calculate offset from `Ix` stride converting sign properly
#[inline(always)]
fn stride_offset(n: Ix, stride: Ixs) -> isize {
    (n as isize) * (stride as isize)
}


#[derive(Clone, Debug, Eq, PartialEq)]
pub struct TabularValues<V>(pub ArrayD<V>);

impl<V> TabularValues<V>
    where V: Debug + Zero + Clone + AddAssign<V>
{

    /// Returns the shape and stride arrays with the axis removed.
    fn shape_remove_axis(&self, axis: &Axis) -> (Vec<Ix>, Vec<Ix>) {
        let shape = self.0.shape().clone();
        let new_shape = shape.iter()
            .enumerate()
            // Removing the axis
            .filter(|&(index, _)| index != axis.index())
            .map(|(_, &dim)| dim)
            .collect::<Vec<Ix>>();

        let strides = self.0.strides().clone();
        let new_strides = strides.iter()
            .enumerate()
            // Removing the axis
            .filter(|&(index, _)| index != axis.index())
            .map(|(_, &stride)| stride as usize)
            .collect::<Vec<Ix>>();
        (new_shape, new_strides)
    }

    /// Push ArrayViewDs of each index of the given axis to vec_subviews. With the new_shape and
    /// new_strides, we can obtain views of the given axis. This is the same as filling a Vec with
    /// the result of self.0.subview(axis, i); with i from 1 to ax_len.
    /// This avoids rustc complaining again..
    fn get_subviews(&self, axis: Axis, new_shape: Vec<Ix>, new_strides: Vec<Ix>, vec_subviews: &mut Vec<ArrayViewD<V>>) {
        let ax_len = self.0.len_of(axis);
        // We include all the subviews from 1 to ax_len because all these subviews will be summed to the
        // subview with index 0.
        for i in 1..ax_len {
            let off = stride_offset(i, self.0.strides()[axis.index()]);
            let mut ptr = self.0.as_ptr();
            unsafe {
                ptr = ptr.offset(off);
                let subview = ArrayView::from_shape_ptr(new_shape.clone().strides(new_strides.clone()), ptr);
                vec_subviews.push(subview);
            }
        }
    }


    pub fn sum_axis_inplace(&mut self, axis: Axis)
    {
        let ax_len = self.0.len_of(axis);

        let (new_shape, new_strides) = self.shape_remove_axis(&axis);
        // This vector should be initialized here because if it is created by the function get_subviews()
        // there is a rustc error of inmutable/mutable reference to self.
        let mut vec_subviews = Vec::with_capacity(ax_len - 1);
        self.get_subviews(axis, new_shape.clone(), new_strides.clone(), &mut vec_subviews);
        {
            // Do the sum over a view in the axis with index 0.
            let mut subview_mut = self.0.subview_mut(axis, 0);
            vec_subviews.iter()
                .for_each(|sub_view| {
                    subview_mut += sub_view
                });
        }

        let mut ptr = self.0.as_ptr();
        // Relocate the summed elements in the first positions of the vec.
        unsafe {
//             This is the same as self.0.subview(axis, 0);
//            It avoids an inmutable/mutable reference error with self in the next line.
            let subview = ArrayView::from_shape_ptr(new_shape.clone().strides(new_strides.clone()), ptr);
            self.0.iter_mut().zip(subview.iter())
                .for_each(|(mut_ref, subview_value)| {
                    *mut_ref = subview_value.clone()
                });
        }

        // Modify shape and strides
        // ....
    }
}



fn main() {
    let mut arr = TabularValues(array![[0, 1, 2], [3, 4, 5]].into_dyn());
    arr.sum_axis_inplace(Axis(0));
    assert_eq!(arr, TabularValues(array![3, 5, 7].into_dyn()));
    let mut arr = TabularValues(array![[0, 1, 2], [3, 4, 5]].into_dyn());
    arr.sum_axis_inplace(Axis(1));
    assert_eq!(arr, TabularValues(array![3, 12].into_dyn()));

    let mut arr = TabularValues(array![[[0, 1], [2, 3]], [[4, 5], [6, 7]]].into_dyn());

    let mut arr_test = arr.clone();
    arr_test.sum_axis_inplace(Axis(0));
    assert_eq!(
        arr_test,
        TabularValues(array![[4, 6], [8, 10]].into_dyn()),
    );
    let mut arr_test = arr.clone();
    arr_test.sum_axis_inplace((Axis(1)));
    assert_eq!(
        arr_test,
        TabularValues(array![[2, 4], [10, 12]].into_dyn()),
    );
    let mut arr_test = arr.clone();
    arr_test.sum_axis_inplace((Axis(2)));
    assert_eq!(
        arr_test,
        TabularValues(array![[1, 5], [9, 13]].into_dyn()),
    );
}

As offtopic, I found myself also solving the "be able to sum over multiple axes". That was easier, but it would be cool to have it implemented.

@jturner314
Copy link
Member

I am a newbie in Rust and ndarray. Read my comments with that in mind.

Welcome to Rust and ndarray!

I was referring to the ability to access to the inner Vec (a &Vec or &mut Vec) without having to destroy the full Array. Also, the ability to modify the shape and strides (getting &mut [Ix]) would be useful. You could say that those functions can be dangerous if not used properly. Well, in that case, I think that the most useful thing to do is mark them as unsafe.

I understand now. I'm hesitant to provide direct mutable access to those things just because it's so easy to make a mistake, but I suppose we can make this unsafe and just try to document all the ways this could go wrong. @bluss What are your thoughts on providing direct mutable access to these things?

For shape and strides, a safe approach would be to add a method .set_shape_strides() that can safety-check the new shape and strides. (It might be possible to extend .into_shape() to handle this, although it's for a somewhat different use-case. See #390 for a discussion about limitations of .into_shape().)

For the Vec, other than removing elements in use, the most dangerous part about allowing the user to modify the Vec is that some operations on Vec (e.g. .push()) can perform a reallocation that moves the data elsewhere in memory. The issue with this is that ArrayBase has a pointer to the first element of the data. It would be easy for the user to forget to update the pointer after pushing a new element to the Vec (or performing some other operation). I think it makes sense to protect mutable access to the Vec with a context manager (a type that derefs to Vec) or by taking a closure that operates on the Vec, so that after the mutable borrow is done, we can check that the new length of the Vec is sufficient and update the pointer if necessary. I need to think more about this.

Your function solution takes ownership of the array. I was trying to do the sum_axis_inplace() by just taking a &mut. In fact, I found the .into_raw_vec(), but it takes ownership of the Array. The method .remove_axis() also takes ownership of the array.

Oh, okay. Here's a (somewhat hacky but safe) solution using the currently available methods:

#[macro_use]
extern crate ndarray;
extern crate num_traits;

use ndarray::prelude::*;
use num_traits::Zero;
use std::ops::{Add, AddAssign};

/// Sums along the axis without allocating a new array.
///
/// **Panics** if `axis` is out-of-bounds or if `arr.len_of(axis) == 0`.
fn sum_axis_inplace<A>(arr: &mut ArrayD<A>, axis: Axis)
where
    A: Clone + Add<Output = A> + AddAssign + Zero,
{
    {
        // Get mutable views of the first subview along `axis` and the rest of the array.
        let (first, rest) = arr.view_mut().split_at(axis, 1);
        // Remove the extra axis to simplify things a bit.
        let mut first = first.remove_axis(axis);
        // Zip the pieces together and perform the in-place sums, saving the results to `first`.
        azip!(mut first, ref rest (rest.lanes(axis)) in {
            *first += rest.scalar_sum()
        });
    }
    // Take ownership of the array by swapping a temporary into the `arr` reference.
    // (This doesn't require any heap allocations.)
    let owned = ::std::mem::replace(arr, Array::from_vec(Vec::new()).into_dyn());
    // Select the first subview (where the results were stored) and put it into the `arr` reference.
    ::std::mem::replace(arr, owned.into_subview(axis, 0));
}

fn main() {
    let mut arr = array![[0, 1, 2], [3, 4, 5]].into_dyn();
    sum_axis_inplace(&mut arr, Axis(0));
    assert_eq!(arr, array![3, 5, 7].into_dyn());

    let mut arr = array![[0, 1, 2], [3, 4, 5]].into_dyn();
    sum_axis_inplace(&mut arr, Axis(1));
    assert_eq!(arr, array![3, 12].into_dyn());

    let mut arr = array![[[0, 1], [2, 3]], [[4, 5], [6, 7]]].into_dyn();
    sum_axis_inplace(&mut arr, Axis(0));
    assert_eq!(arr, array![[4, 6], [8, 10]].into_dyn());

    let mut arr = array![[[0, 1], [2, 3]], [[4, 5], [6, 7]]].into_dyn();
    sum_axis_inplace(&mut arr, Axis(1));
    assert_eq!(arr, array![[2, 4], [10, 12]].into_dyn());

    let mut arr = array![[[0, 1], [2, 3]], [[4, 5], [6, 7]]].into_dyn();
    sum_axis_inplace(&mut arr, Axis(2));
    assert_eq!(arr, array![[1, 5], [9, 13]].into_dyn());
}

Note that one option is to not remove the extra axis and instead just set the length of that axis to 1 by using .subview_inplace() instead of .into_subview(). (This would work for both fixed-dimensional and dynamic-dimensional arrays.) That's not quite as nice, though.

That said, I do think it would be good to add in-place variants of .remove_axis(), .insert_axis(), etc., to dynamic-dimensional arrays/views. (These variants wouldn't work for fixed-dimension arrays/views because of type changes when adding/removing axes.)

The .shrink_to_fit() method is not a hard requisite, but I thought that it would be cool to free some unused space.

I realized that this problem is actually a lot easier than I initially thought. In fact, I think it would be useful to add a .shrink_to_fit() method to Array that removes unused elements.

If all of the strides are positive and you keep the order of the elements in memory the same, you can just shift the used elements to the front of the Vec by iterating from the front to the back. For example, let's say you have array

[[[a, b],
  [c, d]],
 [[e, f],
  [g, h]]]

with column-major memory layout, so it looks like this in memory:

a, e, c, g, b, f, d, h

Note that the strides are 1, 2, 4. If you want to sum over axis 1, you can store the results in the first subview along axis 1, so after performing the sums you'd have

[[[a + c, b + d],
  [c, d]],
 [[e + g, f +h],
  [g, h]]]

which is

a + c, e + g, c, g, b + d, f + h, d, h

in memory. Now, you just need to shift the sums to the front of the Vec. Performing the swaps one-by-one from front to back, you'd get

a + c, e + g, c, g, b + d, f + h, d, h
a + c, e + g, b + d, g, c, f + h, d, h
a + c, e + g, b + d, f + h, c, g, d, h

Then, you can drop the last four elements and you're done. The resulting array is column-major layout with strides 1, 2.

This appears to be the approach that the "Relocate the summed elements in the first positions of the vec." section in your implementation is generally trying to follow. You do have to be careful with a couple of things, though:

  • Calling self.0.iter_mut() does not iterate over the elements in memory order; it iterates in logical order, which isn't what you want in this case. In this case, the simplest approach is probably just to take a pointer to the first element (with .as_mut_ptr()) and then .offset() it by 1 more each time.

  • When shrinking the array, the strides must be updated correctly. It's not correct to simply remove the stride corresponding to the removed axis; some of the other strides become smaller because elements were removed. (In the example, axis 1 was removed, but the last stride also changed from 4 to 2.)

How do you determine strides for the shrunken array that maintain the memory order of the original array, particularly when the original array can have weird custom strides? If all of the strides are positive, I think this will work:

  1. Set the element in the shape corresponding to the axis being summed over to 1.
  2. Determine an "index array" that sorts the strides of the original array in descending order. (Do this by sorting 0..n, using the strides as keys in descending order.)
  3. Reorder the shape according to the index array from step 2 (equivalent to sorting the shape, using the strides as the keys in descending order).
  4. Compute the strides for the shape resulting from step 3, assuming standard (row-major) layout.
  5. Using the index array from step 2, "unsort" the computed strides so that they're in the order that the strides were in the original array.
  6. Remove the stride corresponding to the axis being summed over.

For example, let's say the shape was 2, 3, 4, the strides were 1, 2, 6, and you're removing axis 1. The results of the steps would be:

  1. shape [2, 1, 4]
  2. index array [2, 1, 0]
  3. shape [4, 1, 2]
  4. strides [2, 2, 1]
  5. strides [1, 2, 2]
  6. strides [1, 2]

I don't think this will work unchanged for negative strides; they require some more thought. However, I think it will work for weird custom strides as long as they're all positive.

Since you're new to Rust and ndarray, I thought a few tips related to your code might be useful. (If this isn't helpful, please disregard.):

  • The easiest way I've found to remove an element when copying a slice, instead of using .iter()/.enumerate()/.filter()/.collect(), is to use two calls to .copy_from_slice() or .clone_from_slice(). (Copy the elements before the one you want to remove, then copy the elements after the one you want to remove.)

  • You're probably aware of this, but as an alternative to a TabularValues wrapper for ArrayD, it's also possible to create your own trait and implement it for ArrayD (so that the array doesn't have to be wrapped before calling the custom method). (In some cases, a wrapper type is the better approach, but in other cases it's nice to be able to avoid a wrapper.)

  • One thing I didn't realize when I was learning Rust is that it's undefined behavior to have mutable and immutable references to the same value, even in unsafe code. (You can have *const and *mut pointers to the same value, but not & and &mut references.) ndarray makes the assumption that ArrayView and ArrayViewMut behave like & and &mut, not *const and *mut. You may be interested in the discussion in Add multislice! macro #388. (My initial implementation for Add multislice! macro #388 had this issue, but we're working on adding pointer views (which would correspond to *const and *mut) to resolve this.)

Anyway, in summary to all this:

  • I think the code I provided above will work for your use-case, although it doesn't move the sums to the front of the Vec and drop the unused elements.
  • It would be good to add a .set_shape_strides() method to ArrayBase or extend .into_shape() to handle this.
  • It's worth some thought about if we want to provide mutable access to the underlying Vec, and if so, how to design the interface.
  • It would be good to add in-place variants of .remove_axis(), .insert_axis(), .into_subview(), and .slice_move() for dynamic-dimensional arrays.
  • It would be good to add a .shrink_to_fit() method to Array that shifts all of the used elements to the front of the Vec and drops the unused elements.

Since we've already had a lot of discussion in this thread, I'd like to create separate issues for the API changes. Does the code I provided resolve your use-case for this issue? What do you think of the API changes I listed?

@davenza
Copy link
Author

davenza commented Mar 19, 2018

Thank you very much for you kindly support and comments.

I implemented your version without further problems (nice trick using mem::replace() to take ownership). I don't shrink the vector to avoid dealing with unsafe code and pointer arithmetics. It would be nice to have it implemented in ndarray directly, so you don't have to deal with the movement of the results at the start of the vector and the computation of shape/strides. If it is implemented in the future, I certainly will use it.

With the rest of the comments, your suggestions seems nice to me.

  • The in-place versions of some functions can be certainly more ergonomic (especially if you need mem::replace to take ownership).
  • You made a good observation about the problems with the access to the inner Vec. Indeed, more thought is needed. I think that I'm not ready to suggest any design. However, with respect to the documentation, I think that the functions that access to the inner elements of the Array should be included in a section like "Advanced methods", as the basic user should not use those functions if they are not needed (also, marking them unsafe will make some people think twice before using them).

@jturner314
Copy link
Member

Okay, it sounds like this issue is resolved, so I'm closing it. I've created new issues for the potential API changes.

@davenza
Copy link
Author

davenza commented Mar 21, 2018

The issue is closed, but I thought it would be interesting to share some benchmark results comparing the sum_axis/sum_axis_inplace functions. The sum_axis_inplace is implemented exactly as in (#425 (comment)). The benchmark is executed with different array sizes (always dynamic arrays) and f64. Each array has a dimensionality of n and each dimension has always a cardinality of 2, so there is 2^n elements.

The results:

n sum_axis sum_axis_inplace Factor
2 641.41 ns 564.42 ns 0.88
4 1.4430 us 907.83 us 0.629
6 4.0597 us 2.6401 us 0.65
8 11.308 us 5.6031 us 0.495
10 37.226 us 17.379 us 0.467
12 246.17 us 116.94 us 0.475
14 709.73 us 301.43 us 0.425

We can see an important performance improvement, especially in large arrays. I think that in those instances, the heap allocations dominate the computation, so the larger the array, the larger the improvements.

@bluss
Copy link
Member

bluss commented Mar 22, 2018

(Hi!) I just want to chime in and say that we don't want to expose something that locks us in to using a Vec for holding the array elements. That we can convert to/from a Vec is ok, but not that we show a Vec behind a reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants