Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive cloning in Stream::fold #105

Open
wolfiestyle opened this issue Feb 7, 2017 · 8 comments
Open

Excessive cloning in Stream::fold #105

wolfiestyle opened this issue Feb 7, 2017 · 8 comments

Comments

@wolfiestyle
Copy link
Contributor

Tried to implement something that accumulates all the stream values into a Vec using Stream::fold, but found that it clones the accumulator values multiple times during it's operation. Example code:

extern crate carboxyl;
use carboxyl::Sink;
use std::fmt::Debug;

#[derive(Debug)]
struct Storage<T>
{
    vec: Vec<T>,
}

impl<T> Storage<T>
{
    fn new() -> Self
    {
        Storage{ vec: Vec::new() }
    }

    fn push(mut self, item: T) -> Self
    {
        self.vec.push(item);
        self
    }
}

impl<T: Clone + Debug> Clone for Storage<T>
{
    fn clone(&self) -> Self
    {
        println!("storage cloned! {:?}", self.vec);
        Storage{ vec: self.vec.clone() }
    }
}

fn main()
{
    let sink = Sink::new();
    let signal = sink.stream().fold(Storage::new(), Storage::push);

    sink.send(11);
    sink.send(22);
    sink.send(33);

    println!("result: {:?}", signal.sample());
}

output:

storage cloned! []
storage cloned! []
storage cloned! [11]
storage cloned! [11]
storage cloned! [11]
storage cloned! [11, 22]
storage cloned! [11, 22]
storage cloned! [11, 22]
storage cloned! [11, 22, 33]
storage cloned! [11, 22, 33]
storage cloned! [11, 22, 33]
result: Storage { vec: [11, 22, 33] }

Don't know if I'm using it wrong, but this seems pretty inefficient. A fold operation shouldn't require cloning the accumulator.

@wolfiestyle
Copy link
Contributor Author

wolfiestyle commented Feb 11, 2017

could remove one clone here: wolfiestyle@d18ed3a

But it seems that the problem comes from sampling (used in the implementation of fold). That causes extra clones that can't be easily avoided.

@Moredread
Copy link
Contributor

@aepsil0n do you see a usecase where the accumulator is a costly object to clone, or should that be avoided anyways?

@milibopp
Copy link
Owner

First of all, thanks for looking into this @Darkstalker. I would definitely accept your commit as a pull request. It would be nice to add a minimal test case along the lines of your example to avoid future regressions.

The implementation of fold is purely semantic, I have not put any effort to optimize it. That goes for a large part of the functionality. In this particular case my prime suspect would be this line, which would be hard to get rid of. But I'd have to debug this to know for sure.

It is certainly necessary to clone once at the end for sampling. The rest might be avoidable in theory. As it is, carboxyl is not provide a zero-cost abstraction for event handling. Reference counting, cloning, dynamic dispatch, etc. are still an issue. I am just not sure, how easy this is to fix without a complete rewrite.

@Moredread you are right to some extent. There is something to be said about data structures. Functional programming in the style promoted by this library works best with cheaply clonable purely functional data structures (see Okazaki, 1996). However, it would be nice not to be wasteful. ;)

@wolfiestyle
Copy link
Contributor Author

made the pull request, but couldn't figure out a reliable way to test it

@milibopp
Copy link
Owner

fine by me for now… I'll leave this issue open nonetheless, also because there is more to it.

@wolfiestyle
Copy link
Contributor Author

wolfiestyle commented Mar 3, 2017

Did some research about this, and I think I found a solution: passing the value around as Cow<T> and decide at runtime when to pass the value as ref or owned. This way we can clone only when it's really needed. But it causes significant changes to the API. More details here:

https://play.rust-lang.org/?gist=7eee873a19bdbf32122843f9e0d4f133&version=stable&backtrace=0

@milibopp
Copy link
Owner

milibopp commented Mar 6, 2017

Interesting… this might take some work to update the code accordingly.

@wolfiestyle
Copy link
Contributor Author

Made a repository with all my work here: https://github.com/darkstalker/frappe
It has most of the carboxyl functionality except threading (don't need it). I'll probably publish this as an alternative, since it has different goals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants