Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow users to opt out of the NetworkBehaviourEventProcess mechanism #1630

Merged
merged 4 commits into from
Jul 8, 2020

Conversation

thomaseizinger
Copy link
Contributor

Putting this up as a PoC to spark discussion. Happy to change the name of the attribute if someone can come up with something better 😅

The idea is the following:

Previously, we used the libp2p::Swarm by putting all dependencies that need to react to events from the network into the top-level NetworkBehaviour and put the relevant code into the inject_event callbacks.

I've always found this a bit awkward. Having to put things into the NetworkBehaviour even though they are not really part of the "network stack" of an application doesn't feel "right".
So I've been thinking about an alternative approach and came up with this:

Basically, the Swarm is part of a bigger loop and events are bubbled all the way and distributed to other components in the application.

async fn main() {
	let mut swarm = unimplemented!("build swarm from network behaviour");
	let mut database = unimplemented!("construct database");	

	loop {
		match swarm.next().await {
			BehaviourOutEvent::Ping(message) => {
				database.save_message(message);
			}
		}
	}
}

If the main components of the application are structured around such a polling mechanism, this works really nicely. All components can just be mutable and distribute events to each other without any further locking mechanism.

This PR extends the custom derive to make this usage easier by not requiring the implementation of NetworkBehaviourEventProcess but instead, converts every event into the specificed OutEvent and bubbles it up all the way.
This is already possible by simply saving all events inside the inject_event function and having a custom poll function that emits them again. This PR just makes this pattern easier to use.

Thoughts?
Is there a reason I am missing that makes this usage of rust-libp2p a really bad idea?

The only thing I could think of is starvation of the loop if there is little network activity but I think that can be solved in an acceptable way by using a timeout on the swarm.next() future.

@koivunej
Copy link
Contributor

This matches what I've been thinking about for rust-ipfs quite well. I was planning to try just implementing NetworkBehaviour directly but of course it would nicer if this way was supported, perhaps as an off-by-default option in the derive?

The only thing I could think of is starvation of the loop if there is little network activity but I think that can be solved in an acceptable way by using a timeout on the swarm.next() future.

Related to this we have a ... bit ... ugly way of looping around polling everything, including the swarm in non-async context. If one were to use async fn instead, I think this could be worked around with tokio::select!, StreamExt::select_next_some and so on? This was attempted quite a long time ago, can't remember what all issues there were.

@thomaseizinger
Copy link
Contributor Author

The problem we ran into with select! is that you get mutability borrow checker issues if you want to use the Swarm in any other select! branch because the future returned by next still holds a mutable borrow of the Swarm.

I am not sure what a really good solution to this looks like.

One I was thinking of is:

Select all the futures in the loop with the select_all function and drop all the non-completed ones after one resolved. This would solve the mutability borrow checker errors but only works if the futures are safe to drop without loosing progress.
For the swarm, that is - I think - true because it just polls the swarm internally. Actually, the fact that the mutable reference is still active shows that the underlying struct is actively modified and the future doesn't hold any important state internally.
All the other components you are waiting on in the loop need to satisfy this constraint as well, then you should be fine.

@tomaka
Copy link
Member

tomaka commented Jun 29, 2020

The problem we ran into with select! is that you get mutability borrow checker issues if you want to use the Swarm in any other select! branch because the future returned by next still holds a mutable borrow of the Swarm.

I don't know about tokio::select!, but in futures::select! that has now been fixed!
I've been using select! quite successfully for this exact situation in other projects.

Copy link
Member

@tomaka tomaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good.
I'm not a fan of the attribute name "bubble_up_events", but I can't come up with anything better.

@thomaseizinger
Copy link
Contributor Author

The problem we ran into with select! is that you get mutability borrow checker issues if you want to use the Swarm in any other select! branch because the future returned by next still holds a mutable borrow of the Swarm.

I don't know about tokio::select!, but in futures::select! that has now been fixed!
I've been using select! quite successfully for this exact situation in other projects.

Maybe I misunderstood the problem then, I thought that could be design not be fixable 🤔

We had something along these lines:

let mut swarm = todo!();
let mut some_other_component = todo!();

loop {
    futures::select! {
        // swarm.next requires &mut self
        swarm_event = swarm.next() => {
            dbg!(swarm_event);
        }
        other_event = some_other_component.next() => {
            // `react_to_event` also requires &mut self
            swarm.react_to_event(other_event) // mutability lifetime error here because of 2nd mutable binding to `swarm`
        }
    }
}

My understanding is, that you can't fix that unless you drop the future returned from swarm.next().

I've been using select! quite successfully for this exact situation in other projects.

I am very eager to learn more about this! Can you point to some code please?

I'm not a fan of the attribute name "bubble_up_events", but I can't come up with anything better.

I will try and think of something better!
I had put this together quickly as a PoC so I didn't want to waste time on naming things :D

@tomaka
Copy link
Member

tomaka commented Jun 30, 2020

My understanding is, that you can't fix that unless you drop the future returned from swarm.next().

That is true, but the macro does make sure that this future gets dropped if you enter the other_event = some_other_component.next() block.

@thomaseizinger
Copy link
Contributor Author

My understanding is, that you can't fix that unless you drop the future returned from swarm.next().

That is true, but the macro does make sure that this future gets dropped if you enter the other_event = some_other_component.next() block.

That is interesting, I will have to play around with this a bit more then to see if I can get it working. Thanks!

By the way, I've thought of some different names for the flag introduced in this PR:

  • forward_events = true: Because we are forwarding events to the owner of the swarm.
  • event_process = false: Because we are disabling the local event processing.

We could also introduce an enum parameter "events" (or "handle_events"):

#[behaviour(events = "process_locally")
#[behaviour(events = "forward_to_swarm")

@thomaseizinger thomaseizinger changed the title Allow derived NetworkBehaviour to bubble up events for consumption through swarm.next() Allow users to opt out of the NetworkBehaviourEventProcess mechanism Jul 2, 2020
@thomaseizinger thomaseizinger marked this pull request as ready for review July 2, 2020 02:02
@thomaseizinger
Copy link
Contributor Author

@tomaka I rolled with #[behaviour(event_process = false)] now and also had a crack at updating the documentation accordingly.

I tried to correctly set inter-doc links and the only way I got it working was through the trait.XYZ.html notation for both, the libp2p crate and the libp2p_swarm crate.
I hope that is fine :)

swarm/CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Contributor

@romanb romanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ad intra-doc-links: Is the problem simply that core-derive has no dependency on libp2p-swarm? If so, am I the only one thinking that even if it is technically not necessary to have a libp2p-swarm dependency in order for it to compile, core-derive should declare one since it provides a macro for libp2p-swarm and depends on a specific API version of that crate? I would then hope that intra-rustdoc-links work and also the included test(s) run against the version specified as a dependency and so check for compatibility.

Ad bubbling up events: Another attribute name suggestion would be event_delegate, defaulting to false. Essentially !delegate = process. I have no strong preferences there though.

@thomaseizinger
Copy link
Contributor Author

Ad intra-doc-links: Is the problem simply that core-derive has no dependency on libp2p-swarm? If so, am I the only one thinking that even if it is technically not necessary to have a libp2p-swarm dependency in order for it to compile, core-derive should declare one since it provides a macro for libp2p-swarm and depends on a specific API version of that crate? I would then hope that intra-rustdoc-links work and also the included test(s) run against the version specified as a dependency and so check for compatibility.

Not quite. The problem I was referring to was that I couldn't make intra-doc links in the form of "crate::NetworkBehaviour" work for both, the libp2p-swarm crate and the libp2p crate.
The problem was that because libp2p-swarm is re-exported in libp2p, there wouldn't be one valid identifier for the types (at least I couldn't make it work).
Using links to the html files works in both cases, regardless of which doc you are browsing :)

Ad bubbling up events: Another attribute name suggestion would be event_delegate, defaulting to false. Essentially !delegate = process. I have no strong preferences there though.

I am not attached to event_process. It does align nicely with the name of the NetworkBehaviourEventProcess trait though :)

@romanb
Copy link
Contributor

romanb commented Jul 6, 2020

Not quite. The problem I was referring to was that I couldn't make intra-doc links in the form of "crate::NetworkBehaviour" work for both, the libp2p-swarm crate and the libp2p crate. [..]

That sounds a bit like rust-lang/rust#65983 (context rust-lang/rust#43466). I personally would prefer we stick to intra-rustdoc-links, even if there are still issues, to avoid mixing different styles of links in this project and because with the other type of links we never get any warning about dead links.

@thomaseizinger
Copy link
Contributor Author

Not quite. The problem I was referring to was that I couldn't make intra-doc links in the form of "crate::NetworkBehaviour" work for both, the libp2p-swarm crate and the libp2p crate. [..]

That sounds a bit like rust-lang/rust#65983 (context rust-lang/rust#43466). I personally would prefer we stick to intra-rustdoc-links, even if there are still issues, to avoid mixing different styles of links in this project and because with the other type of links we never get any warning about dead links.

I am happy to change them. In which docs should they work, libp2p or libp2p-swarm?

@romanb
Copy link
Contributor

romanb commented Jul 7, 2020

Not quite. The problem I was referring to was that I couldn't make intra-doc links in the form of "crate::NetworkBehaviour" work for both, the libp2p-swarm crate and the libp2p crate. [..]

That sounds a bit like rust-lang/rust#65983 (context rust-lang/rust#43466). I personally would prefer we stick to intra-rustdoc-links, even if there are still issues, to avoid mixing different styles of links in this project and because with the other type of links we never get any warning about dead links.

I am happy to change them. In which docs should they work, libp2p or libp2p-swarm?

If rust-lang/rust#65983 is the cause, then libp2p-swarm, since the other will then eventually be fixed.

@thomaseizinger
Copy link
Contributor Author

@romanb Updated! All the docs I touched use intra-doc links now :)

@romanb
Copy link
Contributor

romanb commented Jul 8, 2020

Since this seems to be a backward-compatible addition, I'm inclined to cut a release of libp2p-core-derive upon merge. Let me know if there are objections.

@romanb romanb merged commit c4a5497 into libp2p:master Jul 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants