
First-class technique for splitting input streams or absorbing input events #3570

alice-i-cecile opened this issue Jan 6, 2022 · 16 comments
Labels: A-Input (Player input via keyboard, mouse, gamepad, and more), A-UI (Graphical user interfaces, styles, layouts, and widgets), C-Feature (A new feature, making something new possible), S-Needs-Design-Doc (This issue or PR is particularly complex, and needs an approved design doc before it can be merged)


@alice-i-cecile
Member

alice-i-cecile commented Jan 6, 2022

What problem does this solve or what need does it fill?

Typically, when users click on a UI element (including on in-game units), we don't want that click to also be handled by the rest of the world.

The same pattern shows up when typing into a chat box, where keyboard input should be captured by the text field rather than reaching gameplay systems.

Right now, the default behavior is to pass all events through.

What solution would you like?

A few solutions immediately come to mind:

  1. Centralize this input stream splitting technique into a single standard mega-system that can be customized using function pointers.
  2. Assign ownership of input stream events by storing some form of owner (entities? labels?) on the input events as an optional field.
  3. Assign ownership of input events using entity-specific events (Suggestion: Entity Events #2070, Per entity Events #2116).
  4. Add a split method to ResMut<Events<T>>, which returns two iterators of events based on whether the supplied predicate is true. This doesn't make the chaos any better, but does improve ergonomics.
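
As a rough illustration of option 4, here is a minimal sketch of what predicate-based splitting could look like, assuming a recent Bevy. Events has no split method today; the sketch approximates it with the existing drain plus Iterator::partition, and the UiState resource is purely hypothetical.

use bevy::input::mouse::MouseButtonInput;
use bevy::prelude::*;

// Hypothetical resource tracking whether the pointer is currently over UI.
#[derive(Resource, Default)]
struct UiState {
    pointer_over_ui: bool,
}

// Approximates the proposed `split` with `drain` + `partition`; a real
// predicate would likely inspect the event itself as well.
fn route_clicks(mut clicks: ResMut<Events<MouseButtonInput>>, ui_state: Res<UiState>) {
    let (_ui_clicks, _world_clicks): (Vec<_>, Vec<_>) = clicks
        .drain()
        .partition(|_event| ui_state.pointer_over_ui);
    // UI systems would consume `_ui_clicks`; gameplay systems `_world_clicks`.
}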

What alternative(s) have you considered?

Consume events in one-off systems using ResMut<Events<Input<T>>>.
This was the approach taken in vladbat00/bevy_egui#50, specifically vladbat00/bevy_egui@53c1773#diff-420f4e08a25299e0e1fb09f4757e1d5d027a2278f37030c09ab9c06786cfa52eR283.

However, this is not a particularly principled or discoverable approach, and it risks serious chaos as input events are silently eaten according to decentralized rules that depend on system ordering.

Additional context

Initially discussed on Discord. Led to #3569. Related Reddit post with proposed workarounds.

@alice-i-cecile added the C-Feature, A-Input, A-UI, and S-Needs-Design-Doc labels on Jan 6, 2022
@rezural
Contributor

rezural commented Jan 6, 2022

Here are a few of the use cases I can see involved, without too much of an opinion on how this could be solved.

There are a few cases that complicate how events may want to be handled:

  • PointerHidden: the mouse pointer is hidden, and the App probably wants exclusive access to input events
  • PointerVisible: the mouse pointer is showing, and users can indicate that they want to interact exclusively with a Plugin based on a couple of schemes:
    • PointerLocation: are we hovering over a plugin's UI region?
    • PointerClicked: the user has clicked within a region, indicating they want to interact with this region's Plugin event code
      • Would need some way of indicating (via a predicate?) that the user has 'escaped' from exclusive interaction (e.g. the escape key, moving out of the region, or clicking outside of it)
    • Via a KeyCombo
  • KeyCombo: the user has entered a key combo indicating they want to focus on a particular Plugin

Possible high-level solutions:

Event consumption registration:

  • A notion of different crates/plugins registering to possibly consume events:
    • The App itself
    • Each relevant Plugin

It is probably most flexible for the App to choose the event splitting.
Bevy could provide boilerplate event-splitting strategies, e.g.:

  • PointerHidden: the pointer is hidden; default all events to the App
  • PointerVisible: the pointer is visible; either:
    • the PointerLocation or PointerClicked strategy defines when a Plugin gets access via the pointer's location
  • KeyCombo: events are forwarded to a particular Plugin until the user indicates they want to escape this mode (ESC, a PointerClicked outside of the region, etc.)

with Apps able to provide their own event-splitting behaviors.

Plugins could notify Bevy that they may want exclusive access to events based on predicates (is the pointer over a region, was a key combo just pressed, pointer + click).
This would seem brittle, in that Plugins would be deciding UI workflow that should be within the purview of the App (i.e. hover-based, etc.).
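
A very rough sketch of what event-consumption registration could look like, assuming a recent Bevy; the EventClaims type and its methods are invented for illustration and are not an existing Bevy API.

use bevy::input::mouse::MouseButtonInput;
use bevy::prelude::*;

// Each plugin registers a predicate saying whether it wants to consume an
// event; the App controls the registration order and therefore the splitting.
type ClaimFn = fn(&MouseButtonInput) -> bool;

#[derive(Resource, Default)]
struct EventClaims {
    claims: Vec<(&'static str, ClaimFn)>,
}

impl EventClaims {
    fn register(&mut self, plugin: &'static str, claim: ClaimFn) {
        self.claims.push((plugin, claim));
    }

    // Returns the first registered plugin whose predicate matches, if any.
    fn owner_of(&self, event: &MouseButtonInput) -> Option<&'static str> {
        self.claims
            .iter()
            .find(|(_, claim)| claim(event))
            .map(|(name, _)| *name)
    }
}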

@mockersf
Member

mockersf commented Jan 6, 2022

I think a plugin should not remove info from Bevy's built-in Events<T> or Input<T>. I may want to capture mouse clicks even on top of the egui UI because I have a shader that displays a ripple effect on clicks. I may want to capture key events because I want to react to a shortcut even while the player is typing in a text field.

With only one plugin that wants to filter events, I originally thought it should create a new event type and re-send the original Bevy events, mapping them to its own type and filtering out the ones it already dealt with, but that doesn't scale once more than one plugin wants to capture events.

For that case, I'm now thinking it could be interesting to be able to attach tags to events, and then filter based on those tags, so that you could ask for Events<MouseButton, (Without<Egui>, Without<OtherThing>)>, which starts looking like the Query API...
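
To make that concrete, here is one way tag-based filtering could be approximated, assuming a recent Bevy; the Events<MouseButton, (Without<Egui>, ...)> syntax itself does not exist, and TaggedClick is an invented wrapper type.

use bevy::prelude::*;
use std::collections::HashSet;

// Invented wrapper: the original event plus the set of plugins that tagged it.
#[derive(Event)]
struct TaggedClick {
    button: MouseButton,
    tags: HashSet<&'static str>,
}

// Game-world systems skip clicks already tagged by egui or another plugin.
fn world_clicks(mut events: EventReader<TaggedClick>) {
    for click in events.read() {
        if click.tags.contains("egui") {
            continue;
        }
        let _button = click.button;
        // ... handle the click in the game world
    }
}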

@rezural
Contributor

rezural commented Jan 6, 2022

Yes I agree.

I hope that this serves to illustrate the point that there are currently no relevant abstractions that relieve the need for such egregious hacks.

@HackerFoo
Contributor

HackerFoo commented Jan 6, 2022

Here's my suggestion:

  • Leave input events as they are.
  • Add a new resource to track focus.
  • Each target can claim focus with something like focus.claim(layer, tick), where a higher tick, or the same tick and a higher layer take precedence. tick increases with each input event.
  • Systems can ignore events when they don't have focus, if it makes sense.
  • A new event could be added that includes the owner entity and focus change events, but this would have to be delayed until after the claiming phase.

I use a more sophisticated system that uses ray casts to assign focus, I've approximated this here with layer. My system also tracks the motion of multiple touch points simultaneously, and so must reset the motion vectors on ownership changes to keep a common frame of reference for each entity. A similar problem may need to be solved when implementing something like drag-and-drop.

Without such a system, gestures would be very hard to implement. Even corner cases would cause trouble, like clicking a button, moving off the button and back (or not), then releasing. So maybe it makes sense to implement a higher level input processing system as I have done, as a separate plugin.

Also, multi-touch and mouse input are quite different, so each might use a different plugin.
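
A minimal sketch of the focus-claiming resource described above, tracking a single focus owner for simplicity and assuming a recent Bevy; the FocusClaims name and everything beyond the tick/layer tie-breaking rule are assumptions.

use bevy::prelude::*;

#[derive(Resource, Default)]
struct FocusClaims {
    owner: Option<Entity>,
    layer: u32,
    tick: u64,
}

impl FocusClaims {
    // A claim wins if it is newer (higher tick), or equally new on a higher layer.
    fn claim(&mut self, entity: Entity, layer: u32, tick: u64) {
        let wins = tick > self.tick || (tick == self.tick && layer > self.layer);
        if wins || self.owner.is_none() {
            self.owner = Some(entity);
            self.layer = layer;
            self.tick = tick;
        }
    }

    fn has_focus(&self, entity: Entity) -> bool {
        self.owner == Some(entity)
    }
}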

@alice-i-cecile
Member Author

Add a new resource to track focus.

Okay, interesting. So you'd have something like:

struct CurrentlyFocused(Option<Box<dyn Focusable>>);

This would be extremely useful for bevyengine/rfcs#41

IMO you have to use a trait object there: enums are the natural fit but they're not composable.

Do we ever want to have multiple objects / areas focused at once? No, not really. And if users need this, they can make their own weird struct to represent hybrid states.

@PaperCow
Contributor

PaperCow commented Jan 6, 2022

For that case, I'm now thinking it could be interesting to be able to attach tags to events, and then filter based on those tags, so that you could ask for Events<MouseButton, (Without<Egui>, Without<OtherThing>)>, which starts looking like the Query API...

This is an interesting thought. There have been a number of issues and discussions around events that an API like this could address. In my own project I've been working around a situation similar to #1431 that this would handle perfectly. It seems to me this could also cover the same situations as things like #1626 in a very "bevy way".

@HackerFoo
Contributor

Do we ever want to have multiple objects / areas focused at once? No, not really. And if users need this, they can make their own weird struct to represent hybrid states.

The input to my app is multi-touch, so it can assign multiple contact points to multiple objects. For single point input, it might make sense to store all layers to form a hierarchy. For example, clicking a text box might also highlight the box enclosing (in a layer behind) the text box. So you could have:

type Layer = usize;
struct CurrentlyFocused(Vec<(Layer, Box<dyn Focusable>)>);

and keep it sorted by Layer, although I'd use an Entity:

struct CurrentlyFocused(Vec<(Layer, Entity)>);

Claiming would clear entries with the same or higher layer and a lower tick, replacing that part of the hierarchy. No action should be taken based on CurrentlyFocused until all systems have had a chance to claim focus, so these might have to run in a different stage.
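
A minimal sketch of that claiming rule, assuming a recent Bevy; the resource is renamed FocusHierarchy here to avoid redefining CurrentlyFocused, and the per-entry tick is added only so the rule can be expressed.

use bevy::prelude::*;

type Layer = usize;

// Focus hierarchy kept sorted by layer; each entry remembers when it was claimed.
#[derive(Resource, Default)]
struct FocusHierarchy(Vec<(Layer, Entity, u64)>);

impl FocusHierarchy {
    fn claim(&mut self, layer: Layer, entity: Entity, tick: u64) {
        // Clear entries at the same or higher layer that were claimed earlier,
        // replacing that part of the hierarchy.
        self.0.retain(|&(l, _, t)| l < layer || t >= tick);
        self.0.push((layer, entity, tick));
        self.0.sort_by_key(|&(l, _, _)| l);
    }
}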

@nicopap
Contributor

nicopap commented Jan 24, 2022

I find @HackerFoo's suggestion very interesting. I actually came up with a similar system independently in ui-navigation (although I haven't implemented it yet). What you call claim I call lock here, and instead of a method on a resource, I have a behavior on a specific entity. To "unlock" you would need to send a NavRequest::Free event. See: nicopap/ui-navigation#7

There is no real concept of ownership; the locking mechanism is basically just a way to toggle off input for the ui-nav system. Without such an escape hatch, the user would need to account for my plugin in their own input management systems, which is not ideal.

For ui-nav, focus claim contention is irrelevant, because it imposes a focus hierarchy and is the sole source of truth for what is focused. This also means the tick and layer parameters are irrelevant, since the ui-nav system knows the hierarchy (see bevyengine/rfcs#41 for details).

Beyond that, there is no motion-vector clipping or coordinate-system adjustment.

@Rust-Ninja-Sabi

I ran into this using egui and mouse events in my game. Is there any solution now?

Thanks.

Best regards Sabi

@frederickjjoubert

Thank you for creating this issue and for the discussion above; I'm eager to see this added to Bevy in the future. Dealing with mouse clicks (which need to ray cast into the game world) going through the UI has been a big pain point for me so far.

@hxYuki
Contributor

hxYuki commented May 16, 2023

IMO it's more of a UI thing. In most cases, having .is_pointer_over_ui and .is_textbox_focused is enough; you just need to check these two values in your systems. I use this in my code; it would be better to have it integrated (along with input support).

/// Tracks whether the pointer is currently over any UI node.
#[derive(Resource, Default)]
pub struct UiHandling {
    pub is_pointer_over_ui: bool,
}

/// Marker for UI nodes that should not capture the pointer.
#[derive(Component)]
pub struct NoPointerCapture;

/// Recomputes the flag every frame from all capturing UI nodes.
/// (No `Changed<Interaction>` filter: with it, the flag would reset to false
/// on frames where no interaction changed, even while still hovering UI.)
fn check_ui_interaction(
    mut ui_handling: ResMut<UiHandling>,
    interaction_query: Query<&Interaction, (With<Node>, Without<NoPointerCapture>)>,
) {
    ui_handling.is_pointer_over_ui = interaction_query
        .iter()
        .any(|i| matches!(i, Interaction::Clicked | Interaction::Hovered));
}

Or, going further, let systems know whether an event has already been read (read events when the UI is clicked or typed into). For example, record an event as read when it is consumed and clear that record on update; this should be easier than synchronizing EventReaders.
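
A hedged sketch of that record-on-read, clear-on-update idea; none of these names exist in Bevy (and Input<KeyCode> was later renamed ButtonInput<KeyCode>).

use bevy::prelude::*;

// Invented resource recording that the UI consumed keyboard input this frame.
#[derive(Resource, Default)]
struct InputConsumed {
    keyboard: bool,
}

// UI text-input system: when a textbox has focus and handled typing this frame,
// mark keyboard input as consumed. The local bool stands in for real focus state.
fn ui_text_input(mut consumed: ResMut<InputConsumed>, keys: Res<Input<KeyCode>>) {
    let textbox_has_focus = true; // placeholder
    if textbox_has_focus && keys.get_just_pressed().next().is_some() {
        consumed.keyboard = true;
    }
}

// Gameplay system: skip keyboard handling when the UI already consumed it.
fn gameplay_hotkeys(consumed: Res<InputConsumed>, keys: Res<Input<KeyCode>>) {
    if consumed.keyboard {
        return;
    }
    if keys.just_pressed(KeyCode::Space) {
        // jump, fire, etc.
    }
}

// Runs first each frame to clear the record.
fn clear_consumed(mut consumed: ResMut<InputConsumed>) {
    consumed.keyboard = false;
}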

@ivakam

ivakam commented Sep 5, 2023

IMO it's more of a UI thing. In most cases, having .is_pointer_over_ui and .is_textbox_focused is enough; you just need to check these two values in your systems. I use this in my code; it would be better to have it integrated (along with input support). [...]

Or, going further, let systems know whether an event has already been read (read events when the UI is clicked or typed into). For example, record an event as read when it is consumed and clear that record on update; this should be easier than synchronizing EventReaders.

Correct me if I'm wrong, but this seemingly does nothing to address the fact that interactions aren't absorbed by the top-level element (and there's no native way of achieving that cleanly), which is what the discussion centers on. With your snippet, multiple elements on top of each other could all count towards is_pointer_over_ui at once, even though some of them are hidden.

@hxYuki
Contributor

hxYuki commented Sep 5, 2023

Correct me if I'm wrong, but this seemingly does nothing to address the fact that interactions aren't absorbed by the top-level element (and there's no native way of achieving that cleanly), which is what the discussion centers on. With your snippet, multiple elements on top of each other could all count towards is_pointer_over_ui at once, even though some of them are hidden.

Well, with this snippet you can tell whether an area is covered by UI; you still have to manually skip that input in lower-level logic if you want to. It also works for invisible nodes, since their Interaction will always be None.
Usually you just want to distinguish between the UI and the game world; the UI already handles that well internally, and for the game world you may prefer raycasting. But there are corner cases. It would be good to have better ways of telling the "levels" apart, and it's reasonable to handle the rest on your own.

@alice-i-cecile
Member Author

@aevyrie @alice 🌹 Thinking about how to use bevy_eventlistener and leafwing input manager together. Example use case: an editor with a game preview window. Left and right arrow keys rotate the camera on the 3d view, unless a text input field has focus, in which case the arrow keys move the text cursor. In JavaScript the way this would be handled is by making the input manager a top-level listener on the document. The text input widget would call stop propagation on key events, so those never make it up to the root. That is, keyboard events start out from the current Focus element, bubble upward, and if they make it all the way to the root they get fed into the global input system.

@viridia
Contributor

viridia commented Mar 20, 2024

What I want to see is an ecosystem in which frameworks like Leafwing Input Manager (LWIM) and bevy_eventlistener (BEL) can coexist. We are already part of the way there in that both of these frameworks support various kinds of contextual routing:

  • LWIM supports routing based on the current game state.
  • BEL supports routing based on the target, or focus, entity. For spatial events such as pointer events, the target element is determined by a spatial search (mod_picking); for non-spatial events we'd typically use the entity pointed to by the Focus resource as the start of bubbling.

This means that we already have a viable solution for the "modal" operation where either one or the other (LWIM or BEL) is in complete control - so for example, in the game settings mode, BEL could handle all events, whereas in the main game HUD LWIM could take over. This can be done by (a) ensuring that LWIM is disabled in the settings game state, and (b) ensuring that no entities in the main HUD have BEL input handlers registered.

What we're lacking, though, is a solution for the case where both input managers are active simultaneously. An example use case is a game editor with a live preview window: we would want arrow keys to be able to, say, rotate the camera, unless there is a text input field that currently has focus, in which case the arrow keys would instead be used to move the text cursor.

As mentioned in the previous post, the way this sort of thing is typically done on the web is to install a top-level event handler on the document root. Widgets such as buttons, checkboxes, and text input fields will intercept events bubbling upward by calling .stopPropagation() on them. Any event which is not intercepted will eventually make it to the top and get handled by the global handler.

Things are a bit trickier in Bevy because there is no global document root: there can be an arbitrary number of separate UI graphs. One approach here is to define a "catch-all" listener at the root of each UI hierarchy. This, however, requires the user to remember to install the catch-all handler for each UI graph. This can be especially troublesome in the case of modal dialogs and popup menus: for reasons of layout, we want these elements to be globally positioned relative to the window, so they need to be unparented. (A better solution would be to support "Fixed" positioning as mentioned in #9564). If the dialog or menu widget is being provided by a third-party crate, then the crate may not know about the need to forward events that bubble up to the top, and might not provide a way to install a catch-all handler at that level.

Alternatively, we could modify BEL to automatically forward unhandled events to LWIM. Unfortunately, this creates an incestuous coupling between two crates that ought to be otherwise independent.

This is where my ideas get a bit nebulous, but I've been envisioning something like an "Input Graph" analogous to the "Animation Graph". For example, one can envision a graph that looks like this:

graph TD;
    GamePad1-->LWIM;
    GamePad2-->LWIM;
    PointingDevice-->BEL;
    Keyboard-->BEL;
    BEL-->LWIM;

In this graph we have "source" or "producer" nodes which represent various kinds of hardware input devices, "intermediate" nodes which have both inputs and outputs, and "consumer" nodes which only consume events. These nodes would be wired together at app initialization time, although the wiring could be conditional based on the current game state.

In this example, BEL would only forward to the next stage events which had not had .stop_propagation() called on them. Of course, users could invent additional kinds of filters or processing stages in the graph.

One requirement for a node graph like this is that all of the events being passed between the nodes would have to have a standardized type. This includes not only the event payload, but any transitional information needed to calculate .just_pressed().
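
Purely as a speculative sketch (none of these types exist in Bevy, bevy_eventlistener, or leafwing-input-manager), the wiring for the graph above might be declared at app initialization roughly like this:

use bevy::prelude::*;

// Invented node and graph types mirroring the diagram above.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum InputNode {
    Gamepad1,
    Gamepad2,
    PointingDevice,
    Keyboard,
    Bel,  // bevy_eventlistener-style bubbling/focus stage
    Lwim, // leafwing-input-manager-style action mapping stage
}

#[derive(Resource, Default)]
struct InputGraph {
    // Directed edges (from, to) describing where events flow next.
    edges: Vec<(InputNode, InputNode)>,
}

fn build_input_graph(mut graph: ResMut<InputGraph>) {
    graph.edges = vec![
        (InputNode::Gamepad1, InputNode::Lwim),
        (InputNode::Gamepad2, InputNode::Lwim),
        (InputNode::PointingDevice, InputNode::Bel),
        (InputNode::Keyboard, InputNode::Bel),
        // BEL forwards only events that were not stopped during bubbling.
        (InputNode::Bel, InputNode::Lwim),
    ];
}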

@miketwenty1

One method that would make this easier to work with would be to extend FocusPolicy, or to add a new NonUiFocusPolicy field that can be set to Block.
