Allow real-time collaboration #1754

Merged
merged 69 commits into from
Jun 7, 2024

Member

@almet almet commented Apr 16, 2024

Add the ability to sync map changes with other connected peers.

choropleth-sync.webm

This pull request contains

On the server side:

  • A WebSocket server which creates "rooms" for map members to communicate. Messages are checked for validity before transferring them to the other party.
  • A new server view to create an authentication token, checked by the WS server
  • A new set of settings to handle the WebSocket server configuration.
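
To illustrate the token round trip between the HTTP view and the WebSocket server, here is a minimal sketch of a server-signed token using only Python's standard library. The secret, the payload fields (`map_id`, `user_id`, `expires`) and the function names are assumptions made for this example; the view in this PR may well sign its tokens differently (for instance with Django's own signing utilities).

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"change-me"  # hypothetical secret shared by the HTTP and WS servers


def create_token(map_id, user_id, ttl=60, now=None):
    """Return a signed token the HTTP view could hand to an authorized client."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"map_id": map_id, "user_id": user_id, "expires": now + ttl},
        sort_keys=True,
    ).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return ".".join(
        base64.urlsafe_b64encode(part).decode() for part in (payload, signature)
    )


def verify_token(token, now=None):
    """Return the payload if the signature is valid and not expired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        # Wrong number of parts or invalid base64.
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    data = json.loads(payload)
    return data if data["expires"] > now else None
```

In this sketch the WebSocket server would call `verify_token` on the token carried by the first message of a connection, and close the connection when it returns `None`.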

On the client side:

  • A WebSocket client to consume messages coming from other peers
  • Updaters: classes that are triggered by new messages, update the appropriate local objects and trigger "rerendering".
  • Code that transmits local changes to other peers
  • A new button that makes it possible to enable "real-time collaboration".

Note

This PR isn't the full story (see the "To be handled later" section below). Currently:

  • You only get the changes while you're connected: changes that happened before you join aren't synced at the moment.
  • Peers can diverge: because changes are streamed directly from one peer to the others, peers can end up in diverging states. (I'm exploring whether Hybrid Logical Clocks can help with this, and what the cost/benefit would be, but they are not used here.)

What works?

| Feature | Works? |
| --- | --- |
| Map properties: syncing | OK |
| Layer properties: syncing | OK |
| Shape properties: syncing | OK |
| Marker: Create | OK |
| Marker: Delete | OK |
| Marker: Move | OK |
| Marker: Clone | OK |
| Line: Create | OK |
| Line: Delete | OK |
| Line: Drag | OK |
| Line: Clone | OK |
| Line: Continue | OK |
| Polyline: Create | OK |
| Polyline: Drag | OK |
| Polyline: Edit | OK |
| Polyline: Delete | OK |
| Polyline: Clone | OK |
| Polyline: transform to polygon | OK |
| Multi-polyline: Extract as a separate shape | OK |
| Multi-polyline: Remove shape from multi | OK |
| Multi-polyline: Make main shape | OK |
| Multi-polyline: transfer shape to edited feature | OK |
| Multi-polyline (vertex): Split line | OK |
| Multi-polyline (vertex): Continue line | OK |
| Multi-polyline: Merge lines | OK |
| Polygon: Create | OK |
| Polygon: Drag | OK |
| Polygon: Edit | OK |
| Polygon: Delete | OK |
| Polygon: Clone | OK |
| Polygon: Transform to polyline | OK |
| Polygon: Start a hole | OK |
| Multi-polygon: Extract as a separate shape | OK |
| Multi-polygon: Remove shape from multi | OK |
| Multi-polygon: Make main shape | OK |
| Browser: Change value | NOK |
| Permissions: sync | NOK |
| Sync imported geometries | OK |
| Sync imported data | OK |

How to make it work locally?

In order to make it work locally, just update your local settings with the following ones:

WEBSOCKET_ENABLED = True
WEBSOCKET_HOST = "localhost"
WEBSOCKET_PORT = 8002
WEBSOCKET_URI = "ws://localhost:8002"

And then update your dependencies and run the websocket server alongside the django server:

make install
python umap/ws.py

Steps:

  • Create a simple WebSocket server
  • Integrate with UMAP_SETTINGS
  • Validate messages received
  • Handle authentication using server-signed tokens
  • Add a WebSocket client in the JavaScript code
  • Transmit messages to the server when they happen:
    • map properties
    • features and their properties
    • datalayers
  • Receive messages coming from another peer and apply them locally:
    • map properties
    • features and their properties
    • datalayers
  • Add a button to activate/deactivate the WebSocket feature on a map
  • Add an ENABLE_WEBSOCKETS flag in the settings (and probably other configuration entries)
  • Handle datalayer specific settings
  • Move the Enable Collaboration flag somewhere hidden
  • Only connect to the WebSocket server when in edit mode
  • Bug: multi polygons are copied when a hole is put inside them
  • Feature: Move polygon, polylines, etc
  • Create a geometry when hitting escape
  • Handle import (GeoJSON and others)
  • Feature: handle cloning features
  • Bug: Marker is created even if exited after startMarker
  • Issue a warning when trying to work on a non-existing feature
  • Feature: Convert features, etc (right click menu)
  • Feature: Limit the kind of messages a client can receive (e.g. make it impossible to replace unauthorized keys)
  • Bug: Custom overlay is not set
  • Sync datalayer creation
  • Bug: Custom background is not set
  • Bug: Remote Data not working
  • Bug ?: Versions not working on a datalayer
  • Bug: limit bounds isn't sent
  • Deploy the branch using docker
  • Make the tests pass
  • Add unit tests, for JS modules that are small enough
  • Add functional tests
  • Add documentation
  • Deactivate server-side optimistic merge in the case syncEnabled = True
  • Fix: do not send data before the feature has been created on the other side (when naming for instance)

Tests

Unit tests:

  • Auth token view: 403 is returned when unauthorized
  • Auth token view: token is returned when authorized
  • Auth token view: token is returned when anonymous and authorized
  • Message Dispatcher: unknown updater raises an error
  • Message Dispatcher: unknown operations raise an error
  • Updater: updateObjectValue is able to:
    • update nested keys
    • delete keys if value is undefined
    • update first level keys
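
The three behaviours tested for `updateObjectValue` can be sketched in Python as follows. This is an illustration only, not the JavaScript implementation from this PR: the dotted-path syntax, the creation of missing intermediate objects and the `_MISSING` sentinel (standing in for JavaScript's `undefined`) are assumptions.

```python
_MISSING = object()  # stands in for JavaScript's `undefined`


def update_object_value(obj, key, value=_MISSING):
    """Set or delete a possibly nested key, e.g. "options.color"."""
    *parents, last = key.split(".")
    target = obj
    for part in parents:
        # Walk down the path; create intermediate dictionaries when absent.
        target = target.setdefault(part, {})
    if value is _MISSING:
        # No value given: remove the key if it exists.
        target.pop(last, None)
    else:
        target[last] = value
    return obj


feature = {"name": "a", "options": {"color": "red"}}
update_object_value(feature, "name", "b")               # first-level key
update_object_value(feature, "options.color", "blue")   # nested key
update_object_value(feature, "options.color")           # delete the key
```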

Functional tests, with two browser tabs, and the websocket server running:

  • Marker add
  • Marker move
  • Marker delete
  • Polygon add
  • Polygon geometry change
  • Polygon move
  • Polygon delete
  • Map name change
  • Map property change
  • Datalayer type update

To be handled later

The following is also required, but probably in other pull requests:

  • Get and apply operations that happened before a peer joined
  • Exploration: use a django management command rather than a direct python script? Do we want to integrate WebSockets directly in Django (see chore: use asgi rather than wsgi #1701) ?
  • Throttling: frequency of sync: ability to send messages by batches
  • Handle running multiple processes to handle more load
  • Use a pub/sub mechanism (with PostgreSQL), so that connected peers can be matched on distinct processes.
  • Use Hybrid Logical Clocks to ensure peers do not get out of sync
  • Handle WebSocket reconnection / redo the auth roundtrip
  • Disconnection:
    • Show the user a UI when disconnected
    • Prompt the user when something will be deleted / lost
  • Find a mechanism to revoke permissions when the owner changes them (closing the WS connection might be enough, maybe with a specific message type)
  • Add information in the stats endpoint to see the number of "real-time" maps used.
  • Deal with an evil client sending messages to an elevated client: when/how these messages are blocked
  • Deal with maximum number of peers connected (what happens when the limit is met?)
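
Since Hybrid Logical Clocks are mentioned in the list above, here is a minimal sketch of how such a clock orders operations across peers. It is not part of this PR; the class and method names are made up for the example, and the physical clock is injected as a callable so that the behaviour is deterministic.

```python
from dataclasses import dataclass


@dataclass(order=True, frozen=True)
class Timestamp:
    wall: int     # highest physical time seen so far (e.g. milliseconds)
    counter: int  # logical counter breaking ties within the same `wall`


class HybridLogicalClock:
    def __init__(self, physical_clock):
        self._now = physical_clock  # callable returning the local physical time
        self.latest = Timestamp(0, 0)

    def tick(self):
        """Timestamp a local (or outgoing) operation."""
        wall = max(self.latest.wall, self._now())
        counter = self.latest.counter + 1 if wall == self.latest.wall else 0
        self.latest = Timestamp(wall, counter)
        return self.latest

    def receive(self, remote):
        """Merge the timestamp of an incoming operation into the local clock."""
        prev = self.latest
        wall = max(prev.wall, remote.wall, self._now())
        if wall == prev.wall == remote.wall:
            counter = max(prev.counter, remote.counter) + 1
        elif wall == prev.wall:
            counter = prev.counter + 1
        elif wall == remote.wall:
            counter = remote.counter + 1
        else:
            counter = 0
        self.latest = Timestamp(wall, counter)
        return self.latest
```

Timestamps compare as `(wall, counter)` pairs, so an operation received from a peer always gets ordered after the operation that caused it, even when the two machines' physical clocks disagree.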

Communication with other peers

There are multiple scenarios where communication with other peers might prove useful. It also depends on how we want to deal with this in general, as another way to handle it is to have the information stored on the server.

When entering the room:

  • The new peer needs the list of operations that happened but are not yet saved. In this case, all operations would need to be stacked locally. I see how this can help work on the undo/redo stack, as well (but also opens up new problems related to time, probably solved by Hybrid Logical Clocks).
  • The new peer needs to know who is in the room.

When saving the map:

  • Other peers might get notified, and get the fresh version from the server (and potentially reapply their changes on top)

In these two cases, it might be helpful to have a 1-1 channel with another peer (and not only do broadcasting). The server might just do the signaling (give the list of connected peers/ids), and the peer might choose to discuss with a privileged one.

To be able to trust that messages come from the same client, each peer might be assigned a unique UUID. The server would refuse new connections that reuse an ID already in use, and messages can be channeled to another peer via the server.
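
A rough sketch of that idea, with the server handing out peer IDs, refusing duplicates and channeling 1-1 messages. Everything here (the `Room` class, its methods, the message shape) is hypothetical, and plain callables stand in for WebSocket connections.

```python
import uuid


class DuplicatePeer(Exception):
    """Raised when a connection tries to reuse an ID that is already in the room."""


class Room:
    def __init__(self):
        self._peers = {}  # peer id -> callable used to deliver a message

    def join(self, send, peer_id=None):
        """Register a peer and return its ID (generated unless one is provided)."""
        peer_id = peer_id or str(uuid.uuid4())
        if peer_id in self._peers:
            raise DuplicatePeer(peer_id)
        self._peers[peer_id] = send
        return peer_id

    def leave(self, peer_id):
        self._peers.pop(peer_id, None)

    def peers(self):
        """Signaling: the list of connected peer IDs."""
        return sorted(self._peers)

    def send_to(self, sender_id, recipient_id, message):
        """Channel a message to a single peer; the sender ID is set by the server."""
        self._peers[recipient_id]({"sender": sender_id, "message": message})
```

Because the server stamps the `sender` field itself, a recipient does not have to trust what the sending client claims about its own identity.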

@almet almet changed the title from "WebSocket server" to "Use web socket to synchronize different views of a map" Apr 16, 2024
@almet
Member Author

almet commented Apr 19, 2024

💡 The way things are currently implemented, the synced objects (map, datalayers, features) expose a way to get back the syncEngine object.

Another way to do it is to pass the syncEngine instance to the FormBuilder class when building it. It's more duplicated code, but seems more explicit.

@almet almet changed the title from "Use web socket to synchronize different views of a map" to "Allow real-time collaboration on a map" Apr 19, 2024
@almet almet force-pushed the websockets branch 2 times, most recently from d58a144 to 1dc4901 Compare April 19, 2024 19:09
@almet almet force-pushed the websockets branch 2 times, most recently from e1629ba to 5ad3da5 Compare April 29, 2024 16:17
@almet
Member Author

almet commented Apr 29, 2024

This works as intended, except for:

  • limit bounds doesn't seem to be sent
  • heatmap specific settings (and probably other specific layers properties) (not sure why, but the layer doesn't seem to be found. I'll investigate)

@almet
Member Author

almet commented Apr 30, 2024

Security-related stuff

WebSocket settings (WEBSOCKET_ENABLED, WEBSOCKET_URI) are being sent to the JS client via the map properties, which can be seen as a shared registry between client data and server data.

I was worried this could raise security concerns; in particular, we need to ensure locally defined values can't take precedence over the ones served by the server.

In the end, it turns out to be safe, because the way it's done, the values from the local geoJSON are being replaced by the ones from the server (and not the other way around). We're good on this front.
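
The precedence rule described above boils down to the order of a merge. A tiny sketch, where the function name and the example values are hypothetical:

```python
def merge_properties(local_properties, server_properties):
    """Server-provided values win over anything stored in the local geoJSON."""
    return {**local_properties, **server_properties}


local = {"name": "My map", "WEBSOCKET_URI": "ws://attacker.example:1234"}
server = {"WEBSOCKET_ENABLED": True, "WEBSOCKET_URI": "ws://localhost:8002"}
merged = merge_properties(local, server)
# merged["WEBSOCKET_URI"] is the server's value; merged["name"] is kept.
```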

Contributor

@davidbgk davidbgk left a comment

Let's go 🕺

@almet almet changed the title from "Allow real-time collaboration on a map" to "Allow real-time collaboration" May 9, 2024
@almet almet force-pushed the websockets branch 4 times, most recently from 9ed6b4a to 98c3849 Compare May 15, 2024 10:25
Member

@yohanboniface yohanboniface left a comment

Very cooooool! Huge work! 🎉

@@ -43,3 +43,6 @@
globals()["STATICFILES_DIRS"].insert(0, value)
else:
globals()[key] = value

# Expose these settings for consumption by e.g. django.settings.configure.
settings_as_dict = {k: v for k, v in globals().items() if k.isupper()}

Member

@yohanboniface yohanboniface May 16, 2024

Where are we with the idea of using a Django management command?

Member Author

I believe it could be done on a separate pull request, would that work for you?

Member

OK, up to you. I think it's just a matter of copy pasting the ws server code in a file at the right place and using the Command class, and this may make your life easier to run the websocket server during the tests (but may also not :p ).

Member Author

I've updated the pull request to run the WebSocket server via a django management command (it was easier than expected !).

See commit 38d19db (#1754) for more info.

@@ -34,11 +34,13 @@ dependencies = [
"django-probes==1.7.0",
"Pillow==10.3.0",
"psycopg==3.1.18",
"pydantic==2.7.0",

Member

Do we still need this new dependency ?

Member Author

Yes, it's still used by the websocket server to check that messages conform to the expected format.

umap/ws.py Outdated
peers = CONNECTIONS[map_id] - {websocket}
# Only relay valid "operation" messages
try:
OperationMessage.model_validate_json(raw_message)
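
The relay step in the excerpt above can be sketched without any network, using only the standard library. `FakePeer` stands in for a WebSocket connection, and the `kind` field and the `is_operation` check are assumptions made for this example (the PR relies on the Pydantic `OperationMessage` model instead).

```python
import asyncio
import json
from collections import defaultdict

CONNECTIONS = defaultdict(set)  # map_id -> set of connected peers


class FakePeer:
    """Stand-in for a WebSocket connection: records what it is sent."""

    def __init__(self):
        self.inbox = []

    async def send(self, raw_message):
        self.inbox.append(raw_message)


def is_operation(raw_message):
    """Very rough validity check, for illustration only."""
    try:
        message = json.loads(raw_message)
    except ValueError:
        return False
    return isinstance(message, dict) and message.get("kind") == "operation"


async def relay(map_id, sender, raw_message):
    """Forward a valid message to every peer of the room except the sender."""
    if not is_operation(raw_message):
        return 0
    peers = CONNECTIONS[map_id] - {sender}
    await asyncio.gather(*(peer.send(raw_message) for peer in peers))
    return len(peers)
```

Invalid messages are dropped here; a real server might instead answer the sender with an error or close the connection.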

Member

I'd let the server be blind here, and let the frontend know about the internals of a message, write it, and validate it before consuming it.

Member Author

It will be useful later, because some messages will not be relayed to all the other connected parties, so we will need to inspect the messages. A bit early, but I really believe it will be useful very soon.

Member

If it's about checking one key, we may do it explicitly? Without the need for the new layer of pydantic. At least, pydantic seems overkill to me here.

Member Author

You can see pydantic as doing form validation. In our case, it's more than checking just one key, because each message type might have a different format that we need to enforce (some messages will be for the server itself, for instance if we decide the server saves the status of the connected peers).

Member Author

It is true that Pydantic in the current code is not doing much.

In the future, as we add other message types, we will want to ensure the messages match a format expected by the server, for instance when clients send disconnect messages that we don't want to relay to the other parties, or when they send messages for just one peer, which should not be broadcast.

In these cases, Pydantic will prove much more useful than it is now, as we will have to check complex messages and ensure their format matches what is expected by the server.

It is worth noting that it already is useful, as it makes sure that the "join" message is holding the proper "token" key, and that only operation messages are valid in the "room".

If it's not too much of a hassle, I would like to keep it in place to avoid doing back and forth between implementation types for the message validation logic.
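
To make the "one format per message kind" argument concrete, here is a dependency-free sketch of that kind of validation. It deliberately uses plain Python rather than Pydantic, so it is not the code of this PR. Only the `join` message with its `token` key and the `operation` kind come from the discussion; the `kind`, `verb` and `subject` field names are assumptions.

```python
import json


class InvalidMessage(ValueError):
    pass


# Required fields (and their types) for each message kind.
REQUIRED_FIELDS = {
    "join": {"token": str},
    "operation": {"verb": str, "subject": str},
}


def parse_message(raw_message, allowed_kinds):
    """Decode a message and check it against the format of its kind."""
    try:
        message = json.loads(raw_message)
    except ValueError as error:
        raise InvalidMessage("not valid JSON") from error
    if not isinstance(message, dict):
        raise InvalidMessage("a message must be a JSON object")
    kind = message.get("kind")
    if (
        not isinstance(kind, str)
        or kind not in allowed_kinds
        or kind not in REQUIRED_FIELDS
    ):
        raise InvalidMessage(f"unexpected message kind: {kind!r}")
    for field, expected_type in REQUIRED_FIELDS[kind].items():
        if not isinstance(message.get(field), expected_type):
            raise InvalidMessage(f"{kind!r} message needs a {field!r} field")
    return message
```

With this shape, the server would accept only `{"join"}` as the first message of a connection and only `{"operation"}` afterwards; supporting a new kind means adding one entry to `REQUIRED_FIELDS`, which is roughly what declaring a new Pydantic model achieves.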

#### WEBSOCKET_URI

The connection string that will be used by the client to connect to the websocket server.
Use `wss://host:port` if the server is behind TLS, and `ws://host:port` otherwise.

Member

Maybe WEBSOCKET_TLS instead of duplicating HOST and PORT in this string ?

Member Author

@almet almet May 31, 2024

I see why we would like this, as it would avoid duplicating the information in the settings.

We need to take into consideration that the "front" WebSocket address (that clients will connect to) might be different from the "back" IP and port which are bound on the host.

This happens for instance for reverse proxies, or when running inside a container. In this case, magically guessing the "front" address based on WEBSOCKET_HOST, WEBSOCKET_PORT and WEBSOCKET_TLS might not be enough, and we would need to introduce a WEBSOCKET_URI setting like the one that's already present in this PR.

I've changed the settings to reflect this BACK/FRONT difference, and updated the documentation to make it clearer, see commit 7990ac3 (#1754)
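
For illustration, a settings fragment using the BACK/FRONT names introduced by that commit (`WEBSOCKET_BACK_HOST`, `WEBSOCKET_BACK_PORT`, `WEBSOCKET_FRONT_URI`). The values are hypothetical and describe a server bound locally behind a TLS-terminating reverse proxy.

```python
WEBSOCKET_ENABLED = True

# "Back": the address and port the WebSocket server binds to.
WEBSOCKET_BACK_HOST = "localhost"
WEBSOCKET_BACK_PORT = 8002

# "Front": the URI clients connect to. Behind a reverse proxy or inside a
# container it usually differs from the values above.
WEBSOCKET_FRONT_URI = "wss://maps.example.org/ws"  # hypothetical public address
```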

@almet almet force-pushed the websockets branch 3 times, most recently from d27e806 to de59190 Compare June 6, 2024 21:41
almet added 3 commits June 7, 2024 16:30
This makes it possible to use them in standalone scripts, when using
`django.settings.configure(**settings_dict)`.
almet added 26 commits June 7, 2024 16:32
This tests that the name of the map, and that zoom-control visibility is
properly synced over websockets.
In some cases, you want to stop the propagation of events. The previous
code was using `fromSync=true` and `sync=false` interchangeably. This
makes it use `sync=false` everywhere.
It's now its responsibility to get the authentication token from
the http server and pass it to the websocket server, which will make it
possible to redo the roundtrip when getting disconnected.
Messages are now checked for conformity with the protocol we defined, but
stop at the `operation` boundary. Values aren't checked.
(but I would really like to see what web socker would look like)
As it requires more discussion, it will happen in a separate
pull-request.
This is currently a bug in the current implementation. Hopefully fixed
in later commits.
This allows the merge algorithm to not be lost when receiving changes.
Without this change, the optimistic merge algorithm isn't able to make
the distinction between peers, and features end up duplicated.
It is now using `WEBSOCKET_BACK_HOST`, `WEBSOCKET_BACK_PORT` and
`WEBSOCKET_FRONT_URI`.

We need to take into consideration that the "front" WebSocket address
(that clients will connect to) might be different from the "back" IP and
port which are bound on the host.

This happens for instance for reverse proxies, or when running inside
a container.

We considered using a `WEBSOCKET_TLS` setting, to try guessing the
"front" address based on `WEBSOCKET_HOST`, `WEBSOCKET_PORT` and
`WEBSOCKET_TLS`, but as the back and front address can differ, this
would need to introduce a `WEBSOCKET_URI` in any case, so we went with
just using it, and not adding an extra `WEBSOCKET_TLS`.
This makes it possible to load the settings in a consistent way,
and additionally provides a way to override the `WEBSOCKET_BACK_HOST`
and `WEBSOCKET_BACK_PORT` settings with the `--host` and `--port`
command-line arguments.

Without this change, because of how we are currently loading our
settings, we would require the settings to be exposed by the
`umap.settings.__init__` file.

Previous implementations were exposing these settings, with the
following code:

```python
settings_as_dict = {k: v for k, v in globals().items() if k.isupper()}
```
Because this `syncUpdatedProperties` function is only called once, it
didn't trigger any issue in practice (as the check was always returning
true).
The function was only used once, so removing it simplified the whole
flow.
They are now handled in the `render()` call, so there is no more need
for them here.
It is now possible to create proxy objects using `sync_engine.proxy(object)`.
The returned proxy object will automatically inject `metadata` and
`subject` parameters, after looking for them in the `getSyncMetadata`
method (these are only known to the synced objects).

As a result, the calls are now simplified:

```
this.sync.update("key", "value")
```
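
The proxy mechanism described in that commit message can be pictured with a Python analogue. This is not the JavaScript code of the PR: `SyncProxy`, `RecordingEngine` and `Feature` are made-up names, and only `proxy`, `update`, `subject`, `metadata` and the `getSyncMetadata` lookup (here `get_sync_metadata`) mirror the text.

```python
class SyncProxy:
    """Forwards calls to the engine, injecting `subject` and `metadata`."""

    def __init__(self, engine, synced_object):
        self._engine = engine
        self._object = synced_object

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        method = getattr(self._engine, name)

        def call(*args):
            sync_metadata = self._object.get_sync_metadata()
            return method(
                sync_metadata["subject"], sync_metadata["metadata"], *args
            )

        return call


class RecordingEngine:
    """Hypothetical engine that only records what it is asked to send."""

    def __init__(self):
        self.sent = []

    def update(self, subject, metadata, key, value):
        self.sent.append(("update", subject, metadata, key, value))

    def proxy(self, synced_object):
        return SyncProxy(self, synced_object)


class Feature:
    def __init__(self, engine, feature_id):
        self.id = feature_id
        self.sync = engine.proxy(self)

    def get_sync_metadata(self):
        return {"subject": "feature", "metadata": {"id": self.id}}


engine = RecordingEngine()
feature = Feature(engine, "abc")
feature.sync.update("key", "value")  # subject and metadata are injected
```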
Because we are dealing with technologies using overlapping vocabulary,
it is easy to get lost. Hopefully this change makes it clear that it
converts geoJSON inputs into Leaflet / uMap objects.
Rather than having it done inside the datalayer itself. This gives us
more control.
Using [pytest-xprocess](https://pytest-xprocess.readthedocs.io/) proved
less useful than I first thought, because it was causing
intermittent failures when starting the process.

The code now directly uses `subprocess.Popen` calls to start the server.
The tests are grouped together using the following decorator:

`@pytest.mark.xdist_group(name="websockets")`

Tests now need to be run with `pytest --dist loadgroup` so that all
tests of the same group run in the same process.

More details on this blogpost:

  https://blog.notmyidea.org/start-a-process-when-using-pytest-xdist.html
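
A minimal sketch of starting a background process with `subprocess.Popen` and making sure it is stopped afterwards. The `running_process` helper is hypothetical; in the test suite this logic would live in a pytest fixture, and the command would start the WebSocket server rather than the placeholder used here.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def running_process(command):
    """Start `command` in the background and stop it when the block exits."""
    process = subprocess.Popen(command)
    try:
        yield process
    finally:
        process.terminate()
        process.wait(timeout=5)


if __name__ == "__main__":
    # Placeholder command: a Python process that just sleeps.
    placeholder = [sys.executable, "-c", "import time; time.sleep(30)"]
    with running_process(placeholder) as process:
        print("still running:", process.poll() is None)
```

Tests relying on such a process would carry the `@pytest.mark.xdist_group(name="websockets")` decorator mentioned above, so that they all share the worker that started it.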
@almet almet marked this pull request as ready for review June 7, 2024 14:48
@almet almet merged commit fc2de3f into master Jun 7, 2024
4 checks passed
@almet almet deleted the websockets branch June 7, 2024 16:21