Allow real-time collaboration #1754

almet · 2024-04-16T16:22:59Z

Add the ability to sync map changes with other connected peers.

choropleth-sync.webm

This pull request contains

On the server side:

A WebSocket server which creates "rooms" for map members to communicate. Messages are checked for validity before transferring them to the other party.
A new server view to create an authentication token, checked by the WS server
A new set of settings to handle the WebSocket server configuration.
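
To make the "rooms" idea concrete, here is a minimal sketch with no network code: a registry maps each map id to its connected peers, and a valid message is relayed to everyone in the room except its sender. Only the `CONNECTIONS[map_id] - {websocket}` expression comes from this PR; `FakePeer`, `join`, `is_valid`, `relay` and the `kind` key are hypothetical names used for illustration.

```python
import json
from collections import defaultdict

# map_id -> set of connected peers (in-memory registry, as in the review excerpt)
CONNECTIONS = defaultdict(set)


class FakePeer:
    """Stand-in for a websocket connection: it only records what it receives."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def send(self, raw_message):
        self.inbox.append(raw_message)


def join(map_id, peer):
    CONNECTIONS[map_id].add(peer)


def is_valid(raw_message):
    # Assumed validity rule: a JSON object whose "kind" is "operation".
    try:
        message = json.loads(raw_message)
    except json.JSONDecodeError:
        return False
    return isinstance(message, dict) and message.get("kind") == "operation"


def relay(map_id, sender, raw_message):
    """Forward a valid message to every peer of the room except the sender."""
    if not is_valid(raw_message):
        return 0
    peers = CONNECTIONS[map_id] - {sender}
    for peer in peers:
        peer.send(raw_message)
    return len(peers)
```

Peers connected to another map never see the message, and invalid payloads are dropped before reaching anyone.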

On the client side:

A WebSocket client to consume messages coming from other peers
Updaters: classes triggered on new messages, which update the appropriate local objects and trigger a re-render.
Code that transmits local changes to other peers
A new button that makes it possible to enable "real-time collaboration".

Note

This PR isn't the full story (see the "To be handled later" section below). Currently:

You only get the changes while you're connected: changes that happened before you join aren't synced at the moment.
Peers can diverge: because changes are streamed directly from one peer to the others, peers can end up in diverging states. (I'm exploring whether Hybrid Logical Clocks can help with this, and what the cost/benefit would be, but they are not used here.)

What works?

Feature	Works?
Map properties: syncing	OK
Layer properties: syncing	OK
Shape properties: syncing	OK
Marker: Create	OK
Marker: Delete	OK
Marker: Move	OK
Marker: Clone	OK
Line: Create	OK
Line: Delete	OK
Line: Drag	OK
Line: Clone	OK
Line: Continue	OK
Polyline: Create	OK
Polyline: Drag	OK
Polyline: Edit	OK
Polyline: Delete	OK
Polyline: Clone	OK
Polyline: transform to polygon	OK
Multi-polyline: Extract as a separate shape	OK
Multi-polyline: Remove shape from multi	OK
Multi-polyline: Make main shape	OK
Multi-polyline: transfer shape to edited feature	OK
Multi-polyline (vertex): Split line	OK
Multi-polyline (vertex): Continue line	OK
Multi-polyline: Merge lines	OK
Polygon: Create	OK
Polygon: Drag	OK
Polygon: Edit	OK
Polygon: Delete	OK
Polygon: Clone	OK
Polygon: Transform to polyline	OK
Polygon: Start a hole	OK
Multi-polygon: Extract as a separate shape	OK
Multi-polygon: Remove shape from multi	OK
Multi-polygon: Make main shape	OK
Browser: Change value	NOK
Permissions : sync	NOK
Sync imported geometries	OK
Sync imported data	OK

How to make it work locally?

In order to make it work locally, just update your local settings with the following ones:

WEBSOCKET_ENABLED = True
WEBSOCKET_HOST = "localhost"
WEBSOCKET_PORT = 8002
WEBSOCKET_URI = "ws://localhost:8002"

And then update your dependencies and run the websocket server alongside the django server:

make install
python umap/ws.py

Steps:

Tests

Unit tests:

Auth token view: 403 is returned when unauthorized
Auth token view: token is returned when authorized
Auth token view: token is returned when anonymous and authorized
Message Dispatcher: unknown updater raises an error
Message Dispatcher: unknown operations raise an error
Updater: updateObjectValue is able to:
- update nested keys
- delete keys if value is undefined
- update first level keys
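
The three behaviours listed for `updateObjectValue` can be illustrated with a Python transliteration. This is a sketch, not the JavaScript code from this PR: the dotted-path key format, the `UNDEFINED` sentinel (standing in for JavaScript's `undefined`) and the "do nothing when the path is missing" rule are assumptions.

```python
UNDEFINED = object()  # stand-in for JavaScript's `undefined`


def update_object_value(obj, key, value=UNDEFINED):
    """Set `value` at the dotted path `key`, or delete the key if value is UNDEFINED."""
    *parents, last = key.split(".")
    target = obj
    for part in parents:
        if not isinstance(target, dict) or part not in target:
            return obj  # assumed behaviour: an unknown path leaves the object untouched
        target = target[part]
    if not isinstance(target, dict):
        return obj
    if value is UNDEFINED:
        target.pop(last, None)  # delete keys if value is undefined
    else:
        target[last] = value  # update first-level or nested keys
    return obj
```

For instance, `update_object_value(data, "properties.name", "new")` updates a nested key, while `update_object_value(data, "properties.color")` removes it.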

Functional tests, with two browser tabs, and the websocket server running:

To be handled later

The following is also required, but probably in other pull requests:

Get and apply operations that happened before a peer joined
Exploration: use a django management command rather than a direct python script? Do we want to integrate WebSockets directly in Django (see chore: use asgi rather than wsgi #1701) ?
Throttling: frequency of sync: ability to send messages by batches
Handle running multiple processes to handle more load
Use a pub/sub mechanism (with PostgreSQL), so that connected peers can be matched on distinct processes.
Use Hybrid Logical Clocks to ensure peers do not get out of sync
Handle WebSocket reconnection / redo the auth roundtrip
Disconnection:
- Show the user a UI when disconnected
- Prompt the user when something will be deleted / lost
Find a mechanism to revoke permissions when the owner changes them (closing the WS connection might be enough, maybe with a specific message type)
Add information in the stats endpoint to see the number of "real-time" maps used.
Deal with an evil client sending messages to an elevated client: when/how these messages are blocked
Deal with maximum number of peers connected (what happens when the limit is met?)

Communication with other peers

There are multiple scenarios where communicating with other peers might prove useful. It also depends on how we want to deal with this in general, as another way to handle it is to store the information on the server.

When entering the room:

The new peer needs the list of operations that happened but are not yet saved. In this case, all operations would need to be stacked locally. I see how this can help work on the undo/redo stack, as well (but also opens up new problems related to time, probably solved by Hybrid Logical Clocks).
The new peer needs to know who is in the room.

When saving the map:

Other peers might get notified, and get the fresh version from the server (and potentially reapply their changes on top)

In these two cases, it might be helpful to have a 1-1 channel with another peer (rather than only broadcasting). The server might just do the signaling (give the list of connected peers/ids), and the peer might choose to talk to a privileged one.

To be able to trust that messages are coming from the same client, each peer might get assigned a unique UUID. The server would refuse new connections using an ID that is already connected, and messages can be channeled to another peer via the server.
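
As a rough sketch of that idea (nothing like this exists in the PR yet), the server could keep a registry of peer IDs, refuse duplicates, expose the list of peers for signaling, and route 1-1 messages. `PeerRegistry`, `DuplicatePeerError` and all method names are hypothetical, and plain lists stand in for real connections.

```python
import uuid


class DuplicatePeerError(Exception):
    """Raised when a connection reuses an ID that is already connected."""


class PeerRegistry:
    def __init__(self):
        self._inboxes = {}  # peer_id -> list of messages delivered to that peer

    def connect(self, peer_id=None):
        # New peers get a fresh UUID; a reconnecting peer may present its own.
        peer_id = peer_id or str(uuid.uuid4())
        if peer_id in self._inboxes:
            raise DuplicatePeerError(peer_id)
        self._inboxes[peer_id] = []
        return peer_id

    def disconnect(self, peer_id):
        self._inboxes.pop(peer_id, None)

    def peers(self):
        """Signaling: the sorted list of connected peer IDs."""
        return sorted(self._inboxes)

    def send_to(self, sender_id, recipient_id, payload):
        """1-1 channel: deliver a message to a single peer via the server."""
        if sender_id not in self._inboxes:
            raise KeyError(f"unknown sender: {sender_id}")
        self._inboxes[recipient_id].append({"sender": sender_id, "payload": payload})

    def inbox(self, peer_id):
        return list(self._inboxes[peer_id])
```

Because the server stamps the sender ID itself, a recipient does not have to trust what the payload claims about its origin.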

almet · 2024-04-19T14:56:33Z

💡 The way things are currently implemented, the synced objects (map, datalayers, features) expose a way to get back the syncEngine object.

Another way to do it is to pass the syncEngine instance to the FormBuilder class when building it. It's more duplicated code, but seems more explicit.

almet · 2024-04-29T16:27:09Z

This works as intended, except for:

limit bounds doesn't seem to be sent
heatmap specific settings (and probably other specific layers properties) (not sure why, but the layer doesn't seem to be found. I'll investigate)

almet · 2024-04-30T17:32:16Z

Security-related stuff

WebSocket settings (WEBSOCKET_ENABLED, WEBSOCKET_URI) are being sent to the JS client via the map properties, which can be seen as a shared registry between client data and server data.

I was worried this could raise security concerns: in particular, we need to ensure that locally defined values can't take precedence over the ones served by the server.

In the end, it turns out to be safe, because the way it's done, the values from the local geoJSON are being replaced by the ones from the server (and not the other way around). We're good on this front.
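
The precedence rule can be summed up in a few lines. This is an illustrative sketch rather than the actual uMap code: `apply_server_properties` is a hypothetical helper, and using the setting names as property keys is an assumption.

```python
def apply_server_properties(local_properties, server_properties):
    """Return map properties where server-provided values always win."""
    merged = dict(local_properties)   # start from the (untrusted) local values
    merged.update(server_properties)  # server values overwrite any local ones
    return merged


local = {"name": "My map", "WEBSOCKET_URI": "ws://evil.example:9999"}
server = {"WEBSOCKET_ENABLED": True, "WEBSOCKET_URI": "ws://localhost:8002"}
merged = apply_server_properties(local, server)
```

Here `merged["WEBSOCKET_URI"]` is the server's value, while keys the server does not define (such as `name`) are kept from the local data.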

davidbgk

Let's go 🕺

yohanboniface

Very cooooool! Huge work! 🎉

yohanboniface · 2024-05-16T12:30:00Z

umap/settings/__init__.py

@@ -43,3 +43,6 @@
                    globals()["STATICFILES_DIRS"].insert(0, value)
                else:
                    globals()[key] = value
+
+# Expose these settings for consumption by e.g. django.settings.configure.
+settings_as_dict = {k: v for k, v in globals().items() if k.isupper()}


Where are we with the idea of using a Django management command?

I believe it could be done on a separate pull request, would that work for you?

OK, up to you. I think it's just a matter of copy pasting the ws server code in a file at the right place and using the Command class, and this may make your life easier to run the websocket server during the tests (but may also not :p ).

I've updated the pull request to run the WebSocket server via a django management command (it was easier than expected !).

See commit 38d19db (#1754) for more info.

yohanboniface · 2024-05-16T15:25:28Z

pyproject.toml

@@ -34,11 +34,13 @@ dependencies = [
  "django-probes==1.7.0",
  "Pillow==10.3.0",
  "psycopg==3.1.18",
+  "pydantic==2.7.0",


Do we still need this new dependency ?

Yes, it's still used by the websocket server to check that messages conform to the expected format.

yohanboniface · 2024-05-16T15:28:18Z

umap/ws.py

+            peers = CONNECTIONS[map_id] - {websocket}
+            # Only relay valid "operation" messages
+            try:
+                OperationMessage.model_validate_json(raw_message)


I'd keep the server blind here, and let the frontend know about the internals of a message, write it, and validate it before consuming it.

It will be useful later, because some messages will not be relayed to all the other connected parties, so we will need to inspect the messages. A bit early, but I really believe it will be useful very soon.

If it's about checking one key, we may do it explicitly? Without the need for the new layer of pydantic. At least, pydantic seems overkill to me here.

You can see pydantic as doing form validation. In our case, it's more than checking just one key, because each message type might have a different format that we need to enforce (some messages will be for the server itself, for instance if we decide the server saves the status of the connected peers).

It is true that Pydantic in the current code is not doing much.

In the future, as we add other message types, we will want to ensure the messages match the format expected by the server, for instance when clients send disconnect messages that we don't want to relay to the other parties, or when they send messages for just one peer, which should not be broadcast.

In these cases, Pydantic will prove even more useful than now, as we will have to check complex messages and ensure their format matches what is expected by the server.

It is worth noting that it already is useful, as it makes sure that the "join" message is holding the proper "token" key, and that only operation messages are valid in the "room".

If it's not too much of a hassle, I would like to keep it in place to avoid doing back and forth between implementation types for the message validation logic.
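
To illustrate what "a different format per message type" means, here is a dependency-free sketch. The PR itself relies on pydantic models such as `OperationMessage`; this stand-in only shows the shape of the checks. That a "join" message must carry a "token" comes from the discussion above, while the `kind` discriminator, the keys required for operations and the names `parse_message` and `InvalidMessage` are assumptions.

```python
import json

# Required keys per message kind (only "join" -> "token" is from the thread).
REQUIRED_KEYS = {
    "join": {"token"},
    "operation": {"verb", "subject"},
}


class InvalidMessage(ValueError):
    """Raised when a message does not match the expected format."""


def parse_message(raw_message, allowed_kinds):
    """Parse a raw message and check it against the format of its kind."""
    try:
        message = json.loads(raw_message)
    except json.JSONDecodeError as exc:
        raise InvalidMessage("not valid JSON") from exc
    if not isinstance(message, dict):
        raise InvalidMessage("message must be a JSON object")
    kind = message.get("kind")
    if kind not in allowed_kinds or kind not in REQUIRED_KEYS:
        raise InvalidMessage(f"unexpected kind: {kind!r}")
    missing = REQUIRED_KEYS[kind] - set(message)
    if missing:
        raise InvalidMessage(f"missing keys: {sorted(missing)}")
    return message
```

The `allowed_kinds` argument models the two phases described above: only `{"join"}` is accepted before authentication, and only `{"operation"}` once inside the room.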

yohanboniface · 2024-05-16T15:30:25Z

docs/config/settings.md

+#### WEBSOCKET_URI
+
+The connection string that will be used by the client to connect to the websocket server.
+Use `wss://host:port` if the server is behind TLS, and `ws://host:port` otherwise.


Maybe WEBSOCKET_TLS instead of duplicating HOST and PORT in this string ?

I see why we would like this, as it would avoid duplicating the information in the settings.

We need to take into consideration that the "front" WebSocket address (that clients will connect to) might be different from the "back" IP and port which are bound on the host.

This happens for instance for reverse proxies, or when running inside a container. In this case, magically guessing the "front" address based on WEBSOCKET_HOST, WEBSOCKET_PORT and WEBSOCKET_TLS might not be enough, and we would need to introduce a WEBSOCKET_URI setting like the one that's already present in this PR.

I've changed the settings to reflect this BACK/FRONT difference, and updated the documentation to make it clearer, see commit 7990ac3 (#1754)
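
For example, behind a TLS-terminating reverse proxy, the split settings could look like the fragment below. The setting names are the ones introduced by that commit; the host, port and URI values are made up for illustration.

```python
# Address the WebSocket server binds to on the host ("back").
WEBSOCKET_BACK_HOST = "127.0.0.1"
WEBSOCKET_BACK_PORT = 8002

# Address the browsers connect to ("front"), here served by a reverse proxy with TLS.
WEBSOCKET_FRONT_URI = "wss://umap.example.org/ws"
```

The front URI shares neither the host nor the port of the back address, which is why it cannot be derived from them.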

It is now using `WEBSOCKET_BACK_HOST`, `WEBSOCKET_BACK_PORT` and `WEBSOCKET_FRONT_URI`. We need to take into consideration that the "front" WebSocket address (that clients will connect to) might be different from the "back" IP and port which are bound on the host. This happens for instance with reverse proxies, or when running inside a container. We considered using a `WEBSOCKET_TLS` setting to try guessing the "front" address based on `WEBSOCKET_HOST`, `WEBSOCKET_PORT` and `WEBSOCKET_TLS`, but as the back and front addresses can differ, we would need to introduce a `WEBSOCKET_URI` in any case, so we went with just using it, without adding an extra `WEBSOCKET_TLS`.

This makes it possible to load the settings in a consistent way, and additionally provides a way to override the `WEBSOCKET_BACK_HOST` and `WEBSOCKET_BACK_PORT` settings with the `--host` and `--port` command-line arguments. Without this change, because of how we are currently loading our settings, we would require the settings to be exposed by the `umap.settings.__init__` file. Previous implementations were exposing these settings with the following code:

```python
settings_as_dict = {k: v for k, v in globals().items() if k.isupper()}
```

It is now possible to create proxy objects using `sync_engine.proxy(object)`. The returned proxy object will automatically inject `metadata` and `subject` parameters, after looking for them in the `getSyncMetadata` method (these are only known to the synced objects). As a result, the calls are now simplified:

```
this.sync.update("key", "value")
```
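
The proxy mechanism can be sketched in Python as follows. This mirrors the idea only: the real implementation is JavaScript, and the argument order, the message layout and the `FakeFeature` class are assumptions made for the example.

```python
class SyncEngine:
    def __init__(self):
        self.sent = []  # messages that would be sent to the other peers

    def update(self, subject, metadata, key, value):
        self.sent.append(
            {"verb": "update", "subject": subject, "metadata": metadata,
             "key": key, "value": value}
        )

    def proxy(self, synced_object):
        return SyncProxy(self, synced_object)


class SyncProxy:
    """Forwards calls to the engine, injecting `subject` and `metadata`."""

    def __init__(self, engine, synced_object):
        self._engine = engine
        self._object = synced_object

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        method = getattr(self._engine, name)

        def wrapper(*args, **kwargs):
            subject, metadata = self._object.get_sync_metadata()
            return method(subject, metadata, *args, **kwargs)

        return wrapper


class FakeFeature:
    """Hypothetical synced object exposing its own sync metadata."""

    def get_sync_metadata(self):
        return "feature", {"id": "abc", "layerId": "42"}


engine = SyncEngine()
sync = engine.proxy(FakeFeature())
sync.update("properties.name", "New name")
```

The caller only passes the key and the value; the proxy fills in which object the operation is about.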

Using [pytest-xprocess](https://pytest-xprocess.readthedocs.io/) proved less useful than I first thought, because it was causing intermittent failures when starting the process. The code now directly uses `subprocess.Popen` calls to start the server.

The tests are grouped together using the following decorator: `@pytest.mark.xdist_group(name="websockets")`

Tests now need to be run with `pytest --dist loadgroup` so that all tests of the same group run in the same process.

More details in this blog post: https://blog.notmyidea.org/start-a-process-when-using-pytest-xdist.html

almet changed the title ~~WebSocket server~~ Use web socket to synchronize different views of a map Apr 16, 2024

almet force-pushed the websockets branch from 76e784a to ccdca30 Compare April 19, 2024 14:53

almet force-pushed the websockets branch from 5e4c410 to 9eb121c Compare April 19, 2024 15:37

almet changed the title ~~Use web socket to synchronize different views of a map~~ Allow real-time collaboration on a map Apr 19, 2024

almet force-pushed the websockets branch 2 times, most recently from d58a144 to 1dc4901 Compare April 19, 2024 19:09

almet force-pushed the websockets branch 2 times, most recently from e1629ba to 5ad3da5 Compare April 29, 2024 16:17

almet force-pushed the websockets branch from ff1e9c0 to 17e3063 Compare May 6, 2024 09:56

almet commented May 7, 2024

davidbgk reviewed May 7, 2024

davidbgk reviewed May 7, 2024

almet force-pushed the websockets branch from ddc8699 to 87d8b3f Compare May 9, 2024 14:53

almet changed the title ~~Allow real-time collaboration on a map~~ Allow real-time collaboration May 9, 2024

almet force-pushed the websockets branch 4 times, most recently from 9ed6b4a to 98c3849 Compare May 15, 2024 10:25

almet mentioned this pull request May 16, 2024

feat(sync) Add a belongsTo property in the schema #1828

Open

almet force-pushed the websockets branch from 56244cf to b0369d3 Compare May 16, 2024 13:57

yohanboniface reviewed May 16, 2024

almet force-pushed the websockets branch 3 times, most recently from d27e806 to de59190 Compare June 6, 2024 21:41

almet added 3 commits June 7, 2024 16:30

feat(settings): Expose settings as a dict.

40d226e

This makes it possible to use them in standalone scripts, when using `django.settings.configure(**settings_dict)`.

WIP

0b5263e

doc: update cookie-related comment

43f5c70

almet added 26 commits June 7, 2024 16:32

test(sync): ensure polygon drag-n-drop is synced

83ff674

test(sync): Ensure map properties are synced

b57c37f

This tests that the map name and the zoom-control visibility are properly synced over websockets.

test(sync): Ensure datalayer properties are synced

b0b0240

fix(schema): dashArray belongs to features as well

49913bf

chore(utils): remove console.log calls

0a1b276

chore(sync): use sync=false everywhere to stop propagation

5c1c93f

In some cases, you want to stop the propagation of events. The previous code was using `fromSync=true` and `sync=false` interchangeably. This makes it use `sync=false` everywhere.

chore(sync): Sync engine now retrieves auth token

8e7e071

It is now its responsibility to get the authentication token from the HTTP server and pass it to the websocket server. This will make it possible to redo the roundtrip when getting disconnected.

chore(sync): relax some validation logic on the websocket server

b1e4233

Messages are now checked for conformity with the protocol we defined, but the checks stop at the `operation` boundary. Values aren't checked.

chore(test): fix a typo

5e6cb1f

(but I would really like to see what web socker would look like)

chore(test): remove empty test

f16491a

chore(sync): remove belongsTo for now

5fd0a29

As it requires more discussion, it will happen in a separate pull-request.

docs(sync): Document WEBSOCKET_* settings

c5fdb8a

test(sync): Ensure feature properties are synced

23e28e6

tests(sync): Add a test ensuring cloned features aren't duplicated

8e1cf7b

This is a bug in the current implementation, hopefully fixed in later commits.

fix(sync): sync the reference-version across peers

d5c1e36

This allows the merge algorithm to not be lost when receiving changes. Without this change, the optimistic merge algorithm isn't able to make the distinction between peers, and features end up duplicated.

fixup: changes after @ybon's review.

fd42c2f

fix: use array.includes(string) the proper way.

eb4f462

Because this `syncUpdatedProperties` function is only called once, it didn't trigger any issue in practice (as the check was always returning true).

refactor(sync): Remove syncUpdateProperties function.

5c17497

The function was only used once, so removing it simplified the whole flow.

refactor(sync): remove formbuilder this._redraw() callbacks.

06e4d6a

They are now handled in the `render()` call, so there is no more need for them here.

refactor: rename geometrytoFeatures to geoJSONToLeaflet

5366f24

Because we are dealing with technologies using overlapping vocabulary, it is easy to get lost. Hopefully this change makes it clear that it converts geoJSON inputs into Leaflet / uMap objects.

refactor(sync): Sync layers creation with map.createDataLayer utility.

49d0b71

Rather than having it done inside the datalayer itself. This gives us more control.

refactor(sync): rename ws.py to websocket_server.py

3300319

Hopefully this is clearer :-)

almet force-pushed the websockets branch from de59190 to 2825ebb Compare June 7, 2024 14:32

almet marked this pull request as ready for review June 7, 2024 14:48

almet merged commit fc2de3f into master Jun 7, 2024
4 checks passed

almet deleted the websockets branch June 7, 2024 16:21
