
feat: Add WebSocket server with multi-client support #263

Open · janaab11 wants to merge 33 commits into develop from develop-server

Conversation


@janaab11 janaab11 commented Dec 22, 2024

Overview

Implements a WebSocket server that can handle audio streams from multiple client connections.

Changes

  • Added multi-client support to WebSocket server
  • Created StreamingInferenceHandler for managing connections
  • Added Dockerfile for easier deployment

Testing

  • Tested with multiple concurrent clients
  • Verified Docker container functionality
  • Checked resource cleanup on disconnection

Please let me know if any changes or improvements are needed!


Fixes #252
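For reference, the feature can be exercised with the existing console entry points, for example (host/port values are illustrative, and the file-based client invocation is an assumption):

```shell
# Start the server (one process now accepts multiple clients)
diart.serve --host 0.0.0.0 --port 7007

# In separate terminals, connect several clients concurrently
diart.client microphone --host localhost --port 7007
diart.client /path/to/audio.wav --host localhost --port 7007
```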


janaab11 commented Dec 22, 2024

A couple of things I still want to work on:

  1. The way resources are shared between client connections. Currently, each connection shares the same config, but the models (for segmentation and embedding, in SpeakerDiarization) are initialised and maintained separately. This is a more flexible design, but quite wasteful (see the sketch after this list).

    • I have attempted a fix for this, but parallel connections ended up sharing state. From what I understood, I might need to dig deeper into the Aggregation steps in the pipeline.
  2. Related to the above point: I was also wondering if different configs could share the same underlying model resources at runtime. For example, I have seen performance differ a lot for the same config when the numbers of speakers are far apart, say 2 vs 10. And this is a parameter that a client is better suited to configure when setting up its connection to the server.
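To make point 1 concrete, here is a minimal sketch of the current pattern, assuming diart's public SpeakerDiarization / SpeakerDiarizationConfig API (the on_connect hook is hypothetical):

```python
from diart import SpeakerDiarization, SpeakerDiarizationConfig

# One config instance is shared across all connections...
config = SpeakerDiarizationConfig(max_speakers=2)

def on_connect(client_id: str) -> SpeakerDiarization:
    # ...but each client builds its own pipeline, so the segmentation and
    # embedding models are loaded and kept in memory once per client.
    return SpeakerDiarization(config)
```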


janaab11 commented Dec 22, 2024

I have also added a cleanup step in the server for when a client disconnects. This was mostly to ensure explicit memory management, since client streams are not sharing resources at the moment, but this should also address #255.
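In rough terms, the disconnect path looks like this (a simplified sketch; the names are illustrative of the new StreamingInferenceHandler rather than copied from it):

```python
def _on_disconnect(self, client, server) -> None:
    client_id = client["id"]
    state = self._clients.pop(client_id, None)
    if state is None:
        return  # already cleaned up
    # Closing the client's audio source ends its stream, so the
    # per-client pipeline and its models can be garbage collected.
    state.audio_source.close()
```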


janaab11 commented Dec 31, 2024

Moved to LazyModel for resource management, based on this comment. Client-specific Pipeline instances now share resources that are initialised in a common PipelineConfig instance.

Still unsure about how this would scale with client connections - would appreciate any thoughts on this!
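Assuming the loader-based model constructors that appear in the README diff further down, the sharing pattern is roughly:

```python
from diart import SpeakerDiarization, SpeakerDiarizationConfig
from diart.models import SegmentationModel, EmbeddingModel

# Models are created once on the shared config; as lazy models they are
# only materialised on first use (loader definitions omitted here).
config = SpeakerDiarizationConfig(
    segmentation=SegmentationModel(segmentation_loader),
    embedding=EmbeddingModel(embedding_loader),
)

# Every client-specific pipeline reuses the models held by the config.
pipeline_a = SpeakerDiarization(config)
pipeline_b = SpeakerDiarization(config)
```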

@janaab11 janaab11 force-pushed the develop-server branch 2 times, most recently from bbf2df2 to d4380c4 on January 1, 2025 06:50
@juanmc2005 juanmc2005 self-requested a review January 2, 2025 14:36
@juanmc2005 juanmc2005 added the feature label (New feature or request) Jan 2, 2025
@juanmc2005 juanmc2005 added this to the Version 0.9.2 milestone Jan 2, 2025
@juanmc2005 juanmc2005 (Owner) left a comment

Thank you for this PR! The feature is well designed, I'd just like to make a few adjustments to clean up pieces of code that should be discontinued and to make sure that the new API is clean and easy to understand.

The PR is also missing an update to the websocket section of the README with some usage examples. I think that will also give us some ideas on where the API can be improved.

Resolved review threads on: Dockerfile, src/diart/handler.py, src/diart/sources.py, src/diart/console/serve.py
juanmc2005 (Owner) commented:

@janaab11 about your question concerning the wasteful model copies, I fully agree with this limitation. However, I think it would be best suited for a separate PR, it's a pretty big amount of work and I would hate to delay the multi-client feature because of it. Glad to discuss it in a future PR if that interests you!

@janaab11 janaab11 (Author) left a comment

I have completed most of the suggested changes. A few items remain:

  • WebSocketAudioSource: I do agree it is acting as a proxy, but prefer keeping it as an AudioSource subclass - makes it consistent with other audio sources. Would like to think more about this and then propose changes.
  • Documentation: Planning to add WebSocket usage examples to the README


janaab11 commented Jan 3, 2025

> @janaab11 about your question concerning the wasteful model copies, I fully agree with this limitation. However, I think it would be best suited for a separate PR, it's a pretty big amount of work and I would hate to delay the multi-client feature because of it. Glad to discuss it in a future PR if that interests you!

Definitely interested in resolving this - happy to take it up after closing the work here!

juanmc2005 (Owner) commented:

> WebSocketAudioSource: I do agree it is acting as a proxy, but prefer keeping it as an AudioSource subclass - makes it consistent with other audio sources. Would like to think more about this and then propose changes.

Oh it would definitely still be a subclass of AudioSource; my suggestion was simply to move it to websockets.py so that we hide it from the end user. I don't think such a proxy audio source would be needed under normal circumstances.

Resolved review threads on: .dockerignore, Dockerfile, src/diart/console/serve.py, src/diart/websockets.py

janaab11 commented Jan 3, 2025

> Oh it would definitely still be a subclass of AudioSource; my suggestion was simply to move it to websockets.py so that we hide it from the end user. I don't think such a proxy audio source would be needed under normal circumstances.

Okay this does make sense to me too - not exposing such websocket-specific functionality. Made the changes!


janaab11 commented Jan 3, 2025

Added error handling for the following edge cases in the send, close and _on_message_received methods:

```python
# Guard clauses at the top of each method: bail out quietly when the
# client has already vanished from the server's registry.
if client is None:
    return
if client_id not in self._clients:
    return
```

These edge cases can occur due to race conditions in the client lifecycle (connect/disconnect/cleanup) or network issues that lead to client-state mismatches between the server and client. I added warnings to catch these async timing issues and documented the edge-case conditions in each method's docstring.
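As an example, the guard in send ends up looking roughly like this (a simplified sketch against websocket-server's client dicts, not the exact code):

```python
import logging

logger = logging.getLogger(__name__)

def send(self, client_id: str, message: str) -> None:
    """Send a message to a client, tolerating clients that vanished."""
    client = next(
        (c for c in self.server.clients if c["id"] == client_id), None
    )
    if client is None:
        # The client disconnected between lookup and send (race condition)
        logger.warning(f"Client {client_id} not found, skipping send")
        return
    self.server.send_message(client, message)
```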


janaab11 commented Jan 3, 2025

Modified client.py to handle disconnects properly on KeyboardInterrupt events. I believe this was referenced in another issue as well. Do let me know if the implementation is too involved; I had liked the simplicity of the client before this.

Apart from this, complete documentation in the README is pending. Will get to that next.
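The shape of the change is roughly this (a sketch only, using the websocket-client package for illustration; the real client.py may differ):

```python
from websocket import create_connection

def run_client(host: str, port: int) -> None:
    ws = create_connection(f"ws://{host}:{port}")
    try:
        while True:
            # Stream audio chunks and print server predictions (elided)
            ...
    except KeyboardInterrupt:
        # Ctrl+C should disconnect cleanly instead of crashing
        pass
    finally:
        # Always close the socket so the server can free this client's resources
        ws.close()
```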

@janaab11 janaab11 requested a review from juanmc2005 January 6, 2025 13:45
@@ -202,6 +202,7 @@ def embedding_loader():
segmentation = SegmentationModel(segmentation_loader)
embedding = EmbeddingModel(embedding_loader)
config = SpeakerDiarizationConfig(
# Set the segmentation model used in the paper
juanmc2005 (Owner):

This isn't correct. To remove

@@ -332,20 +333,57 @@ diart.client microphone --host <server-address> --port 7007

See `-h` for more options.

### From the Dockerfile
juanmc2005 (Owner):

Suggested change:
  - ### From the Dockerfile
  + ### From a Docker container


You can also run the server in a Docker container. First, build the image:
```shell
docker build -t diart -f Dockerfile .
```
juanmc2005 (Owner):

`-f Dockerfile` is not needed, as Docker will pick up the file with that name in the specified directory.
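The build command then reduces to:

```shell
docker build -t diart .
```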


Run the server with default configuration:
```shell
docker run -p 7007:7007 --gpus all -e HF_TOKEN=<token> diart
```
juanmc2005 (Owner):

We should probably add a note somewhere saying that for GPU usage they need to install `nvidia-container-toolkit`.

Also, is there a way to pick up the HF token from the `huggingface-cli` config? That way we avoid passing it directly and leaving it in the terminal history. This is possible when running outside Docker, and we shouldn't make passing it explicitly mandatory here, as this is an important security feature.
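One possible approach: if the token was saved with `huggingface-cli login` (stored under ~/.cache/huggingface by default in recent huggingface_hub versions), the cache directory can be mounted instead of passing the token explicitly:

```shell
docker run -p 7007:7007 --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  diart
```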


Run with custom configuration:
juanmc2005 (Owner):

Suggested change:
  - Run with custom configuration:
  + Example with a custom configuration:

Raises
------
Warning
    If client not found in self._clients. Common cases:
juanmc2005 (Owner):

same as previous comment

```python
try:
    # Clean up pipeline state using built-in reset method
    client_state = self._clients[client_id]
    client_state.inference.pipeline.reset()
```
juanmc2005 (Owner):

Not sure a reset is required, because the pipeline will be removed from memory anyway.

```python
    # Ensure client is removed even if cleanup fails
    self._clients.pop(client_id, None)

def close_all(self) -> None:
```
juanmc2005 (Owner):

This should be called shutdown() because it shuts down the server after closing all clients.

Comment on lines +347 to +370:

```python
while retry_count < max_retries:
    try:
        self.server.run_forever()
        break  # If server exits normally, break the retry loop
    except OSError as e:
        logger.warning(f"WebSocket server connection error: {e}")
        retry_count += 1
        if retry_count < max_retries:
            delay = base_delay * (2 ** (retry_count - 1))  # Exponential backoff
            logger.info(
                f"Retrying in {delay} seconds... "
                f"(attempt {retry_count}/{max_retries})"
            )
            time.sleep(delay)
        else:
            logger.error(
                f"WebSocket server failed to start after {max_retries} attempts. "
                f"Last error: {e}"
            )
    except Exception as e:
        logger.error(f"Fatal server error: {e}")
        break
    finally:
        self.close_all()
```
juanmc2005 (Owner):

Now that I think about it, it's probably not required to retry starting the server, right? I mean if starting the server doesn't work, it's probably a configuration error that should be fixed by the developer, for example if the port is already in use. What do you think? What use case did you have in mind for retrying?


```python
    return ClientState(audio_source=audio_source, inference=inference)

def _on_connect(self, client: Dict[Text, Any], server: WebsocketServer) -> None:
```
juanmc2005 (Owner):

Maybe we should allow a maximum number of clients to connect? My reasoning is the following: if we have to copy StreamingInference instances (including models) for every new client, the server will most likely crash at some point (especially if sharing a GPU). However, given the system resources, we can probably estimate how many clients fit in the machine, or whether a new client fits in the remaining available resources.

If this is too complicated, we can simply add a parameter inside `__init__()` for the maximum number of simultaneous clients. Something like `client_pool_size: int = 4`.
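A minimal sketch of that simpler option (the rejection mechanics are illustrative, and _create_client_state stands in for the existing state-creation code):

```python
import logging

logger = logging.getLogger(__name__)

class StreamingInferenceHandler:
    def __init__(self, client_pool_size: int = 4):
        self.client_pool_size = client_pool_size
        self._clients = {}

    def _on_connect(self, client, server) -> None:
        if len(self._clients) >= self.client_pool_size:
            # Pool exhausted: refuse to build another StreamingInference
            # (and another copy of the models) for this client.
            logger.warning("Client pool full, rejecting client %s", client["id"])
            return
        self._clients[client["id"]] = self._create_client_state(client)
```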
