This issue was moved to a discussion.

Client for the websocket server #113

Closed
funboarder13920 opened this issue Dec 16, 2022 · 9 comments

@funboarder13920

Hello,

I'm trying to use the websocket server for diarization.
I implemented my own client because none is provided. I do get diarization output, but the speaker is always speaker0.

Here is the JavaScript code for my client:

// Set up the start and stop buttons.
document.getElementById('start-button').onclick = start;
document.getElementById('stop-button').onclick = stop;

// Global variables to hold the audio stream and the WebSocket connection.
var stream;
var socket;

function b64encode(chunk) {
    // View the float32 samples as raw bytes (4 bytes per sample, platform byte order).
    const bytes = new Uint8Array(new Float32Array(chunk).buffer);

    // Build a binary string with one character per byte, then base64-encode it.
    // The result is plain ASCII, so no further URI or UTF-8 decoding is needed.
    return btoa(String.fromCharCode.apply(null, bytes));
}


function start() {
    // Disable the start button.
    document.getElementById('start-button').disabled = true;
    // Enable the stop button.
    document.getElementById('stop-button').disabled = false;

    // Get access to the microphone.
    navigator.mediaDevices.getUserMedia({ audio: true }).then(function (s) {
        stream = s;

        // Create a new WebSocket connection.
        socket = new WebSocket('ws://localhost:7007');

        // When the WebSocket connection is open, start sending the audio data.
        socket.onopen = function () {
            var audioContext = new AudioContext();
            var source = audioContext.createMediaStreamSource(stream);
            var processor = audioContext.createScriptProcessor(1024, 1, 1);
            source.connect(processor);
            processor.connect(audioContext.destination);

            processor.onaudioprocess = function (event) {
                var data = event.inputBuffer.getChannelData(0);
                // var dataString = String.fromCharCode.apply(null, new Uint16Array(data));
                console.log("sending")
                socket.send(b64encode(data));
            };
        };

        socket.onmessage = function (e) {
            console.log(e);
        }
    }).catch(function (error) {
        console.error('Error getting microphone input:', error);
    });
}

function stop() {
    // Disable the stop button.
    document.getElementById('stop-button').disabled = true;
    // Enable the start button.
    document.getElementById('start-button').disabled = false;

    // Close the WebSocket connection.
    socket.close();
    // Stop the audio stream.
    stream.getTracks().forEach(function (track) { track.stop(); });
}

Do you have the code for the client you used? I would like to make sure my client is correct before looking anywhere else.
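One way to check the client independently of the server is to copy a single message from the browser console and decode it offline. The sketch below is only an illustration: `inspect_message` is a hypothetical helper, not part of diart, and it assumes the browser wrote the float32 samples in little-endian order (true on practically all current hardware).

```python
import base64
import struct


def inspect_message(message: str) -> dict:
    """Decode one base64 message and summarize the float32 samples in it.

    Hypothetical debugging helper. Assumes little-endian float32 samples,
    which is what a Float32Array buffer contains on common platforms.
    """
    raw = base64.b64decode(message)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    samples = struct.unpack(f"<{count}f", raw)
    return {
        "num_samples": count,
        # Web Audio samples are nominally in [-1.0, 1.0]; values far outside
        # that range suggest the bytes are being misinterpreted somewhere.
        "min": min(samples) if samples else 0.0,
        "max": max(samples) if samples else 0.0,
    }


if __name__ == "__main__":
    # Stand-in for a message captured from the JavaScript client.
    captured = base64.b64encode(struct.pack("<4f", 0.0, 0.5, -0.5, 1.0)).decode("utf-8")
    print(inspect_message(captured))
```

With the client above, each message should report 1024 samples, matching the buffer size passed to `createScriptProcessor`.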

Best,

@juanmc2005 juanmc2005 added the question Further information is requested label Dec 16, 2022
@ghost

ghost commented Jan 5, 2023

I also made a client and am having trouble getting any response from the server. How did you integrate your client with the websocket server code?

@juanmc2005
Owner

Hi @funboarder13920 and @jason-daiuto, sorry for the late answer, I've been super busy writing my PhD thesis.
I think I have the source code for the websocket client but I don't have access to it right now (I'm on vacation).

I'll make sure to post it here when I get back, but I think it would be useful to have a WebsocketClient and even a diart.client/diart.serve to play more easily with these features. I'll add this to the backlog.

@juanmc2005 juanmc2005 added the feature New feature or request label Jan 9, 2023
@juanmc2005 juanmc2005 added this to the Version 0.7 milestone Jan 9, 2023
@juanmc2005
Owner

I'm starting to suspect there might be a bug related to RealTimeInference hooks not being called appropriately.
@jason-daiuto could you check if you get a response from the server using version 0.5?

@juanmc2005
Owner

Update: I haven't been able to find the code I used at the time. I'll write it again from scratch and add it as a tool (e.g. diart.client) in the next release but it'll take a bit more time.

In the meantime, I'm happy to help with any specific difficulties you may be facing on the client side.
Also, if you managed to get it working and you want to contribute, I would gladly merge a PR with the feature :)

@funboarder13920
Author

I attached the code for a very simple client in my initial message. It at least sends the data to the server, but the results I get are not very convincing.
It could be a starting point, but there is probably something wrong in it that I can't figure out.

@juanmc2005
Owner

It may be a problem with the audio encoding on the client side. Maybe the discussion of issue #68 (before the existence of WebSocketAudioSource) will be useful.

I suggest you first try a simple Python client without any UI, using the same library (base64) for encoding and decoding, to rule out possible sources of error.

Your client should encode the message in the following way:

message = base64.b64encode(chunk.astype(np.float32).tobytes()).decode("utf-8")
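As a rough illustration of that suggestion, here is a standard-library sketch of the encoding step plus a chunked send loop. The helper names (`encode_chunk`, `decode_chunk`, `send_in_chunks`) are hypothetical and not part of diart. `array.array("f", ...)` is used as a stand-in for `chunk.astype(np.float32).tobytes()`; both produce native-order 32-bit floats. The `send` argument is any callable that accepts a string, so the loop can be tried without a running server.

```python
import array
import base64
from typing import Callable, List, Sequence


def encode_chunk(samples: Sequence[float]) -> str:
    """Pack samples as native-order float32 bytes and base64-encode them."""
    return base64.b64encode(array.array("f", samples).tobytes()).decode("utf-8")


def decode_chunk(message: str) -> List[float]:
    """Inverse of encode_chunk, useful for checking a round trip locally."""
    decoded = array.array("f")
    decoded.frombytes(base64.b64decode(message))
    return decoded.tolist()


def send_in_chunks(
    samples: Sequence[float], chunk_size: int, send: Callable[[str], None]
) -> int:
    """Encode consecutive chunks and pass each one to `send`.

    Returns the number of messages sent. The last chunk may be shorter
    than `chunk_size`.
    """
    sent = 0
    for start in range(0, len(samples), chunk_size):
        send(encode_chunk(samples[start:start + chunk_size]))
        sent += 1
    return sent


if __name__ == "__main__":
    outbox: List[str] = []
    # Collect messages in a list instead of sending them over the network.
    send_in_chunks([0.0, 0.25, -0.5, 1.0, 0.75], 2, outbox.append)
    print([decode_chunk(message) for message in outbox])
```

In an actual client, `send` would be the send method of whatever WebSocket connection object is in use. Comparing `decode_chunk` output with the original samples confirms the encoding before any server is involved.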

@juanmc2005 juanmc2005 added the bug Something isn't working label Mar 11, 2023
@juanmc2005
Owner

@jason-daiuto I managed to reproduce the issue you mentioned.
The client is able to send data and the server receives all of it.
However, responses are not being sent back to the client.

The implementation of the websocket server as an audio source is a bit tricky, particularly when sending things back.
I think it may be a good idea to implement it in a different way, maybe as a wrapper of RealTimeInference.
I'll try a few things and post any updates here.

@juanmc2005
Owner

juanmc2005 commented Mar 13, 2023

@funboarder13920 @jason-daiuto I just re-wrote the websocket audio source and added diart.serve and diart.client to try things out.

Could you install from the branch fix/ws and let me know if the client and server interact correctly?

I'll open a PR with these features shortly.

@juanmc2005
Owner

These features are now in the develop branch and will be part of the v0.7 release.

Repository owner locked and limited conversation to collaborators Mar 24, 2023
@juanmc2005 juanmc2005 converted this issue into discussion #134 Mar 24, 2023
