-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add event for error statistics #18
Comments
I am not sure whether this should be integrated into a signaling server. It might be better suited in a small REST server as an additional prometheus endpoint. |
We already have some prometheus statistics in the signal tower. Of course, this makes statistics more visible because they become part of the protocol, but the event can be optional and would not need to be supported by other palava protocol servers (except that they would need to actvely ignore it). I actually like that the setup is kept so simple that we only have one server software (and one standing connection). |
I can understand both standpoints. Collecting client side errors must not be part of the palava protocol and thus could be collected on a separate service. But this also means another, server , another connection, another source of errors in a distributed system. I would go for a single endpoint. Since nobody else is implementing the palava protocol. Adding the error event does nö harm and is quite simple, right? |
If we implement TURN, then this will probably be less necessary, because when the client asks for a turn credential, we know that stun failed. The only information that we would be missing out on would be how often even TURN fails... (and probably different reasons why the connection with stun failed, but I'm not even sure if there are now ways to find that out - there didn't use to be any when we first researched this some years ago) |
New proposition (since we configure turn in the client directly and not after a failed connection now):
There are new types now for differentiating direct and relayed (turn) connection and the two errors that we can meaningfully differentiate in the client. Any thoughts? |
We need a protocol extension for the palava protocol and an implementation of it in the signal tower for receiving error stats from the client and adding them to prometheus. Proposition:
Of course we can add more types than peerconnection_established and peerconnection_error in the future, but for now, I would start with those 2. The category can be one of different (error) categories and can also be optional(?). For peerconnection_error the categories will most likely be something like "nat_failed", "media_incompatible", "network_error" (just examples, we'll see if we can distinguish these in the in the frontend). For peerconnection_established, these could be "first_connection" or "reconnect"...
The text was updated successfully, but these errors were encountered: