Investigate into Message Pack and GZIP support #553
Comments
The CBOR format, which was inspired by MessagePack, seems to be a better choice.
@artas90 thanks for pointing this out. |
We're looking at moving more of our remotable services in VS from JSON to MessagePack. I'm concerned about losing interoperability with node-based processes. I'd love to see |
@artas90 As I already have a great deal of investment and code based on MessagePack, I'm biased. But I wonder if you want to persuade me as to why CBOR is superior to MessagePack. They are both spec'd standards. One being RFC and the other spec'd elsewhere isn't particularly compelling. JSON itself at least started as a spec on https://www.json.org/ as I understand it. |
FWIW, the format we use tends to be important to maintain across an entire RPC session. While we can change text encodings from message to message, switching between JSON and MessagePack isn't supported in StreamJsonRpc. So while you could declare the encoding to be messagepack in your HTTP headers, at least for our purposes we need to know whether MessagePack or JSON will be used up front when establishing the connection. |
@AArnott Honestly, I have never used either CBOR or MessagePack. I just wanted to mention that a standard such as CBOR exists and might be worth considering as well.
I rebased the work I started on master today, and a first version of this is here: https://github.com/microsoft/vscode-languageserver-node/tree/dbaeumer/encodingSupport2 Implementation-wise, this works as follows:
For MessagePack I would propose that we indicate it using a different
Thanks for the update, @dbaeumer. Your design sounds like you're definitely planning on a mix of messagepack and JSON messages on the same connection, which is exactly what I was hoping we wouldn't do. I'll have to review what kind of state our formatters build up over a connection to see what kind of features will break if we switch formats from one message to another. |
This is mainly driven by the fact that I want to offer an upgrade path from the current behavior. In my current thinking the initialize handshake will always be plain json (like the header is always encoded using Otherwise LSP would need to specify command line arguments on how to define content type and encoding on startup. And a server needs to document the supported types and encodings since they are reused in different clients with different content type and encoding capabilities. |
Does every LSP server already have a client portion that knows how to spawn that particular LSP server? In VS that's the way it is. I thought perhaps it was this way in VS Code as well. As I understood it, LSP doesn't prescribe how the LSP server is spawned, what command line args it takes, or how to establish the pipe between the client and server processes. Then the spec itself doesn't have to prescribe a handshake. For example, if I wrote an LSP server for F#, I would also have to write the LSP client that spawns the F# LSP server. Since I want a super-fast experience, I choose to encode with MessagePack. My client and server are hard-coded to always use MessagePack, and things just work. All that would be left is for the But I'm guessing I'm missing something. @dbaeumer can you tell me why you feel it's something that needs to be specified at the LSP spec level? Consider: If we are to include MessagePack provisions in the LSP spec, we could get far more benefit by specifying indexes for each property so that we can encode the data as arrays instead of maps in messagepack. We'd never have to serialize and transmit property names, which can dramatically reduce the payload size and time spent serializing. |
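To make the array-versus-map point above concrete, here is a minimal sketch. It assumes a fixed, spec-defined index for each property (the index assignment below is made up for illustration) and uses JSON as a dependency-free stand-in for MessagePack; the same idea applies to MessagePack maps versus arrays, since the saving comes from not transmitting property names.

```typescript
// Hypothetical payload shape; the field order used below is an assumed index assignment.
interface Position { line: number; character: number }
interface Range { start: Position; end: Position }

const utf8 = new TextEncoder();

// Keyed (map-style) form: every property name travels with every value.
function encodeKeyed(r: Range): Uint8Array {
  return utf8.encode(JSON.stringify(r));
}

// Positional (array-style) form: index 0 = start, 1 = end;
// inside a position, index 0 = line, 1 = character.
function encodePositional(r: Range): Uint8Array {
  const compact = [[r.start.line, r.start.character], [r.end.line, r.end.character]];
  return utf8.encode(JSON.stringify(compact));
}

// The receiver rebuilds the object from the agreed indexes.
function decodePositional(bytes: Uint8Array): Range {
  const parsed = JSON.parse(new TextDecoder().decode(bytes)) as [[number, number], [number, number]];
  const [[startLine, startChar], [endLine, endChar]] = parsed;
  return {
    start: { line: startLine, character: startChar },
    end: { line: endLine, character: endChar },
  };
}

const sampleRange: Range = { start: { line: 10, character: 4 }, end: { line: 10, character: 17 } };
const keyedBytes = encodeKeyed(sampleRange);
const positionalBytes = encodePositional(sampleRange);
console.log(`keyed: ${keyedBytes.length} bytes, positional: ${positionalBytes.length} bytes`);
```

The trade-off is that both sides must agree on the index assignment ahead of time, which is why it would have to live in the spec.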
This requires that we spec command-line arguments, at least so that the way a content type is picked is the same across servers. This simply adds another dimension to the spec, which I want to avoid. But more important is the fact that we have requests to support the following:
Hardwiring the content type and encoding will make it a lot harder to support those two things. I have no problem specifying that the content type and encoding must stay fixed after the initialize (that is, they don't change on every message, as would be possible with HTTP).
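A rough sketch of the rule described here (handshake in plain JSON, encoding chosen once, then fixed) might look like the following. The class, method names, and the way capabilities are exchanged are all assumptions for illustration; nothing like this is specified in LSP.

```typescript
type Encoding = "json" | "msgpack";

// Tracks the encoding of one connection. All names here are hypothetical.
class EncodingState {
  private current: Encoding = "json"; // the initialize handshake is always plain JSON
  private locked = false;

  encoding(): Encoding {
    return this.current;
  }

  // Called once, after the initialize request/response pair has been exchanged.
  // Picks the first client preference that the server also supports.
  completeInitialize(clientPrefers: Encoding[], serverSupports: Encoding[]): Encoding {
    if (this.locked) {
      throw new Error("encoding is fixed after initialize");
    }
    let chosen: Encoding = "json"; // fall back to JSON when there is no overlap
    for (const candidate of clientPrefers) {
      if (serverSupports.indexOf(candidate) >= 0) {
        chosen = candidate;
        break;
      }
    }
    this.current = chosen;
    this.locked = true;
    return chosen;
  }
}

const state = new EncodingState();
console.log(state.encoding()); // still JSON during the handshake
console.log(state.completeInitialize(["msgpack", "json"], ["json", "msgpack"]));
```

Because the state locks after the first call, a formatter can rely on a single encoding for the rest of the session, which is the property StreamJsonRpc needs.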
I'm imagining that even if LSP is "relayed" across multiple nodes (e.g. LSP Server<->LSP Client on Live Share host<->Live Share host<->Live Share guest<->LSP client on guest) that each individual leg may be encoded with MessagePack or not. So it's not something necessarily even in the LSP spec -- just a decision that the individual JSON-RPC library is using when it sets up an individual leg on the connection. I'd like MessagePack encoding support in vscode-jsonrpc so we can use MessagePack for (non-LSP related) JSON-RPC messages as part of our 'brokered services' between VS and VS Code. Beyond that, @CyrusNajmabadi expressed interest in switching their LSP server to messagepack in dotnet/roslyn#42595 (comment), and I think we can do that without any change to the LSP spec (or vscode-jsonrpc for that matter). But if VS Code were to participate in that same perf win, vscode-jsonrpc would need to support MessagePack as well, and the VS Code LSP system would have to support allowing the LSP client that spawns the Roslyn LSP server to set up the |
I think there are two use cases. First, you have every connection under control and want to use MessagePack; with the changes I made to the jsonrpc library, that is possible. Second, you don't have every connection under control (e.g. you connect to a running server) and need to negotiate the encoding; this needs support in LSP.
Sounds reasonable. But how can you connect to an existing server without controlling that connection? |
You control the connection but not the content type / encoding. That could depend on the version of the server or of the client connecting.
I'm checking out your |
Oh! I found last night that you've already merged it into master. Great.
Yes, it got merged, with the difference that it is no longer header-negotiable.
I will close the issue. Support has been added to the library but nothing has been added to the spec yet. To make use of this in the protocol we need to start on the LSP level and spec it there. |
@dbaeumer has support for using msgpack instead of JSON been finalized? If yes, is there documentation on enabling it? If not, what's missing and how can I help to push it forward? I believe using msgpack will provide significant performance benefits for our LSP implementation. |
FWIW, one idea my team has played with is the LSP server somehow advertising (through a manifest perhaps) that it supports msgpack via being launched by some switch. Then the LSP client spawns the server with that switch and assumes msgpack instead of JSON. |
The support has been finalized in the JSON-RPC library, but there is still no handshake to upgrade a server connection. So you need to do what @AArnott suggests: spawn your server with a special command-line argument.
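As a sketch of the manifest idea, assuming a made-up manifest shape and a made-up `--msgpack` switch (neither exists in the LSP spec or in the library), the client-side decision could look like this:

```typescript
// Hypothetical manifest: the server advertises an optional switch that turns on msgpack.
interface ServerManifest {
  command: string;
  args: string[];
  msgpackSwitch?: string; // e.g. "--msgpack"; the flag name is an assumption
}

type WireEncoding = "json" | "msgpack";

interface LaunchPlan {
  command: string;
  args: string[];
  encoding: WireEncoding;
}

// Decide how to spawn the server and which encoding the client should assume afterwards.
function planLaunch(manifest: ServerManifest, clientHasMsgpack: boolean): LaunchPlan {
  if (clientHasMsgpack && manifest.msgpackSwitch !== undefined) {
    return {
      command: manifest.command,
      args: manifest.args.concat([manifest.msgpackSwitch]),
      encoding: "msgpack",
    };
  }
  // Either side lacks msgpack support: launch as before and keep JSON.
  return { command: manifest.command, args: manifest.args.slice(), encoding: "json" };
}

const examplePlan = planLaunch(
  { command: "my-language-server", args: ["--stdio"], msgpackSwitch: "--msgpack" },
  true
);
console.log(examplePlan.command, examplePlan.args.join(" "), examplePlan.encoding);
```

The point is that the encoding is decided before the process starts, so no in-band handshake is needed.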
Adding the command line argument is reasonable and easy to do. But is there some configuration on the client side? How do I tell the client to use msgpack for requests and responses? |
@vinistock good point. I never added that :-( but it will be easy to do. Please note that this will only work for transports other than Node IPC.
Do you have any pointers on how to get started? I can take a stab at making this an option for clients. About the transports: can you elaborate? Would that impact language servers implemented in languages other than NodeJS? |
I'm confused over why a transport mechanism or platform would limit whether you can encode LSP as msgpack. We use msgpack across IPC with a node process for non-LSP purposes already. |
@vinistock it was my mistake. The code is actually already there, since you can always pass a function as a server (see https://insiders.vscode.dev/github/microsoft/vscode-languageserver-node/blob/main/client/src/node/main.ts#L126) which returns a @AArnott Node provides a special IPC mechanism where you can directly send and receive JS objects, and the Node runtime does the serialization / deserialization. You could maybe use msgpack -> number[] -> Node IPC, but I guess this is counterproductive. You can of course use msgpack with Node using sockets / pipes.
If you pass a function as the server, because you only return a Is there interest in adding an option and handling this inside the library? Something like
const serverOptions = {
  run: /* ... */,
  debug: /* ... */,
  transport_encoding: TransportEncodings.Msgpack
}
@vinistock yes, that is correct. What we can think of is supporting something like this in options: https://insiders.vscode.dev/github/microsoft/vscode-languageserver-node/blob/main/jsonrpc/src/node/test/messages.test.ts#L237 I don't want to have any hard reference to external libraries. |
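To illustrate the "no hard reference to external libraries" constraint, here is a sketch in which the connection only knows an encoder/decoder interface and the caller supplies the implementation. The interface and class names are hypothetical and are not the actual vscode-jsonrpc types; a MessagePack codec would implement the same interface in user code by wrapping whichever msgpack package the user prefers.

```typescript
// Hypothetical message and codec shapes (not the library's real types).
interface RpcMessage {
  jsonrpc: "2.0";
  id?: number;
  method?: string;
  params?: unknown;
  result?: unknown;
}

interface WireCodec {
  name: string; // content type this codec produces
  encode(msg: RpcMessage): Uint8Array;
  decode(bytes: Uint8Array): RpcMessage;
}

// Default codec shipped with the sketch: plain JSON, no external dependency.
const jsonCodec: WireCodec = {
  name: "application/json",
  encode: (msg) => new TextEncoder().encode(JSON.stringify(msg)),
  decode: (bytes) => JSON.parse(new TextDecoder().decode(bytes)) as RpcMessage,
};

interface ConnectionOptions {
  codec?: WireCodec; // user-supplied; falls back to JSON
}

class SketchConnection {
  private readonly codec: WireCodec;

  constructor(options: ConnectionOptions = {}) {
    this.codec = options.codec ?? jsonCodec;
  }

  codecName(): string {
    return this.codec.name;
  }

  serialize(msg: RpcMessage): Uint8Array {
    return this.codec.encode(msg);
  }

  deserialize(bytes: Uint8Array): RpcMessage {
    return this.codec.decode(bytes);
  }
}

const connection = new SketchConnection();
const wireBytes = connection.serialize({ jsonrpc: "2.0", id: 1, method: "initialize" });
console.log(connection.codecName(), wireBytes.length);
```

With this shape the library never imports a msgpack package itself; it only calls whatever codec it was handed.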
Thank you for considering it, I think it'll be a good quality of life improvement when using different encodings - since we can avoid having to deal with the life cycle of the process which is already implemented in the library. I put together a draft of what this could look like in #1250. Please let me know if you're okay with the addition and the approach. |
Currently the transport uses a JSON string as a payload. To reduce the payload we could:
We should investigate both options and measure the savings each one gives.
This must be optional and would best be negotiated using header properties, as in HTTP.
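For reference, the LSP base protocol already frames each message with a `Content-Length` header and an optional `Content-Type` header, so a header-based scheme could be sketched as below. The `application/x-msgpack` value is only an assumed label for illustration, and the helper names are hypothetical.

```typescript
// Default content type per the LSP base protocol when no Content-Type header is sent.
const DEFAULT_CONTENT_TYPE = "application/vscode-jsonrpc; charset=utf-8";

// Build one framed message: header lines, an empty line, then the raw body bytes.
function frame(body: Uint8Array, contentType?: string): Uint8Array {
  let header = `Content-Length: ${body.length}\r\n`;
  if (contentType !== undefined) {
    header += `Content-Type: ${contentType}\r\n`;
  }
  header += "\r\n";
  const head = new TextEncoder().encode(header); // header text is plain ASCII
  const out = new Uint8Array(head.length + body.length);
  out.set(head, 0);
  out.set(body, head.length);
  return out;
}

// Parse one complete framed message and report which content type it declared.
function parseFrame(data: Uint8Array): { contentType: string; body: Uint8Array } {
  let sep = -1;
  for (let i = 0; i + 3 < data.length; i++) {
    if (data[i] === 13 && data[i + 1] === 10 && data[i + 2] === 13 && data[i + 3] === 10) {
      sep = i; // position of the \r\n\r\n that ends the header part
      break;
    }
  }
  if (sep < 0) throw new Error("incomplete header");
  const headers: Record<string, string> = {};
  for (const line of new TextDecoder().decode(data.subarray(0, sep)).split("\r\n")) {
    const idx = line.indexOf(":");
    if (idx < 0) throw new Error(`malformed header line: ${line}`);
    headers[line.slice(0, idx).trim().toLowerCase()] = line.slice(idx + 1).trim();
  }
  const length = Number(headers["content-length"]);
  if (!(length >= 0) || Math.floor(length) !== length) {
    throw new Error("missing or invalid Content-Length");
  }
  const start = sep + 4;
  if (data.length < start + length) throw new Error("incomplete body");
  return {
    contentType: headers["content-type"] ?? DEFAULT_CONTENT_TYPE,
    body: data.subarray(start, start + length),
  };
}

const jsonBody = new TextEncoder().encode(JSON.stringify({ jsonrpc: "2.0", id: 1, method: "initialize" }));
console.log(parseFrame(frame(jsonBody)).contentType);
```

A receiver could dispatch on the parsed content type to pick a decoder, though as discussed later in the thread the merged library support is not header-negotiable.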