Performance is not ideal #333

dae · 2022-12-06T00:42:37Z

We currently use protobufjs, and would love to switch over to protobuf-es for the TypeScript ergonomics and ES6/tree shaking that it offers. Parts of our app deal with large messages, so I did a quick performance test, and unfortunately protobuf-es is rather behind at the moment. The test I used was decoding a 6MB binary message containing repeated fields with about 150k entries. Protobufjs parses it in about 100ms; protobuf-es takes about 350ms. For reference, the same message as JSON is about 27MB, and JSON.parse() parses it in about 100ms as well. If providing the message/proto would be helpful, please let me know.

NfNitLoop · 2022-12-07T16:00:34Z

Hi @dae -- I'm not a developer for this project, but I'm a user and just saw your issue. Providing a concrete use case is always helpful in bug reports, so I just thought I'd chime in and say yeah, that's useful. :) Especially if you can also provide minimal code to reproduce the performance issue. Then it becomes a de facto test case so any fixes can be tested with before/after code.
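For anyone wanting to put together the kind of minimal reproduction discussed above, here is a rough sketch of a timing harness. The helper name `medianMs` and the stand-in payload are assumptions made for illustration; only the JSON.parse() baseline is timed here, and the decode call under test would be passed in the same way.

```typescript
// Hypothetical helper (not part of any library): runs `fn` several times
// and returns the median wall-clock duration in milliseconds.
function medianMs(fn: () => void, runs = 5): number {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Stand-in payload with 150k entries, mirroring the entry count mentioned
// above. The real message and its schema are not known here.
const entries = Array.from({ length: 150_000 }, (_, i) => ({
  id: i,
  name: `entry-${i}`,
}));
const json = JSON.stringify(entries);

// JSON.parse() is the baseline. A protobuf decoder would be measured by
// passing its decode call to medianMs in place of this closure.
const jsonMs = medianMs(() => {
  JSON.parse(json);
});
console.log(`JSON.parse: ${jsonMs.toFixed(1)}ms for ${json.length} chars`);
```

Taking the median of a few runs keeps a single slow first run (JIT warm-up, GC) from dominating the number.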

felicio · 2023-02-10T21:19:50Z

Adding Protobuf-ES to protons's benchmarking suite at ipfs/protons#89, where it also performs the slowest.

dae · 2023-02-11T02:27:37Z

I suspect any large message will reproduce the issue, but can produce a sample one if the devs request it.

On a positive note, protobuf-es's code generation seems to be a lot faster than what we currently have with protobufjs, where we have to produce a static version, which is then used to generate the ts, and then a separate json version needs to be produced because the static version's resulting code is huge. protobuf-es appears to be clearly sub-second, when protobufjs was taking around 5 seconds to rebuild each time a proto file was changed.

felicio · 2023-02-13T19:38:19Z

Reference how protons dealt with perf issues at ipfs/protons#51.

jcready · 2023-02-13T20:36:49Z

Protons appears to use protobuf.js's UTF-8 decoder/encoder, which has correctness bugs; that is likely the reason protobuf-es uses the native TextEncoder/TextDecoder. If you're using the Node.js runtime for your benchmarks, you can comment on this bug to ask for a faster TextEncoder/TextDecoder: nodejs/node#39879, as Bun and Deno appear to have much faster implementations. Chrome, Firefox, and Safari are also generally much faster using the native TextEncoder/TextDecoder vs. protobuf.js's implementation (especially once you get above 32 characters): timostamm/protobuf-ts#184 (comment)
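As a small illustration of what the native TextEncoder/TextDecoder do with valid and invalid input, here is a sketch using only the standard APIs. It does not reproduce any specific protobuf.js bug; it only shows the behavior the native implementation guarantees.

```typescript
const encoder = new TextEncoder();
const lenientDecoder = new TextDecoder(); // malformed input becomes U+FFFD
const strictDecoder = new TextDecoder("utf-8", { fatal: true });

// Valid text round-trips unchanged, including multi-byte characters.
const text = "héllo, wörld 🌍";
const roundTripped = lenientDecoder.decode(encoder.encode(text));
console.log(roundTripped === text);

// A lone surrogate is not valid Unicode text. TextEncoder substitutes the
// replacement character U+FFFD, whose UTF-8 encoding is EF BF BD.
const loneSurrogate = encoder.encode("\uD800");
console.log(Array.from(loneSurrogate, (b) => b.toString(16)).join(" "));

// With `fatal: true`, the decoder throws on invalid UTF-8 instead of
// silently substituting U+FFFD.
let rejected = false;
try {
  strictDecoder.decode(new Uint8Array([0xff]));
} catch {
  rejected = true;
}
console.log(rejected);
```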

smaye81 · 2023-03-20T18:00:22Z

Just to add some context also - It was a design choice to focus on usability, developer experience, and conformance when creating Protobuf-ES.

Consequently, Protobuf-ES is nearly 100% compliant with the Protobuf conformance test suite. To further illustrate this, we've created this Protobuf Conformance repo which shows how other implementations fare against this test suite. You can see that other implementations might be super performant, but the conformance scores are not great.

Nevertheless, we have some ideas for improving performance that are on our roadmap for the future.

timostamm · 2023-08-30T13:27:07Z

Just a small update: Thanks to @kbongort, binary read performance received a ~2.5x bump in v1.2.1. See #459 for details.

dimo414 · 2023-10-17T00:47:06Z

> Nevertheless, we have some ideas for improving performance that are on our roadmap for the future.

Any chance you could share some more context on the improvements you're considering, or any sense of a timeline on those improvements landing? We haven't done a rigorous benchmark but our initial tests have shown observable performance regressions if we migrate from protobuf.js. It would be helpful to know what's in progress (or done already, as @timostamm called out).


smaye81 · 2023-10-23T15:57:52Z

Hi @dimo414. We don't have any timeline to report, but this is something we plan to address relatively soon. Among the additional things we want to explore, the first is investigating whether the changes made in #459 could also be applied to binary write performance as well as JSON read/write performance.

In addition, we'd like to investigate potentially supporting the optimize_for proto option for optimizing for speed. The downside is that it will increase code/bundle size in exchange for performance and potentially make testing a bit more difficult, so we want to think through how best to implement it.

malcolmstill · 2023-12-21T11:16:11Z

I was going to open a new issue for this but since writing / encoding performance is mentioned here latterly, I'll add to this discussion.

We have an issue where we are encoding a large number of f64s (doubles). In a particular example we are overall encoding around 4 million floats across a number of protobuf messages (~100 messages, so each one contains ~40,000 doubles), and that is taking, in this case, 1600 milliseconds overall when running .enc (in particular protoDelimited.enc, if it makes a difference).

The proto definition is more or less (there are some other fields but the overwhelming amount of data in each message is the data field):

message MyMessage {
  repeated double data = 1;
}

Profiling protoDelimited.enc for this series of messages shows that all the time is spent in double, with double being invoked separately for each f64 value (and allocating an 8-byte array each time).

In our case I'm able to show a ~30x encoding performance improvement (encoding takes ~50 milliseconds instead of ~1600 milliseconds) by extending the class BinaryWriter (packages/protobuf/src/binary-encoding.ts) with, say, an arrayDouble method:

arrayDouble(values: number[]): IBinaryWriter {
  let chunk = new Uint8Array(8 * values.length);
  const view = new DataView(chunk.buffer);
  for (const [i, value] of values.entries()) {
    view.setFloat64(8*i, value, true);
  }
  return this.raw(chunk);
}

Then in writePacked (packages/protobuf/src/private/binary-format-common.ts), and since I've only implemented this for double, special-case when the scalarTypeInfo returns method double to invoke a single arrayDouble call instead of n double calls:

export function writePacked(
  writer: IBinaryWriter,
  type: ScalarType,
  fieldNo: number,
  value: any[]
): void {
  if (!value.length) {
    return;
  }
  writer.tag(fieldNo, WireType.LengthDelimited).fork();
  let [, method] = scalarTypeInfo(type);

  if (method === "double") {
    writer["arrayDouble"](value as number[]);
  } else {
    for (let i = 0; i < value.length; i++) {
      (writer[method] as any)(value[i]);
    }
  }
  writer.join();
}

This seems to work in our case (I have a test where I encode a bunch of Math.random() values, encode the message, immediately decode and check that the original and decoded data match), but this is my first time looking at the protobuf-es code so I don't know enough to be sure there isn't some gotcha with doing this?

Assuming such a change is sensible this would make sense at least for some of the other primitive types, if not in general?
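To make the difference between the two strategies concrete outside of the library, here is a self-contained sketch. The function names are hypothetical and none of this is protobuf-es code; it only produces the payload bytes of a packed `repeated double` field (little-endian 8-byte values), without the field tag or length prefix.

```typescript
// Per-value strategy: one 8-byte allocation per double, copied together
// at the end. This mirrors the allocation pattern described above.
function encodeDoublesPerValue(values: number[]): Uint8Array {
  const chunks: Uint8Array[] = [];
  for (const value of values) {
    const chunk = new Uint8Array(8);
    new DataView(chunk.buffer).setFloat64(0, value, true); // little-endian
    chunks.push(chunk);
  }
  const out = new Uint8Array(8 * values.length);
  chunks.forEach((chunk, i) => out.set(chunk, 8 * i));
  return out;
}

// Batched strategy: a single allocation sized for the whole array.
function encodeDoublesBatched(values: number[]): Uint8Array {
  const out = new Uint8Array(8 * values.length);
  const view = new DataView(out.buffer);
  for (let i = 0; i < values.length; i++) {
    view.setFloat64(8 * i, values[i], true);
  }
  return out;
}

// Reads the payload back so the round trip can be checked.
function decodeDoubles(bytes: Uint8Array): number[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const values: number[] = [];
  for (let offset = 0; offset < bytes.byteLength; offset += 8) {
    values.push(view.getFloat64(offset, true));
  }
  return values;
}

const sample = Array.from({ length: 40_000 }, () => Math.random());
const perValue = encodeDoublesPerValue(sample);
const batched = encodeDoublesBatched(sample);
console.log(perValue.length, batched.length);
```

Both functions should yield byte-identical output, so any speed difference comes from allocation and copying rather than from the encoding itself.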

timostamm · 2024-01-02T13:47:48Z

Hey Malcolm, thanks for the comment! This could be applied to all packed protobuf types with fixed size. There's no free lunch though, the downsides are increased bundle size and breaking the IBinaryWriter interface.

Editions will stir up the code paths a bit, so it does not make sense to pull in this performance improvement right now, but it does seem worthwhile to do so after we have implemented support for editions. I wonder if an argument to fork() with a size to allocate in advance would be an alternative here. It's also likely that the perf improvement for parsing added in #459 applies to serializing as well.
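As a rough sketch of how the batching idea might extend to the other fixed-size packed types, the helper below takes the element width and a DataView setter. `packFixed` and the wrappers around it are hypothetical names, not part of BinaryWriter or IBinaryWriter, and the 64-bit integer types are left out because they would need bigint values.

```typescript
// Writes one value into the view at the given byte offset.
type FixedWriter = (view: DataView, offset: number, value: number) => void;

// Hypothetical helper: encodes all values into a single preallocated
// buffer. `width` is the size in bytes of one element on the wire.
function packFixed(
  values: number[],
  width: 4 | 8,
  write: FixedWriter,
): Uint8Array {
  const out = new Uint8Array(width * values.length);
  const view = new DataView(out.buffer);
  values.forEach((value, i) => write(view, width * i, value));
  return out;
}

// Protobuf fixed-width scalars are little-endian on the wire.
const packDouble = (values: number[]) =>
  packFixed(values, 8, (v, o, x) => v.setFloat64(o, x, true));
const packFloat = (values: number[]) =>
  packFixed(values, 4, (v, o, x) => v.setFloat32(o, x, true));
const packFixed32 = (values: number[]) =>
  packFixed(values, 4, (v, o, x) => v.setUint32(o, x, true));
const packSfixed32 = (values: number[]) =>
  packFixed(values, 4, (v, o, x) => v.setInt32(o, x, true));

console.log(packFixed32([1, 256])); // bytes: 01 00 00 00 00 01 00 00
```

A single generic helper like this trades a small amount of indirection for not having to add one method per scalar type, which may matter given the bundle-size concern raised above.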

timostamm · 2024-01-24T14:01:31Z

It looks like a perf improvement similar to the one applied for binary read (see #333 (comment)) can also be applied for binary write. Some details were noted in #674 (comment).

Ekrekr · 2024-08-21T15:20:25Z

> We currently use protobufjs, and would love to switch over to protobuf-es

Just to add a +1: this is the same for us. We can't switch over because the slowdown for encode and decode we would experience is about 15x.

This library is awesome though! You write very clean code, that's nice to interface with.

Ekrekr · 2024-08-22T13:29:16Z

I set up a repo for comparing encode speeds of protobufs: https://github.com/Ekrekr/prototiming.

~~Looking into the flame graphs, even the UTF8 string encoding is faster in ProtobufJS, but they do some crazy stuff~~ the ECMA encoding I think is faster, it's the memory copying that slows this down:

Protobuf-ES encode: 384.909ms. Uses the ECMA TextEncoder for UTF-8 encoding (protobuf-es/packages/protobuf/src/wire/binary-encoding.ts, line 249 in f76c9f5):

let chunk = this.encodeUtf8(value);

Protobufjs encode: 240.668ms. Uses this unholy piece of code to write the encoded bytes directly into the buffer: https://github.com/protobufjs/protobuf.js/blob/5b502e173d6c5b1bcbde50e407e85f88576bb10a/lib/utf8/index.js#L71.

It seems like protobufjs is a lot more efficient at writing directly to the buffer, rather than chunking and copying around array buffers.

I'll dig into this a bit more.

Ekrekr · 2024-08-22T15:10:24Z

The clearest place to optimize seems to be in finish.

An array buffer is made during chunking (protobuf-es/packages/protobuf/src/wire/binary-encoding.ts, line 249 in f76c9f5):

let chunk = this.encodeUtf8(value);

And a new array buffer is made at finish, and the contents of the chunks are copied over to it iteratively (same file, line 139 in f76c9f5):

let bytes = new Uint8Array(len);

If, instead of creating a new array buffer at finish, an array buffer were made at the start and written to directly, this would avoid writing the data twice.

Edit: my experiments with this are in #964.
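For readers following along, here is a minimal sketch of the "allocate up front and write directly" idea. The `GrowableWriter` class is hypothetical and is not the code from #964 or from protobuf-es; it also skips the varint length prefix a real protobuf string field needs, which is part of what makes direct writing harder in practice.

```typescript
// Hypothetical writer that appends into one growable buffer instead of
// collecting chunks and copying them all at finish.
class GrowableWriter {
  private buf = new Uint8Array(64);
  private pos = 0;
  private readonly encoder = new TextEncoder();

  // Makes room for `extra` more bytes, doubling capacity when needed.
  private ensure(extra: number): void {
    const needed = this.pos + extra;
    if (needed <= this.buf.length) return;
    const grown = new Uint8Array(Math.max(needed, this.buf.length * 2));
    grown.set(this.buf.subarray(0, this.pos));
    this.buf = grown;
  }

  raw(bytes: Uint8Array): this {
    this.ensure(bytes.length);
    this.buf.set(bytes, this.pos);
    this.pos += bytes.length;
    return this;
  }

  utf8(text: string): this {
    // A UTF-16 code unit needs at most 3 bytes in UTF-8, so this
    // reservation is always large enough for encodeInto.
    this.ensure(text.length * 3);
    const { written } = this.encoder.encodeInto(
      text,
      this.buf.subarray(this.pos),
    );
    this.pos += written;
    return this;
  }

  // Returns a view over the written bytes without copying. The view
  // shares memory with the writer, so the writer should not be reused.
  finish(): Uint8Array {
    return this.buf.subarray(0, this.pos);
  }
}

const written = new GrowableWriter()
  .utf8("héllo")
  .raw(new Uint8Array([1, 2]))
  .finish();
console.log(written.length); // 6 UTF-8 bytes for "héllo" plus 2 raw bytes
```

The buffer still gets copied whenever it grows, but doubling the capacity keeps the total copying proportional to the final size, and `TextEncoder.encodeInto` avoids the intermediate array that `encode` would allocate.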

felicio mentioned this issue Feb 10, 2023

chore: benchmark Protobuf-ES ipfs/protons#89

Merged

ntkme mentioned this issue Oct 28, 2024

Implement sass --embedded in pure JS mode sass/dart-sass#2413

Open
