TextDecoder.decode/JSON.parse vs makeReader Performance... #616
It sounds like you're using Ion text, is that true? If so, please also try Ion binary. Its performance should be much better, especially for larger data sets.
Could you elaborate on what you're doing? Does it take hundreds of seconds to "create the Ion reader", or to actually read all of the data with it? Could you share your benchmark code?
I won't be able to answer this without better understanding what's being measured.
@zslayton on the server-side I am using the Java Jackson Ion serializer to create the Ion byte array that gets sent to the client.

From there, the client receives an `arrayBuffer` and creates a reader from it with `ion.makeReader`. That call takes ~565-767 milliseconds to complete...
Thanks for clarifying. From a skim of the Jackson source, it looks like you'll need to call `setCreateBinaryWriters(true)` on the factory to get binary Ion output.

That's definitely not expected.
@zslayton Using `setCreateBinaryWriters(true)` did the trick; after setting that, the client's read time improved considerably. Now I just wish the reader was easier to work with ;-). It would be great if the DOM API could load just the properties associated with a structure and only load a property's value when it is accessed (i.e., lazily loading values). That would effectively give the DOM API a similar feel to working with a reader... Or perhaps helper/util functions to make finding keys in a reader easier...
I've opened #617 to track this.
@zslayton with the same object structure/data mentioned above, the `ion.load()` call still takes a long time. It was my understanding from a previous question that ion would only load up to the first value it encounters, which in this case would be the entire (deeply nested) object.
I believe you're referring to this quote?:
The example document you provided is considered a stream with a single (deeply nested) value. A longer stream might look like this:
You can get the incremental parsing behavior you're describing by using the (less friendly) streaming `Reader` API.
While I wouldn't expect a pure-JS (or TS) implementation like ours to outperform the native functions baked into the browser, that's a wider performance gap than I would hope for. However, without seeing an example document it will be hard to diagnose any performance problems. How large is your file? Are you able to share it? You switched to binary Ion earlier in this thread, but just now you mentioned:
Are you comparing the Ion text parser or the binary?
@zslayton I actually tested both ion binary and ion text with the same data. You can grab a copy of the sample data I have been testing with here.

Slight correction to my original time measurements; the ion numbers differed somewhat from what I first posted. I then compared those measurements to `TextDecoder.decode`/`JSON.parse` with an encoded JSON string, which took ~81 milliseconds.

I agree ion may not be able to beat the native baked-in browser functions at deserialization. I guess I was hoping ion would provide a smaller binary payload from the server to the client while being a competitive alternative to the native baked-in browser functions, all while being schemaless...
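For reference, the native baseline described above can be measured with a few lines of plain Node; the payload below is a small stand-in, not the actual sample data from this thread:

```javascript
// Stand-in payload; the real sample data in this thread is much larger.
const payload = { positions: Array.from({ length: 1000 }, (_, i) => i * 0.5) };
const bytes = new TextEncoder().encode(JSON.stringify(payload));

// Time the native TextDecoder.decode + JSON.parse path.
const start = process.hrtime.bigint();
const parsed = JSON.parse(new TextDecoder().decode(bytes));
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

console.log(parsed.positions.length); // 1000
console.log(`parsed in ${elapsedMs.toFixed(2)} ms`);
```

Swapping the stand-in payload for a real `arrayBuffer` received from a server gives a like-for-like number to compare against `ion.makeReader` timings.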
Thanks for the example data! I took some time to dig into this this morning and wanted to share some of my initial findings.
| file | seconds |
|---|---|
| sample-data.json | 4.426 |
| sample-data.floats.ion | 2.649 |
| sample-data.10n | 2.579 |
| sample-data.floats.10n | 1.868 |
While I'd like these results to be faster across the board, their relative performance aligns with my expectations; reading binary is faster than reading text, and reading floats is faster than reading decimals. If you don't require the precision offered by `decimal` values, encoding them as `float`s instead will save you some processing time (but not necessarily space!).
### `ion.makeReader()`

We expect calls to `makeReader()` to be essentially free, since the code just constructs a new `Reader` object and doesn't do any parsing. However, as you've seen, there are some cases where it's strangely slow.
I've modified my demo code from above to this:
```javascript
let ion = require('ion-js/dist/es6/es6/Ion.js');
let fs = require('fs');
let path = require('path');

let fileName = process.argv[2];
let filePath = path.join(__dirname, fileName);
let buffer = fs.readFileSync(filePath);
let reader = ion.makeReader(buffer); // <--- Replaced ion.load()
```
Example timings:
| file | seconds |
|---|---|
| sample-data.json | 0.741 |
| sample-data.10n | 0.110 |
The result for `sample-data.10n` is about what I'd expect for both of them -- a tenth of a second is about as long as it takes to start a `node` process and import `ion`. As you'd noticed, the path for `sample-data.json` is weirdly slow.
This is caused by the argument handling in `makeReader`. If the provided data source is a `string`, it's used to create a `TextReader` directly. If it's a buffer, `makeReader` checks whether it starts with the binary Ion version marker. If so, it's used to create a `BinaryReader` directly. If not, the buffer is first decoded into a `string` using a pure-JS custom implementation of `decodeUtf8`.
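As a hypothetical sketch (not ion-js's actual source), the dispatch described above looks roughly like this, with `sketchMakeReader` standing in for `makeReader`:

```javascript
// The binary Ion version marker (IVM) is the byte sequence 0xE0 0x01 0x00 0xEA.
const IVM = [0xe0, 0x01, 0x00, 0xea];

// Hypothetical stand-in for makeReader's argument handling.
function sketchMakeReader(source) {
  if (typeof source === 'string') {
    return { kind: 'text' }; // string input: TextReader path
  }
  if (IVM.every((byte, i) => source[i] === byte)) {
    return { kind: 'binary' }; // buffer starting with the IVM: BinaryReader path
  }
  // Text bytes: decode them to a string first (the slow decodeUtf8 step).
  return { kind: 'text', text: new TextDecoder().decode(source) };
}

console.log(sketchMakeReader('{a: 1}').kind);                              // text
console.log(sketchMakeReader(Uint8Array.of(0xe0, 0x01, 0x00, 0xea)).kind); // binary
console.log(sketchMakeReader(new TextEncoder().encode('{a: 1}')).kind);    // text
```

The third branch is the one that made `makeReader` slow for `sample-data.json`: the whole buffer is decoded up front, before any parsing happens.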
If I bypass `decodeUtf8` by either asking NodeJS to decode the file's bytes for me:

```javascript
let text = fs.readFileSync(filePath, 'utf8');
let reader = ion.makeReader(text);
```
| file | seconds |
|---|---|
| sample-data.json | 0.121 |
...or by decoding the buffer with `TextDecoder` myself:

```javascript
let buffer = fs.readFileSync(filePath);
let text = new TextDecoder().decode(buffer);
let reader = ion.makeReader(text);
```
| file | seconds |
|---|---|
| sample-data.json | 0.133 |
This was originally done because some (pretty old) runtimes didn't universally support TextDecoder
, but I suspect that's a restriction we can lift now. I've opened this issue to track that.
2+ seconds to load a 4.5MB json file into memory is still slower than we'd like. Hopefully I can find some time to do some profiling in the near future to find some other low-hanging fruit.
Tests were run on a 2015 MacBook Pro using node `v12.17.0`.
Modified the Unicode decode method used in ion-js as per the experiments mentioned in #618, with the changes in this PR. Here are the findings for the sample data provided in this issue:

From the results above, it's clear that using `TextDecoder.decode` improves performance significantly.
A follow up to my question regarding ion's laziness. I tried using an ion reader on a large object but I'm seeing a substantial performance difference between `makeReader` and `TextDecoder.decode`/`JSON.parse`.

The object structure below has very large numeric arrays in `positions` (178443), `lineWidths` (178443), and `lineColors` (237924). With `TextDecoder.decode`/`JSON.parse` it takes ~65-150ms to convert an `arrayBuffer` to a JSON object. With the ion `makeReader` I'm seeing it take ~565-765ms to create the ion reader.

Obviously the `TextDecoder`/`JSON` parsers are native. Was curious however if this drastic difference is to be expected with ion from a performance perspective?