Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support sized array and sized map. #23

Merged

Conversation

mbaril
Copy link

@mbaril mbaril commented Jun 27, 2016

*Added support for write field name with long type values.
This is to be able to create map with number as key.

*Added support for size arrays and maps

*Fix CBOR Parser bug when fieldname is not a string.

*Added support for size arrays and maps
*Fix CBOR Parser bug when fieldname is not a string.
@@ -817,7 +817,7 @@ protected String _numberToName(int ch, boolean neg) throws IOException
if (neg) {
i = -i - 1;
}
return String.valueOf(1);
return String.valueOf(i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will go ahead and fix this separately, backporting in 2.7 at least.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix checked in 2.7 (and actually even 2.6 although not sure there will be new releases from that branch)

@cowtowncoder
Copy link
Member

Ok. First things first: thank you for contributing this patch! I know that ability to use length-prefixed objects and especially arrays is something that has been requested.

I went ahead and merged simplest part of the patch (fixing type of 1 that should be i), since that can be merged into earlier versions.

Of remaining changes, I am bit ambivalent regarding addition of ability to write long keys: it may be acceptable to do that, but I'll have to think about it a bit. There's always bit of challenge in dealing with format-specific parts, and I am not sure what the right balance is here.

And then the bulk of changes: supporting length-prefixed writes. At logical level, I think implementation makes sense and is correct. But I have concerns about efficiency of the implementation. Given that checks are called for each and every value write, and that they are using JDK provider List with Integer wrappers, I would expect this to have measurable detrimental effect on performance compared to existing code. I hope to find time to test this locally, using jackson-benchmarks.

But I think implementation could be improved to mostly/completely eliminate the overhead. Two ways to go about it that I can think of are:

  1. Use simple int[] instead of List; code may be bit uglier but access much more efficient. Could alternatively create helper object to contain functionality, essentially similar to primitive-valued stack that a lib like hppc or alternatives would give (but that we can't use due to sheer size of these libs -- I'd love to use them but don't think it's appropriate).
  2. Use custom JsonWriteContext to contain per-level counts; this would just be one new field, no List/Stack needed.

and actually either way it would probably make sense to have one new field in generator itself for "current count", so that regular value writes would only modify that count and not look up state from List.

@mbaril
Copy link
Author

mbaril commented Jun 28, 2016

For the performance related concern:

For #1 I agree that there will be a performance impact but it's difficult to tell without testing if the effect will be negligible or not. I was not aware of the project jackson-benchmarks and thus I haven't tried to measure the performance impacts of the modifications.

For #2, this is something that I look at initially but I had some doubt that is was the right way to go when I looked at how that would be implemented. (Extending JsonWriteContext, overriding some functions, ...)

Concerning your last proposition (use a currentCount) I think that this is something that could be easily implemented and that could also help to reduce the performance impact.

@cowtowncoder
Copy link
Member

I agree that use of JsonWriteContext may get tricky. For what it is worth, custom write context is used by jackson-dataformat-avro and jackson-dataformat-protobuf (both now under jackson-dataformats-binary repo), although I think neither tracks counts; protobuf does keep track of content length indirectly.

@cowtowncoder
Copy link
Member

@mbaril Actually, thinking bit more about possibly exposing non-String keys, I think it might be a good idea to start exposing this via JsonGenerator interface, perhaps first only adding method for long keys (which can generally be used for int, short keys too), and once in place, can be possibly pipe through from databind too. This could be implemented as simple coercion to String in JsonGenerator itself, overridden by backends that do support such key types.

The only question really is timing: I am trying hard to finalize 2.8.0. But. I think addition of maybe writeFieldId(long) (instead of writeFieldName()) should be fine and doable for 2.8.0 -- additional work could then be done for 2.9.0.

I will create an issue at jackson-core for this.

@mbaril
Copy link
Author

mbaril commented Jun 28, 2016

Sound perfect. Thanks!

I also took time too run the jackson-benchmarks to compare commit 2.8.0.rc2-2-g08854c6 with 2.8.0.rc2-4-gb6a63b9.

I haven't look the tests in details but it looks like the modifications have an impact on performance.
(I have run the benchmark with the options com.fasterxml.jackson.perf.cbor.* -wi 7 -i 7 -f 3 -t 2)

benchmarks_results.zip

@cowtowncoder
Copy link
Member

@mbaril Thank you for running the tests and making the change for current count. From simple runs I did locally I could not see much of performance difference, nor was there anything in profiler, so unless I made a mistake it looks like patch as-is wouldn't add measurable overhead for usage where count functionality is not used. I may want to do some minor tweaking (for naming etc), perhaps encapsulate List access in separate class, but all in all I think things look good.

It also looks like we already have CLA for you which is good.

So I think what I'll do is merge the patch, and get this included in 2.8.0. Thanks!

@cowtowncoder cowtowncoder merged commit 5de31af into FasterXML:master Jun 28, 2016
@cowtowncoder
Copy link
Member

Interesting. With some minor playing with attempts to optimize, found out that there is one very performance-sensitive part in decrementElementsRemainingCount().
I was trying to add better checking to ensure that the first "extra" element would trigger exception (and not just when closing the scope which happens well after), but additional check seemed to have disproportionate effect, possibly due to inlining (or lack thereof).

On plus side, patch as-is is remarkably nicely behaving at least wrt jackson-benchmarks, so that's good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants