Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use unions in Java, simplify record batch deserialization, change C++ to use Message type #1

Merged
merged 4 commits into from
Jan 20, 2017

Conversation

wesm
Copy link

@wesm wesm commented Jan 20, 2017

This makes the integration tests pass again. Since individual vectors can contain 2B elements, the size of record batches can technically exceed INT32_MAX

Part of apache#292

Change-Id: I4bae06d5833ffd24f94522ad23ea2dfcc459d86b
@wesm
Copy link
Author

wesm commented Jan 20, 2017

Here's the Travis CI build for this: https://travis-ci.org/wesm/arrow/builds/193751242

@wesm
Copy link
Author

wesm commented Jan 20, 2017

My bad, the integration tests are failing (I ran the Java tests but didn't recreate the JAR). Will fix

@wesm wesm changed the title Restore Block.bodyLength to long WIP: Fix C++ integration tests Jan 20, 2017
Change-Id: Ib8beb014310219a7ab8263802ec94d2ea5af6805
@wesm wesm changed the title WIP: Fix C++ integration tests Use unions in Java, simplify record batch deserialization, change C++ to use Message type Jan 20, 2017
@wesm
Copy link
Author

wesm commented Jan 20, 2017

@nongli I tried to redo MessageSerializer to use unions correctly -- it's hanging in the threaded reader/writer test, so there must an exception raised somewhere. I don't have enough time to figure it out this morning if you want to start from here to get things working again


// Align the output to 8 byte boundary.
out.align();

long metadataSize = out.getCurrentPosition() - metadataStart;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's significantly simpler if you include the padding in the metadata size. The Flatbuffers library shouldn't have a problem with extra bytes at the end of the buffer (it doesn't in C++)

wesm added 2 commits January 20, 2017 11:27
Change-Id: I2571b4ec6b753a4e207c7dbbd2059b7c2bfc0be2
Change-Id: I2ca87b9e944ce9613f63cee7af81f5556a67b5e8
@wesm
Copy link
Author

wesm commented Jan 20, 2017

I got the streaming unit tests passing again. File tests fail though, I won't have time until tonight to look again

@wesm
Copy link
Author

wesm commented Jan 20, 2017

@nongli @julienledem how can I debug or set break points in these unit tests in IntelliJ?

@nongli nongli merged commit f187539 into nongli:file Jan 20, 2017
nongli pushed a commit that referenced this pull request Jan 22, 2017
…on format.

Author: Wes McKinney <[email protected]>
Author: Nong Li <[email protected]>

Closes apache#292 from nongli/file and squashes the following commits:

18890a9 [Wes McKinney] Message fixes. Fix Java test suite. Integration tests pass
f187539 [Nong Li] Merge pull request #1 from wesm/file-change-cpp-impl
e3af434 [Wes McKinney] Remove unused variable
664d5be [Wes McKinney] Fixes, stream tests pass again
ba8db91 [Wes McKinney] Redo MessageSerializer with unions. Still has bugs
21854cc [Wes McKinney] Restore Block.bodyLength to long
7c6f7ef [Nong Li] Update to restore Block behavior
27b3909 [Nong Li] [ARROW-499]: [Java] Update file serialization to use the streaming serialization format.
nongli pushed a commit that referenced this pull request Oct 4, 2017
…XXX in plasma protocol.

Related to apache#878, add DCHECK for ReadXXX.

Author: Yeolar <[email protected]>

Closes apache#887 from Yeolar/fixtypo_plasma_and_add_DCHECK and squashes the following commits:

4df63bc [Yeolar] clang-format for too long lines.
143d254 [Yeolar] Update, compile passed.
09ff103 [Yeolar] Fix conflicts.
b951d8d [Yeolar] Merge pull request #1 from apache/master
ebae611 [Yeolar] Fix typo in plasma protocol; add DCHECK for ReadXXX in plasma protocol.
nongli pushed a commit that referenced this pull request Oct 4, 2017
…ties

As per apache#872 I am upgrading Jackson to the latest version on the current train (2.7.1 --> 2.7.9)

Author: Matt Darwin <(none)>
Author: Matt <[email protected]>

Closes apache#929 from mattdarwin/ARROW-1242-upgrade-jackson and squashes the following commits:

d059517 [Matt Darwin] 1242 upgraing jackson to 2.7.9
bc3b6a0 [Matt] Merge pull request #1 from apache/master
nongli pushed a commit that referenced this pull request Oct 4, 2017
NB this commit excludes Jackson and logback upgrades, since they are dealt with in 871 and 872

Author: Matt Darwin <(none)>
Author: Matt Darwin <[email protected]>
Author: Matt <[email protected]>

Closes apache#873 from mattdarwin/upgrade-libs and squashes the following commits:

9b51f46 [Matt Darwin] Merge branch 'master' into upgrade-libs
284a4ce [Matt Darwin] Merge branch 'master' of https://github.com/apache/arrow
79550b1 [Matt Darwin] rolling back lilith to 0.9.44 since 8 doesn't support java 7
c63eef6 [Matt Darwin] Merge branch 'master' into upgrade-libs
bc3b6a0 [Matt] Merge pull request #1 from apache/master
8599ba0 [Matt Darwin] backing out guava upgrade
80d81e6 [Matt Darwin] downgrading guava to 20 for java 7 compatibility
806f348 [Matt Darwin] Merge branch 'master' into upgrade-libs
8aafb7e [Matt Darwin] correcting indentation in BaseValueVector
94c1469 [Matt Darwin] upgrading netty to 4.0.49
cff5596 [Matt Darwin] reverting to netty 4.0.41.Final
568737d [Matt Darwin] switching to Collections from Guava for empty iterator
c194e48 [Matt Darwin] upgraded hppc to 0.7.2
38be468 [Matt Darwin] upgrading libs except jackson and logback
nongli pushed a commit that referenced this pull request Oct 4, 2017
…(take 2)

sorry, this was still not fixed properly.  logback version is separately specified in 2 places.

Fixed properly this time.

Author: Matt Darwin <(none)>
Author: Matt <[email protected]>

Closes apache#960 from mattdarwin/ARROW-1240-upgrade-logback and squashes the following commits:

3492f66 [Matt Darwin] upgrading logback in tools/pom.xml
206b48d [Matt Darwin] Merge branch 'master' into ARROW-1240-upgrade-logback
284a4ce [Matt Darwin] Merge branch 'master' of https://github.com/apache/arrow
bc3b6a0 [Matt] Merge pull request #1 from apache/master
3e2f676 [Matt Darwin] Merge branch 'master' into ARROW-1240-upgrade-logback
caed163 [Matt Darwin] upgrading slf4j to 1.7.25
nongli pushed a commit that referenced this pull request Oct 4, 2017
…ties (take 2)

sorry, PR apache#929 failed to actually change the Jackson version, since the `jackson.version` variable defined in java/pom.xml is not used in java/vector/pom.xml

That's now fixed in this PR.

Author: Matt Darwin <(none)>
Author: Matt <[email protected]>

Closes apache#957 from mattdarwin/ARROW-1242-upgrade-jackson and squashes the following commits:

ad15e5f [Matt Darwin] Merge branch 'master' into ARROW-1242-upgrade-jackson
ee29d65 [Matt Darwin] Merge branch 'master' of https://github.com/apache/arrow into ARROW-1242-upgrade-jackson
06d7745 [Matt Darwin] upgrading jackson to 2.7.9 PROPERLY this time...
284a4ce [Matt Darwin] Merge branch 'master' of https://github.com/apache/arrow
d059517 [Matt Darwin] 1242 upgraing jackson to 2.7.9
bc3b6a0 [Matt] Merge pull request #1 from apache/master
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants