Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accumulo type mapping cleanup #3559

Merged
merged 4 commits into from
May 3, 2020
Merged

Accumulo type mapping cleanup #3559

merged 4 commits into from
May 3, 2020

Conversation

findepi
Copy link
Member

@findepi findepi commented Apr 27, 2020

Removes some unnecessary flexibility in the code, making it easier to do
things like #3558 in the future.

@cla-bot cla-bot bot added the cla-signed label Apr 27, 2020
@adamjshook
Copy link
Member

Cool! +1

findepi added 3 commits April 27, 2020 23:19
The `Field`'s `toString` is complex, as if it was playing some
conversion role, but it does not. Simplify it and make more robust
(avoid throwing exceptions), at the cost of not producing human-readable
string representation of collections.
@adamjshook
Copy link
Member

For some context here, most of this code was for downstream applications that use presto-accumulo as a dependency in an effort to massage any given Java object into the corresponding SQL type. I'm okay tightening it up here.

@findepi
Copy link
Member Author

findepi commented Apr 27, 2020

@adamjshook thanks for looking into this.
That's what i thought also, that there is, or used to be, some other code that benefited from this flexibility.
Otherwise it wouldn't exist.

OTOH, i tried to take a quick stab at the #3558, and i found this flexibility limiting.
That's why I am proposing this change.

BTW i added some more changes to the PR. Please take another look.

Copy link
Member

@adamjshook adamjshook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

The conversion code was flexible, allowing different representations for
given Presto Type. This is unnecessary, as the value passed is always a
Presto Type's stack representation.
@findepi
Copy link
Member Author

findepi commented Apr 28, 2020

thanks @adamjshook for the review

@electrum
Copy link
Member

@adamjshook The connector documentation mentions using the connector JAR as a server-side iterator: https://prestosql.io/docs/current/connector/accumulo.html#installing-the-iterator-dependency

Is this still used? Are there tests for it? Will this PR break it?

It sounds like this is something we'll need to address when moving Presto to Java 11. Maybe by pulling that out into a separate module that builds with Java 8.

@adamjshook
Copy link
Member

@electrum Hi David -- Yes, it is still required in the connector here. I do not see any tests in this repository for the Accumulo iterators, however I do not believe this PR will cause any issues with the iterator. If how the data was stored in Accumulo was being modified, then it would likely break the iterator code.

I think a separate module for Java 8 would be the way to go for the iterator dependency. Java 11 support was recently added for Accumulo (although the Presto integration hasn't been tested AFAIK), but older Accumulo versions would still require a Java 8 version.

@findepi
Copy link
Member Author

findepi commented Apr 29, 2020

What's the entry point in the module for the accumulo iterators?

@adamjshook
Copy link
Member

The servers need these two classes on their classpath in order for the connector to work properly: https://github.com/prestosql/presto/tree/master/presto-accumulo/src/main/java/io/prestosql/plugin/accumulo/iterators

if (!(value instanceof Block)) {
throw new PrestoException(FUNCTION_IMPLEMENTATION_ERROR, "Object is not a Block, but " + value.getClass());
}
// Block
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we enforce the types here?

return (Block) value;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to do this, but intellij frowns on "unnecessary cast". (which i dont want to ignore in general, although here it's incorrect).
This is, however, enforced by the checkArgument(Primitives.wrap(type.getJavaType()).isInstance(value), check above.

@electrum
Copy link
Member

electrum commented May 3, 2020

@adamjshook Thanks for the explanation about the iterators. I only see two iterators, MaxByteArrayCombiner and MinByteArrayCombiner. However, I see more in https://github.com/bloomberg/presto-accumulo/tree/master/presto-accumulo-iterators/src/main/java/com/facebook/presto/accumulo/iterators

Are the ones in bloomberg/presto-accumulo no longer needed? Also, are the *ByteArrayCombiner going to be available in newer Accumulo versions?

@findepi findepi merged commit ad48ae0 into trinodb:master May 3, 2020
@findepi findepi deleted the acc branch May 3, 2020 20:14
@adamjshook
Copy link
Member

@electrum Yes, those were experimental at the time and are no longer needed. There are no plans to put these into the Accumulo distribution, and I am not entirely sure they would be worth while. They are pretty coupled for Presto and are not very general-purpose.

With that said, I am looking into breaking the iterators out into a separate module. Feel free to open an issue and assign it to me if you'd like to track it better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants