Use direct dependency on Parquet #18445

Merged
merged 5 commits into from
Jul 28, 2023

electrum
Member

@electrum electrum commented Jul 28, 2023

Depends on trinodb/trino-hive-apache#49

Release notes

(x) This is not user-visible or docs only and no release notes are required.

@cla-bot cla-bot bot added the cla-signed label Jul 28, 2023
@electrum electrum requested a review from dain July 28, 2023 00:06
@github-actions github-actions bot added the tests:hive, hudi (Hudi connector), iceberg (Iceberg connector), delta-lake (Delta Lake connector), and hive (Hive connector) labels Jul 28, 2023
@electrum electrum force-pushed the parquet branch 3 times, most recently from 0134da6 to e0b5436 Compare July 28, 2023 00:59
@electrum electrum force-pushed the parquet branch 2 times, most recently from c1d70c2 to febdaec Compare July 28, 2023 05:30
@electrum electrum merged commit 2d669da into trinodb:master Jul 28, 2023
@electrum electrum deleted the parquet branch July 28, 2023 18:20
@github-actions github-actions bot added this to the 423 milestone Jul 28, 2023
hashhar added a commit to starburstdata/hive-json-serde that referenced this pull request Aug 9, 2023
commons-lang3 has been used as a compile-time dependency since
f178130, but no explicit dependency was added.

This used to work because hadoop-common, hive-exec, coral, and many
other libraries pull in commons-lang3 as a transitive dependency, so
it was usually on the classpath.

After a combination of trinodb/trino#18444, trinodb/trino#18445 and
trinodb/trino-hive-apache#49 this no longer works, and the hive-open-x
product test suite in SEP fails with errors like:

    2023-08-09 01:21:23 SEVERE: Failure cause:
    io.trino.tempto.query.QueryExecutionException: java.sql.SQLException: Query failed (#20230808_193622_00006_ubxkq): org/apache/commons/lang3/tuple/Pair
    ...
    Caused by: java.lang.NoClassDefFoundError: org/apache/commons/lang3/tuple/Pair
        at org.openx.data.jsonserde.objectinspector.JsonObjectInspectorFactory.getJsonObjectInspectorFromTypeInfo(JsonObjectInspectorFactory.java:66)
        at org.openx.data.jsonserde.JsonSerDe.initialize(JsonSerDe.java:150)
        at io.trino.plugin.hive.util.HiveReaderUtil.initializeDeserializer(HiveReaderUtil.java:273)
        at io.trino.plugin.hive.util.HiveReaderUtil.getDeserializer(HiveReaderUtil.java:238)
        at io.trino.plugin.hive.GenericHiveRecordCursor.<init>(GenericHiveRecordCursor.java:141)
        at io.trino.plugin.hive.GenericHiveRecordCursorProvider.lambda$createRecordCursor$1(GenericHiveRecordCursorProvider.java:109)
        at io.trino.hdfs.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:25)
        at io.trino.hdfs.HdfsEnvironment.doAs(HdfsEnvironment.java:125)
        at io.trino.plugin.hive.GenericHiveRecordCursorProvider.createRecordCursor(GenericHiveRecordCursorProvider.java:96)
        at io.trino.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:256)
        at io.trino.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:154)
        at com.starburstdata.trino.plugins.dynamicfiltering.DynamicRowFilteringPageSourceProvider.createPageSource(DynamicRowFilteringPageSourceProvider.java:54)
        at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:48)
        at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:61)
        at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:296)
        at io.trino.operator.Driver.processInternal(Driver.java:395)
        at io.trino.operator.Driver.lambda$process$8(Driver.java:298)
        at io.trino.operator.Driver.tryWithLock(Driver.java:694)
        at io.trino.operator.Driver.process(Driver.java:290)
        at io.trino.operator.Driver.processForDuration(Driver.java:261)
        at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:887)
        at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
        at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:555)
        at io.trino.$gen.Trino_422_e_49_gb1b5240____20230808_193431_2.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
    Caused by: java.lang.ClassNotFoundException: org.apache.commons.lang3.tuple.Pair
        at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:587)
        at io.trino.server.PluginClassLoader.loadClass(PluginClassLoader.java:128)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
        ... 27 more
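
The failure above comes down to a class that is referenced at compile time but cannot be resolved by the plugin class loader at run time. As a minimal sketch (not part of the referenced commit), a check like the following can confirm whether a class is visible to a given class loader before a query ever hits it. The class name `ClasspathCheck` and the method `isLoadable` are hypothetical names chosen for illustration; only `Class.forName` is a real JDK API.

```java
import java.util.List;

// Hypothetical helper, for illustration only: reports whether classes that the
// code needs at run time can actually be resolved by a class loader.
final class ClasspathCheck {
    // Returns true if the named class can be located by the given class loader.
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving the class
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        // The class from the stack trace above; add any other classes of interest
        for (String name : List.of("org.apache.commons.lang3.tuple.Pair")) {
            String status = isLoadable(name, loader) ? "present" : "MISSING";
            System.out.println(name + " -> " + status);
        }
    }
}
```

If the class is reported as missing, it was most likely only ever on the classpath through a transitive dependency, and declaring it explicitly in the build is the durable fix.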
hashhar added a commit to starburstdata/hive-json-serde that referenced this pull request Aug 9, 2023
commons-lang3 has been used as a compile-time dependency since
f178130, but no explicit dependency was added.

This used to work because hadoop-common, hive-exec, coral, and many
other libraries pull in commons-lang3 as a transitive dependency, so
it was usually on the classpath.

After a combination of trinodb/trino#18444, trinodb/trino#18445 and
trinodb/trino-hive-apache#49 this no longer works reliably, leading
to failures like:

    2023-08-09 01:21:23 SEVERE: Failure cause:
    io.trino.tempto.query.QueryExecutionException: java.sql.SQLException: Query failed (#20230808_193622_00006_ubxkq): org/apache/commons/lang3/tuple/Pair
    ...
    Caused by: java.lang.NoClassDefFoundError: org/apache/commons/lang3/tuple/Pair
        at org.openx.data.jsonserde.objectinspector.JsonObjectInspectorFactory.getJsonObjectInspectorFromTypeInfo(JsonObjectInspectorFactory.java:66)
        at org.openx.data.jsonserde.JsonSerDe.initialize(JsonSerDe.java:150)
        at io.trino.plugin.hive.util.HiveReaderUtil.initializeDeserializer(HiveReaderUtil.java:273)
        at io.trino.plugin.hive.util.HiveReaderUtil.getDeserializer(HiveReaderUtil.java:238)
        at io.trino.plugin.hive.GenericHiveRecordCursor.<init>(GenericHiveRecordCursor.java:141)
        at io.trino.plugin.hive.GenericHiveRecordCursorProvider.lambda$createRecordCursor$1(GenericHiveRecordCursorProvider.java:109)
        at io.trino.hdfs.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:25)
        at io.trino.hdfs.HdfsEnvironment.doAs(HdfsEnvironment.java:125)
        at io.trino.plugin.hive.GenericHiveRecordCursorProvider.createRecordCursor(GenericHiveRecordCursorProvider.java:96)
        at io.trino.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:256)
        at io.trino.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:154)
        ...
        at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:48)
        at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:61)
        at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:296)
        at io.trino.operator.Driver.processInternal(Driver.java:395)
        at io.trino.operator.Driver.lambda$process$8(Driver.java:298)
        at io.trino.operator.Driver.tryWithLock(Driver.java:694)
        at io.trino.operator.Driver.process(Driver.java:290)
        at io.trino.operator.Driver.processForDuration(Driver.java:261)
        at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:887)
        at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
        at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:555)
        at io.trino.$gen.Trino_422_e_49_gb1b5240____20230808_193431_2.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
    Caused by: java.lang.ClassNotFoundException: org.apache.commons.lang3.tuple.Pair
        at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:445)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:587)
        at io.trino.server.PluginClassLoader.loadClass(PluginClassLoader.java:128)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
        ... 27 more
Labels
cla-signed, delta-lake (Delta Lake connector), hive (Hive connector), hudi (Hudi connector), iceberg (Iceberg connector)
2 participants