'cannot use empty rangeList' error after upgrade from 359 to 362 #9424

Closed
woodchuck1206 opened this issue Sep 29, 2021 · 3 comments · Fixed by #10873
Labels
bug Something isn't working

Comments

@woodchuck1206

Hi, I upgraded Trino from 359 to 362 and the error below showed up. (sensitive info masked)

io.trino.spi.TrinoException: Error opening Hive split s3://*************************.parquet.snappy (offset=****, length=*****): cannot use empty rangeList
	at io.trino.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:287)
	at io.trino.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:163)
	at io.trino.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:286)
	at io.trino.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:175)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:49)
	at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:64)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.process(ScanFilterAndProjectOperator.java:267)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.process(ScanFilterAndProjectOperator.java:195)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:319)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:306)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:306)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:221)
	at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$2(WorkProcessorUtils.java:200)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:221)
	at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$3(WorkProcessorUtils.java:215)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:372)
	at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:151)
	at io.trino.operator.Driver.processInternal(Driver.java:387)
	at io.trino.operator.Driver.lambda$processFor$9(Driver.java:291)
	at io.trino.operator.Driver.tryWithLock(Driver.java:683)
	at io.trino.operator.Driver.processFor(Driver.java:284)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1076)
	at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
	at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
	at io.trino.$gen.Trino_362____20210929_020554_2.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.IllegalArgumentException: cannot use empty rangeList
	at io.trino.spi.predicate.SortedRangeSet.of(SortedRangeSet.java:239)
	at io.trino.spi.predicate.ValueSet.ofRanges(ValueSet.java:84)
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.getDomain(TupleDomainParquetPredicate.java:326)
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.getDomain(TupleDomainParquetPredicate.java:433)
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.effectivePredicateMatches(TupleDomainParquetPredicate.java:180)
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.matches(TupleDomainParquetPredicate.java:132)
	at io.trino.parquet.predicate.PredicateUtils.dictionaryPredicatesMatch(PredicateUtils.java:169)
	at io.trino.parquet.predicate.PredicateUtils.predicateMatches(PredicateUtils.java:143)
	at io.trino.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:246)
	... 31 more

Since this error did not occur in 359, I believe it was introduced by a later change, although I could not find the culprit in the release notes.

  • I just had a look at the code changes, and it seems that nothing was added to the 'ranges' variable in TupleDomainParquetPredicate.java for some reason.
    I cannot supply the Parquet file that triggers the error, but if you want me to run some tests on it, I will be happy to help.
    Update: I will dig deeper when I have time, but this Parquet file has multiple columns that contain only None values. Maybe that is the cause?
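
The all-None hypothesis above can be illustrated with a minimal, self-contained sketch. It is not Trino's code: `EmptyRangeSketch`, `PageStats`, `Range`, `strictRangeSet`, `collectRanges`, and `guardedRangeSet` are hypothetical names invented for this example. It assumes that a page holding only nulls reports no min/max, so no range is collected for it, and that a strict factory then rejects the resulting empty list with the same message as in the trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for illustration; these are NOT Trino's classes.
final class EmptyRangeSketch {
    // Min/max statistics of one page; both are null when the page holds only nulls.
    static final class PageStats {
        final Long min;
        final Long max;
        PageStats(Long min, Long max) { this.min = min; this.max = max; }
    }

    static final class Range {
        final long low;
        final long high;
        Range(long low, long high) { this.low = low; this.high = high; }
    }

    // Mimics a strict factory that refuses an empty list of ranges.
    static List<Range> strictRangeSet(List<Range> ranges) {
        if (ranges.isEmpty()) {
            throw new IllegalArgumentException("cannot use empty rangeList");
        }
        return List.copyOf(ranges);
    }

    // Collects one range per page that actually carries min/max values.
    static List<Range> collectRanges(List<PageStats> pages) {
        List<Range> ranges = new ArrayList<>();
        for (PageStats page : pages) {
            if (page.min != null && page.max != null) {
                ranges.add(new Range(page.min, page.max));
            }
        }
        return ranges;
    }

    // Guarded variant: Optional.empty() stands for "no value ranges, only nulls".
    static Optional<List<Range>> guardedRangeSet(List<PageStats> pages) {
        List<Range> ranges = collectRanges(pages);
        return ranges.isEmpty() ? Optional.empty() : Optional.of(strictRangeSet(ranges));
    }

    public static void main(String[] args) {
        List<PageStats> allNull = List.of(new PageStats(null, null));
        // The guarded path reports "no ranges" instead of throwing.
        System.out.println(guardedRangeSet(allNull).isPresent());
        try {
            // The unguarded path fails, as in the stack trace above.
            strictRangeSet(collectRanges(allNull));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether the actual fix takes this shape is not stated in this thread; the sketch only shows why an empty list needs to be checked before it reaches a factory that rejects it.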
@findepi findepi added the bug Something isn't working label Sep 29, 2021
@antonysouthworth-halter

I get the same "cannot use empty rangeList" error in 414 (on Amazon EMR).

stack trace
io.trino.spi.TrinoException: Corrupted statistics for column "filter_farm_id" in Parquet file "s3://XXX/XXX/XXX/partition_metric_name=XXX/partition_utc_year=2023/partition_utc_month=08/partition_utc_day=26/225439bd-8e0c-4a7c-b992-cbdcc4ce779e.parquet.snappy". Corrupted column index: [Boundary order: UNORDERED
                      null count  min                                       max                                     
page-0                         0  <none>                                    <none>                                  
]
	at io.trino.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:309)
	at io.trino.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:183)
	at io.trino.plugin.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:216)
	at io.trino.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:154)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:49)
	at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:62)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.process(ScanFilterAndProjectOperator.java:266)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.process(ScanFilterAndProjectOperator.java:194)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:360)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:347)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:347)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:262)
	at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$2(WorkProcessorUtils.java:241)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:262)
	at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$3(WorkProcessorUtils.java:256)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:413)
	at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:146)
	at io.trino.operator.Driver.processInternal(Driver.java:402)
	at io.trino.operator.Driver.lambda$process$8(Driver.java:305)
	at io.trino.operator.Driver.tryWithLock(Driver.java:701)
	at io.trino.operator.Driver.process(Driver.java:297)
	at io.trino.operator.Driver.processForDuration(Driver.java:268)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:888)
	at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
	at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:561)
	at io.trino.$gen.Trino_414____20230907_005824_2.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.trino.parquet.ParquetCorruptionException: Corrupted statistics for column "filter_farm_id" in Parquet file "s3://XXX/XXX/XXX/partition_metric_name=XXX/partition_utc_year=2023/partition_utc_month=08/partition_utc_day=26/225439bd-8e0c-4a7c-b992-cbdcc4ce779e.parquet.snappy". Corrupted column index: [Boundary order: UNORDERED
                      null count  min                                       max                                     
page-0                         0  <none>                                    <none>                                  
]
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.corruptionException(TupleDomainParquetPredicate.java:601)
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.getDomain(TupleDomainParquetPredicate.java:542)
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.matches(TupleDomainParquetPredicate.java:219)
	at io.trino.parquet.predicate.PredicateUtils.predicateMatches(PredicateUtils.java:157)
	at io.trino.plugin.hive.parquet.ParquetPageSourceFactory.createPageSource(ParquetPageSourceFactory.java:255)
	... 32 more
Caused by: java.lang.IllegalArgumentException: cannot use empty rangeList
	at io.trino.spi.predicate.SortedRangeSet.of(SortedRangeSet.java:247)
	at io.trino.spi.predicate.SortedRangeSet$Builder.build(SortedRangeSet.java:1019)
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.getDomain(TupleDomainParquetPredicate.java:434)
	at io.trino.parquet.predicate.TupleDomainParquetPredicate.getDomain(TupleDomainParquetPredicate.java:539)
	... 35 more

This seems odd to me, because Amazon Athena (V3 engine) is able to query this data just fine 🤷

@raunaqmorarka
Member

@antonysouthworth-halter can you provide a sample file and query to reproduce the problem? It's most likely a different problem from the original issue here.

@hashhar
Member

hashhar commented Sep 8, 2023

@antonysouthworth-halter File a new issue.

Also check whether disabling Parquet column indexes via parquet.use-column-index helps.
