[BUG] test_parquet_read_merge_schema failed w/ TITAN V #5493
Comments
I am skeptical that this has anything to do with the TITAN V. The tests fail in a part of the code that has not even touched the GPU yet. Low memory on the TITAN V would reduce the parallelism of the tests, but the test settings are set up so that it should not matter. I have tried to recreate the failures with Spark 3.1.1, which is the version that precommit was using, and Spark 3.1.2, which is the version in the CI that is failing. Neither reproduced the problem. I was able to verify that the test failure is reproducible in CI, so now I am going to try to slowly work towards reproducing it myself. Perhaps it is Ubuntu 16 instead of 20? Or it could be running all of the tests in the same application? Not sure.
The only other idea that I have right now is that it might be the order in which files and directories are returned. It could be that they are being returned in different orders and that is causing schema discovery to come up with something different? Not really sure because it should be merging the schemas to produce the read schema.
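To make the ordering question concrete, here is a minimal sketch of a schema merge over two files. It is not Spark's implementation: `merge_schemas`, the dict-based schema, and the two example files are all hypothetical simplifications. It only shows that a merge can yield the same set of fields regardless of file order while still producing a different column order.

```python
from typing import Dict, List

# Hypothetical simplification: a schema is an ordered mapping of
# column name -> type name. Real Spark schemas are richer than this.
Schema = Dict[str, str]


def merge_schemas(schemas: List[Schema]) -> Schema:
    """Union the columns of all schemas, keeping first-seen column order."""
    merged: Schema = {}
    for schema in schemas:
        for name, dtype in schema.items():
            if name in merged and merged[name] != dtype:
                raise ValueError(
                    f"conflicting types for {name!r}: {merged[name]} vs {dtype}"
                )
            merged.setdefault(name, dtype)
    return merged


# Two made-up files that each hold only part of the full schema.
file_a = {"id": "int", "name": "string"}
file_b = {"id": "int", "score": "double"}

forward = merge_schemas([file_a, file_b])
reverse = merge_schemas([file_b, file_a])

# Dict equality ignores insertion order, so the field sets match...
same_fields = forward == reverse
# ...but the column order depends on which file was listed first.
same_column_order = list(forward) == list(reverse)
```

If listing order did matter here, it would most likely show up as a difference in column order rather than in which fields exist.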
…5500) The native footer reader for Parquet fetches data fields based entirely on the read schema, which may lead to an overflow if merge schema is enabled. When merge schema is enabled, the file schema of each file partition may not contain the complete (read) schema. In this situation, the native footer reader will come up with incorrect footers. Fall back to the CPU for Parquet reads if merge schema and the native footer reader are both enabled, to avoid buffer overflows like #5493.
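The condition described in that commit message can be sketched roughly as follows. This is an illustration only, not the plugin's code: `missing_fields`, `should_fall_back_to_cpu`, and the example schemas are hypothetical names and data, and the real check lives in the plugin's Scala code.

```python
from typing import List


def missing_fields(read_schema: List[str], file_schema: List[str]) -> List[str]:
    """Return read-schema fields that a given file does not contain."""
    present = set(file_schema)
    return [field for field in read_schema if field not in present]


def should_fall_back_to_cpu(merge_schema: bool, native_footer_reader: bool) -> bool:
    """Hypothetical predicate: only the combination of both settings is unsafe."""
    return merge_schema and native_footer_reader


# With merge schema enabled, the read schema is the union across files,
# so an individual file may lack some of its fields.
read_schema = ["id", "name", "score"]
file_schema = ["id", "name"]
absent = missing_fields(read_schema, file_schema)

fallback = should_fall_back_to_cpu(merge_schema=True, native_footer_reader=True)
no_fallback = should_fall_back_to_cpu(merge_schema=False, native_footer_reader=True)
```

The point of the sketch is that a footer reader driven purely by `read_schema` would ask this file for `score`, which it does not have.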
Describe the bug
blossom rapids_it-ubuntu16-dev-github build ID 14. This pipeline uses a titan_v, which has less GPU memory than a T4/V100.
The failures are mostly:
Caused by: java.lang.AssertionError: End address is too high for setBytes 0x7fcfe5a57628 < 0x7fcfe5a57624
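A quick check of the two addresses in that message, assuming the first is the end of the attempted write and the second is the end of the allocated buffer (that reading is an assumption based on the wording of the assertion):

```python
# Addresses copied from the assertion message above.
write_end = 0x7FCFE5A57628   # assumed: end address of the attempted setBytes
buffer_end = 0x7FCFE5A57624  # assumed: end address of the allocated buffer

# How far past the end of the buffer the write would have gone.
overrun_bytes = write_end - buffer_end
```

Under that assumption the write overruns the buffer by only 4 bytes, which looks more like a size miscalculation than running out of memory.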