Describe the bug
Recently, RAPIDS Accelerator for Apache Spark tests have been failing when the input file on disk contains timestamp columns. Timestamps appear to be corrupted if the caller specifies a timestamp type to use for the timestamp columns being loaded.
Steps/Code to reproduce bug
Apply the following patch, which demonstrates the issue. If you remove the .timestamp_type(cudf::data_type... line from the ORC options builder, the test passes.
Thanks for looking into this, @PointKernel! Note that this has blocked the RAPIDS Accelerator 21.12 CI pipelines. If it will take a while to develop a fix, we may want to consider reverting the change that triggered the regression.
Closes #9365
This PR eliminates the integer overflow issues, along with the clock-rate logic, by operating directly on the timestamp type id. It also fixes a truncation bug in the Parquet reader. Corresponding unit tests are added.
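The class of overflow being fixed can be sketched in a standalone form (this is an illustrative sketch, not the actual cudf code; the enum, function names, and scale table below are hypothetical stand-ins):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration of the bug class, not the real cudf code.
// Rescaling a count of seconds to nanoseconds via a 32-bit "clock rate"
// overflows: int32 * int32 is evaluated in 32 bits, and even small second
// counts times 1e9 exceed INT32_MAX. Dispatching on the timestamp type id
// and using a 64-bit scale factor avoids the narrow intermediate entirely.

enum class type_id {
  TIMESTAMP_SECONDS,
  TIMESTAMP_MILLISECONDS,
  TIMESTAMP_MICROSECONDS,
  TIMESTAMP_NANOSECONDS
};

// 64-bit ticks-per-second chosen directly from the target type id.
int64_t ticks_per_second(type_id id) {
  switch (id) {
    case type_id::TIMESTAMP_SECONDS:      return 1;
    case type_id::TIMESTAMP_MILLISECONDS: return 1'000;
    case type_id::TIMESTAMP_MICROSECONDS: return 1'000'000;
    default:                              return 1'000'000'000;
  }
}

// Overflow-safe rescale: the multiplication happens in 64-bit arithmetic.
int64_t rescale_seconds(int64_t seconds, type_id target) {
  return seconds * ticks_per_second(target);
}
```

The key design point is that the scale factor is derived from the type id in one place and is 64-bit from the start, so no per-column clock-rate computation in narrower arithmetic is needed.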
Authors:
- Yunsong Wang (https://github.com/PointKernel)
Approvers:
- Vukasin Milovanovic (https://github.com/vuule)
URL: #9382
Expected behavior
Requesting TIMESTAMP_NANOSECONDS should return the same data as not requesting a timestamp result type.
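The expected invariant can be stated as a standalone check (a sketch using std::chrono as a stand-in for the reader's timestamp handling; the variable names are hypothetical): requesting a finer resolution should only rescale the stored values, never change the instant they represent.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

using namespace std::chrono;

// Hypothetical check of the invariant above. Suppose the file stores a
// timestamp at microsecond resolution.
int64_t const stored_micros = 1'633'046'400'123'456LL;

// Path 1: read at the file's native resolution, then convert to nanoseconds.
nanoseconds const via_default = duration_cast<nanoseconds>(microseconds{stored_micros});

// Path 2: the reader rescales to nanoseconds while loading (same arithmetic,
// performed in 64 bits). Both paths must yield the identical instant.
nanoseconds const via_request{stored_micros * 1'000};
```

If the two paths disagree, the reader is corrupting values during the requested-type conversion, which is exactly the symptom reported above.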