Describe the bug
For a certain range of values, DateTime/timestamp data is not written correctly to the ORC file. Round-tripping these values exhibits corruption in the sub-second data.
The nanoseconds (encoded separately from the seconds) are off by exactly 0x20000000 (1<<29).
I'd venture to say the length stream needs to be 64-bit to accommodate (nanoseconds << 3); a truncated bit 32 would explain the values being off by 1<<29 after decoding. https://github.com/rapidsai/cudf/blob/branch-0.18/cpp/src/io/orc/stripe_enc.cu#L783
(That would imply a 64-bit length stream and a corresponding reduction from 1024 to 512 in the number of values held in the shared-memory circular buffer.)
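The arithmetic behind this hypothesis can be checked with a small host-side sketch. It is not the cuDF encoder: `encode_nanos`, `decode_nanos`, and `round_trip` are hypothetical helpers, and the trailing-zero count that ORC packs into the low three bits is left as zero for simplicity. Only the shift and the width of the intermediate value are modeled.

```cpp
#include <cstdint>

// Hypothetical helper (not a cuDF function): shift the nanoseconds left by 3,
// then store the result in an intermediate of type T. With T = uint32_t this
// drops bit 32, which models the suspected truncation.
template <typename T>
constexpr T encode_nanos(uint32_t nanos) {
  return static_cast<T>(static_cast<uint64_t>(nanos) << 3);
}

// Hypothetical helper: undo the shift to recover the nanoseconds.
template <typename T>
constexpr uint32_t decode_nanos(T encoded) {
  return static_cast<uint32_t>(encoded >> 3);
}

template <typename T>
constexpr uint32_t round_trip(uint32_t nanos) {
  return decode_nanos<T>(encode_nanos<T>(nanos));
}

// Below 2^29 ns the shifted value still fits in 32 bits.
static_assert(round_trip<uint32_t>(500000001u) == 500000001u, "no overflow");

// At or above 2^29 ns the shifted value needs bit 32; a 32-bit intermediate
// loses it, and the decoded value comes back exactly 1<<29 too small.
static_assert(round_trip<uint32_t>(600000001u) == 600000001u - (1u << 29),
              "off by 0x20000000");

// A 64-bit intermediate preserves the value.
static_assert(round_trip<uint64_t>(600000001u) == 600000001u, "64-bit is safe");
```

Under these assumptions, only sub-second values of at least 536,870,912 ns that are encoded without trailing-zero removal would be affected, which is consistent with the corruption appearing for a limited range of values.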
Closes #7355
Use 64-bit variables/buffers to handle nanosecond values, since the nanosecond encoding can overflow a 32-bit value in some cases.
Removed the overloaded `intrle_minmax` function, using templated `numeric_limits` functions instead (the alternative was to add another overload).
Performance impact evaluation pending, but this fix seems unavoidable regardless of the impact.
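As an illustration of the second change, the sketch below shows how a single template built on `std::numeric_limits` can seed a min/max scan for any integer width, so supporting a new 64-bit type needs no extra overload. The names `MinMax`, `initial_minmax`, and `update` are hypothetical and are not taken from the cuDF sources; the actual replacement for `intrle_minmax` may look different.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical accumulator for a running minimum and maximum.
template <typename T>
struct MinMax {
  T vmin;
  T vmax;
};

// Seed the scan: the minimum starts at the largest representable value and
// the maximum at the smallest, so the first element replaces both.
template <typename T>
constexpr MinMax<T> initial_minmax() {
  return {std::numeric_limits<T>::max(), std::numeric_limits<T>::lowest()};
}

// Fold one value into the accumulator.
template <typename T>
constexpr MinMax<T> update(MinMax<T> mm, T v) {
  return {v < mm.vmin ? v : mm.vmin, v > mm.vmax ? v : mm.vmax};
}

// The same template covers 32-bit and 64-bit, signed and unsigned types.
static_assert(initial_minmax<int32_t>().vmin == INT32_MAX, "int32 seed");
static_assert(initial_minmax<uint64_t>().vmax == 0, "uint64 seed");
static_assert(update(initial_minmax<int64_t>(), int64_t{42}).vmin == 42, "min");
static_assert(update(initial_minmax<int64_t>(), int64_t{42}).vmax == 42, "max");
```

The trade-off against overloads is that every per-type pair of initial values is derived from the type itself rather than written out by hand.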
Authors:
- Vukasin Milovanovic (@vuule)
Approvers:
- GALI PREM SAGAR (@galipremsagar)
- Devavret Makkar (@devavret)
- Kumar Aatish (@kaatish)
URL: #7581
Additional context
Surfaced while running fuzz tests: #6001