You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Not sure whether the existing row index implementation is correct. The documentation is slightly hard to interpret. Particularly these sections from https://orc.apache.org/docs/spec-index.html:
To record positions, each stream needs a sequence of numbers. For uncompressed streams, the position is the byte offset of the RLE run’s start location followed by the number of values that need to be consumed from the run. In compressed streams, the first number is the start of the compression chunk in the stream, followed by the number of decompressed bytes that need to be consumed, and finally the number of values consumed in the RLE.
For columns with multiple streams, the sequences of positions in each stream are concatenated. That was an unfortunate decision on my part that we should fix at some point, because it makes code that uses the indexes error-prone.
The text was updated successfully, but these errors were encountered:
Not sure whether the existing row index implementation is correct. The documentation is slightly hard to interpret. Particularly these sections from https://orc.apache.org/docs/spec-index.html:
The text was updated successfully, but these errors were encountered: