Explain parameters #111
Comments
Ah, I found this comment:
/*
 * Number of bytes of uncompressed data to store
 * for each index point. This must be a minimum
 * of 32768 bytes.
 */
Hi @martindurant, the The The |
Thanks for the explanation!
Thanks for putting this together! kerchunk will make great use of it.
I am still trying to get my head around how it works, given that "gzip/zlib streams are unsplittable" has been a mantra for a long time.
In this issue, however, I'd like to ask for more documentation around the arguments to IndexedGzipFile, and the tradeoffs they entail:
spacing: the more points in the file you index, the better random seeks will tend to be (needing less scrolling), but the bigger the index file will get. I expect this can be any number up to the size of the target file, at which point seeking is equivalent to not using indexed_gzip at all.
window_size: something to do with how much data is stored with each point? Can it be made small to keep the index file small, and what would be the downside of this? I don't seem to be able to pick just any number without ZranError; is 2**15 the minimum, or is this file dependent?
readbuf_size: if I know that I will always be reading an exact byte range every time, or I implement buffering elsewhere, can this be zero?
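To make the spacing and window_size questions concrete, here is a minimal sketch using only Python's standard zlib module. It is not how indexed_gzip is implemented, and the helper names `compress_with_points` and `read_at` are hypothetical. It assumes we control compression and can place byte-aligned seek points with Z_SYNC_FLUSH, whereas indexed_gzip indexes existing files. What it illustrates is the general idea: deflate back-references can reach up to 32768 bytes into previously decompressed data, so resuming at a seek point needs that much preceding uncompressed data as a "window".

```python
import random
import zlib

WINDOW_SIZE = 32768  # maximum distance of a deflate back-reference


def compress_with_points(data, spacing):
    """Raw-deflate `data`, recording a seek point every `spacing` uncompressed bytes.

    Returns (compressed_bytes, points), where each point is a tuple
    (compressed_offset, uncompressed_offset, window).
    """
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # wbits=-15: raw deflate
    out = bytearray()
    points = []
    for start in range(0, len(data), spacing):
        if start > 0:
            # Z_SYNC_FLUSH ends the block on a byte boundary but keeps the
            # compressor's history, so later data may still refer backwards.
            out += comp.flush(zlib.Z_SYNC_FLUSH)
            window = data[max(0, start - WINDOW_SIZE):start]
            points.append((len(out), start, window))
        out += comp.compress(data[start:start + spacing])
    out += comp.flush()
    return bytes(out), points


def read_at(compressed, points, offset, size):
    """Read `size` uncompressed bytes at `offset`, starting from the nearest point."""
    earlier = [p for p in points if p[1] <= offset]
    if earlier:
        c_off, u_off, window = earlier[-1]
        # Prime the decompressor with the stored window as its dictionary.
        decomp = zlib.decompressobj(-15, zdict=window)
    else:
        c_off, u_off = 0, 0
        decomp = zlib.decompressobj(-15)
    skip = offset - u_off  # bytes to decode and discard ("scrolling")
    chunk = decomp.decompress(compressed[c_off:], skip + size)
    return chunk[skip:skip + size]


if __name__ == "__main__":
    rng = random.Random(0)
    data = bytes(rng.choice(b"abcdefgh") for _ in range(200_000))
    compressed, points = compress_with_points(data, spacing=50_000)
    assert read_at(compressed, points, 120_000, 100) == data[120_000:120_100]
```

In this sketch, a smaller `spacing` means less data to decode and discard per seek but more stored windows, which mirrors the tradeoff described for spacing above.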