Add merge_offset_ranges utility #780
Great to see this!
I have one API question that I think needs to be answered.
Note that even without solving that question, the code would already be useful, but it would require the caller to call merge_offset_ranges themselves and then call cat_ranges with the result (i.e., without actually using the max_gap argument of the latter).
Thank you for the changes, @rjzamora! Can I please request that you remove the copyright and licensing blocks, since this whole project is already licensed as BSD-3? That aside, this is good to go. I was about to say you should add this to the API docs, but maybe that can wait until we see any usage aside from our own.
Sounds good. Thanks for the review, @martindurant!
This implements a simple `merge_offset_ranges` utility to support the `max_gap` option for the `cat_ranges` method added in #744. Note that this utility also recognizes a `max_block=` argument to help preserve parallelism for large reads. I intend to use this utility (via `cat_ranges`) to help improve remote-file-system performance in Dask.
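
For readers unfamiliar with the idea, here is a minimal sketch of the kind of range merging described above. It is not the code from this PR: the function name `merge_ranges_sketch`, its signature, and the exact handling of `max_gap` and `max_block` are assumptions made for illustration. It assumes byte ranges are half-open (`end` exclusive), that two ranges in the same file may be coalesced when the gap between them is at most `max_gap` bytes, and that a merged range should never grow beyond `max_block` bytes.

```python
from typing import List, Optional, Tuple


def merge_ranges_sketch(
    paths: List[str],
    starts: List[int],
    ends: List[int],
    max_gap: int = 0,
    max_block: Optional[int] = None,
) -> Tuple[List[str], List[int], List[int]]:
    """Coalesce nearby byte ranges (hypothetical helper, not the PR's code)."""
    if not (len(paths) == len(starts) == len(ends)):
        raise ValueError("paths, starts and ends must have the same length")

    # Sort by (path, start) so that candidates for merging are neighbours.
    ordered = sorted(zip(paths, starts, ends))

    merged: List[list] = []  # each entry is [path, start, end]
    for path, start, end in ordered:
        if merged:
            last = merged[-1]
            new_end = max(last[2], end)
            same_file = last[0] == path
            # Gap between the previous range's end and this range's start.
            close_enough = start - last[2] <= max_gap
            # Assumed rule: never let a merged range exceed max_block bytes.
            fits = max_block is None or new_end - last[1] <= max_block
            if same_file and close_enough and fits:
                last[2] = new_end  # extend the previous range in place
                continue
        merged.append([path, start, end])

    out_paths = [m[0] for m in merged]
    out_starts = [m[1] for m in merged]
    out_ends = [m[2] for m in merged]
    return out_paths, out_starts, out_ends


# Example: three reads from "a" and one from "b".
example = merge_ranges_sketch(
    ["a", "a", "a", "b"], [0, 10, 100, 0], [10, 20, 110, 5], max_gap=0
)
```

With `max_gap=0`, only the two adjacent ranges in `"a"` are merged. Raising `max_gap` to 80 would also absorb the third range, while additionally passing `max_block=50` would keep it separate, so that a large read can still be split across parallel requests.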