Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Document and export alloc_request(), notify_filled(). #4484

Closed
wants to merge 1 commit into from

Conversation

kmsquire
Copy link
Member

These are currently used by libuv to get a pointer to a block of memory to fill, and then update the IOBuffer info.

Zlib.jl currently uses IOBuffers in a similar manner, but copies the data twice. It would be nice if it could use this mechanism, and an exported interface is less likely to break/change than a non-exported one.

Also slightly updated the call signature of notify_filled() to be more friendly to external users, minimally affecting libuv use.

@kmsquire
Copy link
Member Author

cc: @dcjones (I have a pull request for Zlib.jl in the works).

@kmsquire
Copy link
Member Author

cc: @vtjnash, @loladiro

@dcjones
Copy link
Contributor

dcjones commented Oct 12, 2013

I'm in favor. I don't think there is currently a way for a C function to write directly to a IOBuffer, and not doing so is pretty inefficient.

@kmsquire
Copy link
Member Author

This mechanism is already used by libuv for exactly that purpose, so this is really just to expose that interface.

@vtjnash
Copy link
Member

vtjnash commented Oct 12, 2013

This seems fine, the only thing I might suggest is to have alloc_request() set writable to false, and have notify_filled to re-enable writing (and check that an alloc request is active).

alloc_request() needs to check that the buffer is writable

notify_filled() needs to check that the buffer has not been "over filled" by checking nbytes against the size of the active request (only one additional field should be sufficient for both purposes)

@JeffBezanson
Copy link
Member

What do you mean by "over filled"? What additional field is needed exactly?

@vtjnash
Copy link
Member

vtjnash commented Oct 12, 2013

assert that nbytes filled <= size allocated -- additional field is the nbytes allocated.

@kmsquire
Copy link
Member Author

Thanks for the feedback, @vtjnash. I've updated the pull request. Let me know if there are any more suggestions, and if not, I'll squash and merge.

@kmsquire
Copy link
Member Author

(Assuming this is okay before 0.2, although I don't think it should be controversial.)

* add alloc_req field to IOBuffer, indicating the size of a reqest
* disable writing while a request is in progress
@kmsquire
Copy link
Member Author

kmsquire commented Dec 4, 2013

Bump.

I squashed the commits. Is there any objection to merging this?

@kmsquire
Copy link
Member Author

Bump.

@JeffBezanson
Copy link
Member

This is actually part of #3887, which is a good thing, but I'd rather not implement that design haphazardly piece-by-piece.

Unfortunately alloc_request is not safe, since it returns a raw pointer. notify_filled is a bit odd since nobody is actually notified --- Zlib/libuv is issuing "requests" to the buffer, but that buffer is part of its internal state. Meanwhile we have code in multi.jl that does its own buffering, then writes to a socket that also has a buffer (though not currently used for writing). From all this I conclude that we have a mess. And it's probably mostly my fault, which is why I want to clean it up.

Another way to look at this is to observe that Zlib.jl is doing its own buffering. It appears to be reusing IOBuffer, but it might as well be using its own byte vector. This is easy to miss, since it's currently possible to inherit most I/O functionality just by reading a byte at a time, but anything fancier you might want to do that assumes a buffer would have to be re-implemented inside Zlib.jl. This applies equally to the libuv code.

@kmsquire
Copy link
Member Author

First, if there's going to be something better, then I'm perfectly fine with the decision not to merge this. I have generally found the IOBuffer implementation a little awkward, since to use it effectively seems to require deep coupling with other code.

That said, I'm not quite following your response, so I wanted to clarify a couple of things, and then get clarification.

notify_filled is a bit odd since nobody is actually notified --- Zlib/libuv is issuing "requests" to the buffer, but that buffer is part of its internal state.

As I understand it, notify_filled is notifying the IOBuffer that data has been added to its buffer by an external (libuv/zlib) process, and that it should therefore update its understanding of the state of the buffer.

Another way to look at this is to observe that Zlib.jl is doing its own buffering. It appears to be reusing IOBuffer, but it might as well be using its own byte vector. This is easy to miss, since it's currently possible to inherit most I/O functionality just by reading a byte at a time, but anything fancier you might want to do that assumes a buffer would have to be re-implemented inside Zlib.jl. This applies equally to the libuv code.

I'm not following you here. For Zlib.jl, at least, my understanding is that it couples zlib with an IOBuffer, so that the IOBuffer is, essentially, what is presented to the user, and when data is needed, zlib is called to fill that buffer. I'm pretty sure libuv works the same way.

For efficiency, zlib updates the IOBuffer buffer directly (using the unexported versions of the functions in this PR, with the understanding that they aren't exported and subject to change or go away). Reading of bytes or arrays from a zlib Reader makes sure the buffer is full enough and then passes the read request onto the IOBuffer, which returns the data to the user.

Beyond reading arrays, what fancier things do you envision the interface being able to do, or what hasn't been implemented yet to pass on to the IOBuffer that would be easier to simply do directly within Zlib.jl?

@JeffBezanson
Copy link
Member

Many fancier features have been proposed: automatic byte-swapping, mark/unmark for re-reading buffered data, etc.

My proposal is that "user" code (i.e. nearly all code that does I/O) only interact with IOBuffer. Then all the fancy features can be implemented once, in IOBuffer. IOBuffer can have a reference to an outside data source, and it requests more data from that source as needed. Then something like libuv or Zlib only needs to implement functions for reading and writing big chunks, and no longer has to worry about what read or write methods users might want, or might be needed for performance.

@kmsquire
Copy link
Member Author

Okay, that sounds good to me!

@kmsquire kmsquire closed this Dec 18, 2013
@kmsquire kmsquire deleted the kms/alloc_request branch February 3, 2014 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants