Make Content.Chunk a RetainableByteBuffer #11598
Small tweaks to the RBB API to make the concept more uniform throughout the codebase.
@sbordet please don't review saying "this method is not used". Just imagine that it is used and review on the basis of asking whether it is the right API in the right location. In fix/jetty-12/10541/byteBufferAccumulator4 I'm extending this with the goodness of byteBufferAccumulator2, but in a Mutable API.
See #11599 for the Mutable API
I like the cleanups, and I also like where this is leading us.
jetty-core/jetty-io/src/main/java/org/eclipse/jetty/io/ArrayByteBufferPool.java
jetty-core/jetty-io/src/main/java/org/eclipse/jetty/io/ChunkAccumulator.java
@@ -192,7 +194,7 @@ static ByteBuffer asByteBuffer(Source source) throws IOException
      */
     static CompletableFuture<byte[]> asByteArrayAsync(Source source, int maxSize)
     {
-        return new ChunkAccumulator().readAll(source, maxSize);
+        return asRetainableByteBuffer(source, null, false, maxSize).thenApply(rbb -> rbb.getByteBuffer().array());
Mental note: asRetainableByteBuffer() should have been called toRetainableByteBuffer(): the as prefix is for modifying the presentation (wrap/unwrap) and the to prefix is for anything implying heavier work like memory copies.
I've always thought that the as prefix should be used when returning a different view of the same object. The to prefix should be used when making a new object from the current one.
So this could be thought of as a to, as it creates a whole new object.... but that object also mutates the original object (by consuming all its input), so it is kind of a view onto the original object.... if the resulting RBB delays reading the source until it knows how it is going to be used, then it really is a view onto the source. So as works as well.
I'm enough on the fence not to disrupt things by changing the name at this point.
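To make the as/to distinction discussed above concrete, here is a minimal sketch using only JDK classes. The class and method names (AsVersusTo, asView, toCopy) are hypothetical and are not part of the Jetty API or of this PR: asView returns a read-only view sharing the same content, while toCopy does the heavier work of a memory copy.

```java
import java.nio.ByteBuffer;

public class AsVersusTo {
    // "as": a different presentation of the same content, no copy is made.
    public static ByteBuffer asView(ByteBuffer source) {
        return source.asReadOnlyBuffer();
    }

    // "to": a new, independent object built by copying the remaining bytes.
    public static ByteBuffer toCopy(ByteBuffer source) {
        ByteBuffer copy = ByteBuffer.allocate(source.remaining());
        copy.put(source.duplicate()); // duplicate() so the source position is untouched
        copy.flip();
        return copy;
    }

    public static void main(String[] args) {
        ByteBuffer source = ByteBuffer.wrap(new byte[]{1, 2, 3});
        ByteBuffer view = asView(source);
        ByteBuffer copy = toCopy(source);
        source.put(0, (byte) 9);            // mutate the original content
        System.out.println(view.get(0));    // the view sees the change
        System.out.println(copy.get(0));    // the copy does not
    }
}
```

A method that consumes its source, as debated in this thread, sits somewhere between the two, which is why neither prefix is a perfect fit.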
jetty-core/jetty-io/src/main/java/org/eclipse/jetty/io/Retainable.java
 * @param length the maximum number of bytes to skip
 * @return the number of bytes actually skipped
 */
default int skip(int length)
There is one design issue that needs to be addressed: do we stick to int for everything related to the RBB's size, do we move to long, or do we postpone this decision to later?
Very good question.... kind of depends if this is a 12.0.x thing or a 12.1.x thing. If the latter, we can go long! Let's leave this unresolved for a bit and think about it.
As this is a new method, I have made this one long. But probably should change all to being longs.
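As an illustration of the long variant mentioned above, here is a minimal sketch of a skip default method over a single ByteBuffer. The Buffer interface and the wrap helper are hypothetical stand-ins, not Jetty's RetainableByteBuffer; the sketch only assumes the contract from the Javadoc in the diff (skip at most length bytes, return the number actually skipped).

```java
import java.nio.ByteBuffer;

public class SkipSketch {
    /** Hypothetical stand-in for a buffer abstraction; not the Jetty interface. */
    public interface Buffer {
        ByteBuffer getByteBuffer();

        /**
         * @param length the maximum number of bytes to skip
         * @return the number of bytes actually skipped
         */
        default long skip(long length) {
            if (length <= 0)
                return 0;
            ByteBuffer buffer = getByteBuffer();
            // A single ByteBuffer holds at most Integer.MAX_VALUE bytes, so the cast is safe.
            int skipped = (int) Math.min(length, buffer.remaining());
            buffer.position(buffer.position() + skipped);
            return skipped;
        }
    }

    public static Buffer wrap(ByteBuffer buffer) {
        return () -> buffer;
    }
}
```

With a long signature, a composite of several buffers could report more than 2 GiB skipped without changing the API, which is the main argument for moving away from int.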
    }

    @Override
    public String toString()
I don't think toString() should be delegated.
How about with a formatted wrapper?
.../jetty-io/src/main/java/org/eclipse/jetty/io/internal/ContentSourceRetainableByteBuffer.java
.../jetty-io/src/main/java/org/eclipse/jetty/io/internal/ContentSourceRetainableByteBuffer.java
};
Content.Source.asRetainableByteBuffer(source, null, false, -1, promise);

Retainable.ReferenceCounter counter = new Retainable.ReferenceCounter(3);
Exposing this constructor just for testing looks a bit dangerous IMHO. I'd prefer to use the default ctor and add a couple of retain() calls in tests.
I've removed most... there are still a few I need to review
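The suggestion above (default constructor plus explicit retain() calls instead of a constructor taking an initial count) can be sketched as follows. The Counter class here is a simplified, self-contained stand-in written for illustration; it is not Jetty's Retainable.ReferenceCounter and its exact behaviour is an assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountSketch {
    /** Simplified, hypothetical reference counter that always starts at 1. */
    public static class Counter {
        private final AtomicInteger references = new AtomicInteger(1);

        public void retain() {
            references.getAndUpdate(c -> {
                if (c == 0)
                    throw new IllegalStateException("already released");
                return c + 1;
            });
        }

        /** @return true only when the last reference has been released */
        public boolean release() {
            int remaining = references.updateAndGet(c -> {
                if (c == 0)
                    throw new IllegalStateException("already released");
                return c - 1;
            });
            return remaining == 0;
        }

        public int get() {
            return references.get();
        }
    }

    public static void main(String[] args) {
        Counter counter = new Counter(); // count is 1
        counter.retain();                // count is 2
        counter.retain();                // count is 3, same state as a "new Counter(3)" would give
        System.out.println(counter.get());
    }
}
```

Reaching a count of 3 through two retain() calls keeps the test on the same code path that production callers use, so no extra constructor needs to be exposed.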
@lorban thanks for the review. I've fixed most, but will ponder some for a while....
@gregw I had a look with @lorban and below are my thoughts. I can see the opportunity to have a common super-interface for RBB and Chunk, so that we can have accumulators/aggregators (AAs) that can deal with both. Speaking with @lorban I could see a ray of light when analyzing The use cases around are:
I could not find any use case where a "composite" RBB is ever used as a composite. Converting a To me, I see However, I can see that having a common super-interface for RBB and
interface Retainable {
  interface WithByteBuffer extends Retainable {
    BB getByteBuffer();
    void writeToSink(Sink, Callback);
  }
Note that RWBB would allow implementations of AAs that work with RBB and Chunk. How would that work? WebSocket receives But we cannot, because the caller controls the "lastness", not each individual composite RWBB.
// Write to file a list of AAs seen as RWBBs.
void writeToPath(Path file, List<RWBB> rwbbs, Callback finished) {
  Sink sink = Sink.from(file);
  sink.write(false, "[", NOOP);
  // Must use IteratingCallback, below pseudo-code.
  for (RWBB rwbb : rwbbs) {
    // Controls "lastness", but copies data in the getBB() call.
    sink.write(false, rwbb.getBB(), Callback.from(rwbb::release)).block();
    // This does not control "lastness", but avoids data copy.
    // rwbb.writeTo(sink, Callback.from(rwbb::release));
    // This overrides "lastness", and avoids data copy, but may be confusing.
    // rwbb.writeTo(sink, false, Callback.from(rwbb::release));
    sink.write(false, ",", NOOP);
  }
  // Finish and close the file.
  sink.write(true, "]", finished);
}
So, in summary, I am not convinced by the extent of this PR, as it impacts too much. I can see a driving force for a smaller PR introducing Unless I missed it, and the hypothetical use case is already there 😀 Lastly, I would avoid adding BB wrapper methods that have the
@sbordet by saying this you indicate that you still do not understand the intent of this PR. Yes I know there are few users of the composite RBB in this PR. That is because I've kept it minimal. You've spent a lot of time talking about other ways it could be done and why they wouldn't work, but you have not really discussed this PR itself, which does work better than what we have today. I need some time to digest all that you wrote to try to work out why you are not understanding and see if I can find a better way to explain.
@sbordet Thanks for the long detailed analysis, but I think I see a few things that you've got wrong in your analysis that lead you down some rabbit holes... so you ended up reviewing the wrong rabbit I think.
Note also lots of cleanup in this PR with methods like
They are different lasts, so there is no clash. The last in a chunk is indicating the last buffer in a sequence from a source. The last in the When a chunk is appended to a RBB, it is done so as a RBB and not as a chunk. So the last status of the chunk is irrelevant, even if the chunk is retained. For all we know, a retained chunk is being retained just for a small slice of its data, which might not represent the last byte of that chunk. The lasts are independent and there is no clash.
Don't get obsessed by the existing use-cases. They have all been formed by legacy code, mostly written before we had the option of retaining, and even now with the API impedance we are sometimes accumulating instead of aggregating. Ultimately, if the source data is retainable, we will rarely ever want to aggregate. That should only ever be done if there is no suitable retainable available. This PR makes the retainable for the data more available, but it explicitly has not modified the use-cases. I've done some more use-cases in #11599, but even that is probably legacy stuff.
Fundamentally, I don't think we really should know/care about Aggregators vs Accumulators, as we should accumulate when we can and aggregate if not possible... maybe even a mix of those if we have mixed sources. The only decisions the code should be making are: Do I want to buffer? Do I want my buffer to grow to hold the whole content, or just parts? Do I have a size limit? Ultimately the RBB code can then make an internal decision about aggregation vs accumulation. But we have more work to do on this to identify the use-cases and to perhaps come up with the correct heuristics.
Note that once we start retaining buffers all the way through, we might need to consider the efficiency of small retains. For example, if we read a request into a 32KB buffer, end up reading only 2KB of data and then retain only 10B of it, the protocol layer currently gets a new buffer to continue reading into and we will have a mostly empty 32KB buffer held by the application (this is a risk now, just more so with this PR). The RBB might decide to aggregate if it is only 10B. Or perhaps the protocol layer should continue with the buffer, using only the remaining 30KB? So there is lots more work to do in this area, and heuristics in RBB might not be able to solve all of it. But this PR at least removes the API impedance so we can retain easily if we want to.
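To illustrate the kind of internal decision described above (accumulate when the data is worth retaining, aggregate by copying when it is only a few bytes), here is a minimal self-contained sketch. Everything in it is hypothetical: the class name HybridBufferSketch, the minRetainSize threshold and the maxSize limit are assumptions for illustration, and a real implementation would call retain()/release() on pooled buffers rather than just keeping a slice.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class HybridBufferSketch {
    private final int minRetainSize;   // below this, copying is assumed cheaper than retaining
    private final long maxSize;        // negative means unlimited
    private final List<ByteBuffer> parts = new ArrayList<>(); // all in read mode
    private ByteBuffer aggregate;      // current copy target, in write mode
    private long size;
    private int copies;
    private int retains;

    public HybridBufferSketch(int minRetainSize, long maxSize) {
        this.minRetainSize = minRetainSize;
        this.maxSize = maxSize;
    }

    public void append(ByteBuffer data) {
        int length = data.remaining();
        if (maxSize >= 0 && size + length > maxSize)
            throw new IllegalStateException("max size exceeded");
        if (length < minRetainSize) {
            // Aggregate: copy small data so a large source buffer is not pinned.
            if (aggregate == null || aggregate.remaining() < length) {
                seal();
                aggregate = ByteBuffer.allocate(minRetainSize);
            }
            aggregate.put(data.duplicate());
            copies++;
        } else {
            // Accumulate: keep a view of the data (a real version would retain() it).
            seal();
            parts.add(data.slice());
            retains++;
        }
        size += length;
    }

    // Close the current aggregate so that ordering between parts is preserved.
    private void seal() {
        if (aggregate != null) {
            aggregate.flip();
            parts.add(aggregate);
            aggregate = null;
        }
    }

    public byte[] toByteArray() {
        seal();
        byte[] out = new byte[Math.toIntExact(size)];
        int offset = 0;
        for (ByteBuffer part : parts) {
            ByteBuffer view = part.duplicate();
            int n = view.remaining();
            view.get(out, offset, n);
            offset += n;
        }
        return out;
    }

    public int copies() { return copies; }
    public int retains() { return retains; }
}
```

The caller only states whether it wants a size limit; whether a given append is copied or retained is decided inside the buffer, which is the separation of concerns the comment argues for. The threshold value itself is exactly the kind of heuristic that still needs real use-cases to tune.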
It's no good thinking we are safe from these issues because our API impedance makes it harder to do.
This PR doesn't have the use-cases. I actually wrote a comment to you in this PR saying "Please don't review saying: this is not used" but then deleted it, thinking it was too rude. I've unhidden the comment now. See it at the top of the PR :)
I agree that doing this directly would be strange. I think this would only be useful if there is some software layer in-between that takes only RBBs and you only have a Source, so you convert.... then later on the layer does a
Clear is a read operation! Clear simply moves the position to the limit, and is equivalent to
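A minimal sketch of the "clear as a read operation" semantics described above, using a plain java.nio.ByteBuffer. The helper name consumeAll is hypothetical. Note that this differs from java.nio.ByteBuffer.clear(), which resets the position to zero and the limit to the capacity to prepare for writing.

```java
import java.nio.ByteBuffer;

public class ClearSketch {
    // "Clear" in the sense used in this thread: consume everything that is readable
    // by moving the position to the limit. The bytes themselves are not modified.
    public static void consumeAll(ByteBuffer buffer) {
        buffer.position(buffer.limit());
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5});
        consumeAll(buffer);
        System.out.println(buffer.remaining()); // nothing left to read
        buffer.clear();                          // NIO clear: back to write mode
        System.out.println(buffer.remaining()); // whole capacity available again
    }
}
```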
Well it started out that way, but we are now exposing
There is no point having a common interface unless there is usable API on it. Calling
Different last!
Naked BB come in two forms: ones backed by a BBPool of some type; and locally allocated ones. So wrapping a RBBs makes sense for the former (although better to get to any RBB it came from), and wrapping is not too expensive for the latter, especially as it will provided
Different last!
What are the impacts? What existing APIs are changed by this PR in a dangerous/bad/impactful way? I'm actually thinking this is more of a 12.1.x change, but really there is nothing much in this PR that could not go into 12.0.x. Rather than review your own ideas/variants, can you review this actual PR and say what the impacts actually are?
ARGH! You do this to me all the time! I do a significant API refactor and include all the updated usages. You say: PR is too big and impossible to review. I split the PR into the minimal change and later PRs to use that change. You say: the API change is not motivated/used! Hence my hidden (now unhidden) comment saying please don't do that. Let's step back and not worry so much about the actual use-cases in the code, other than as approximate examples. Then let's design a really good buffer abstraction that:
Once we have that buffer abstraction, we can increase its usage over time.
The method So can you please have another look at this PR for what it is and not what you think it could/should be. I'm not saying it is perfect, but I'd really like to know what specifically in this PR you don't like... not what you don't like about your own ideas/variations on it. This PR/issue is not the highest priority and is almost certainly a 12.1.x thing, but let's not drop it, as at the very least there are some good cleanups in this PR even if we go no further. Let's hang out about it soon.
I'm just going to summarize the objectives of this crusade!
…10541/byteBufferAccumulator0
What I generally like:
What makes me cautious:
If we're considering those changes for 12.1, I'm thinking we should charge ahead as it is the perfect time to make changes to semi-internal APIs.
The boolean return of add was seldom being checked, so throw instead
More tests
Closing in favour of a new PR with less history....
replaced by #11801
A minimal set of changes extracted from #11094 and variants in fix/jetty-12/10541/byteBufferAccumulator3 and fix/jetty-12/10541/byteBufferAccumulator4 in an attempt to find a rough consensus on some core API changes.