Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve S3, parquet, and SeekableChannelsProvider #5137

Merged
merged 14 commits into from
Feb 14, 2024

Conversation

devinrsmith
Copy link
Member

Fixes #5096

chipkent
chipkent previously approved these changes Feb 12, 2024
Copy link
Member

@chipkent chipkent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python LGTM. I have not reviewed the rest.

Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review.

Util/channel/build.gradle Show resolved Hide resolved
* Return the held buffer to its pool, and cause subsequent calls to {@link #get()} to return {@code null}
*/
void close();
public ByteBuffer take(final int size) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still somewhat inclined to push for a reference-counted PooledReference<ByteBuffer> or similar, and removing the give method from this class. This let's us guarantee safety when code uses the proper acquire and release pattern.

Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More partial review

implementation 'software.amazon.awssdk:s3'
implementation 'software.amazon.awssdk:aws-crt-client'
//implementation 'software.amazon.awssdk:netty-nio-client'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should benchmark?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need better way to benchmark outstanding branches...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stan has been working on this, but I don't think he has s3 reading benchmarks yet.

Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do better than relying on a chain of completable futures for cleanup.


@Override
public synchronized void onNext(ByteBuffer byteBuffer) {
buffer.put(byteBuffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could NPE in the "limbo" case. Do we care?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The likely alternative is to "gracefully" handle when buffer is null, and do localProducer.completeExceptionally. Given I don't expect this to happen, I think hard NPE here is more appropriate (in which case, I expect onError to be called regardless, and that will do the localProducer.completeExceptionally).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it depends on whether we're expecting log spam that will scare/annoy users.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be any log spam afaik.

@devinrsmith devinrsmith requested a review from rcaudy February 14, 2024 01:22
Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

@devinrsmith devinrsmith requested a review from rcaudy February 14, 2024 21:13
@devinrsmith devinrsmith requested a review from rcaudy February 14, 2024 21:51
@devinrsmith devinrsmith requested a review from chipkent February 14, 2024 22:06
@devinrsmith devinrsmith enabled auto-merge (squash) February 14, 2024 22:12
@devinrsmith devinrsmith merged commit 8715076 into deephaven:main Feb 14, 2024
18 checks passed
@devinrsmith devinrsmith deleted the s3-fix-and-refactor branch February 14, 2024 22:35
@github-actions github-actions bot locked and limited conversation to collaborators Feb 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

s3 read_ahead_count and reading errors
3 participants