Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cas] Implement batch upload #307

Merged
merged 1 commit into from
Apr 27, 2021
Merged

[cas] Implement batch upload #307

merged 1 commit into from
Apr 27, 2021

Conversation

nodirg
Copy link
Contributor

@nodirg nodirg commented Apr 23, 2021

Partially implement scheduleUpload - for small blobs.
Use bundler.Bundler to bundle blobs together.

@google-cla google-cla bot added the cla: yes The author signed a CLA label Apr 23, 2021
@nodirg nodirg marked this pull request as ready for review April 23, 2021 18:30
@nodirg nodirg changed the base branch from master to retrier April 23, 2021 23:01
go/pkg/cas/client.go Outdated Show resolved Hide resolved
go/pkg/cas/upload.go Outdated Show resolved Hide resolved
go/pkg/cas/upload_test.go Outdated Show resolved Hide resolved
go/pkg/cas/upload_test.go Show resolved Hide resolved
@nodirg nodirg requested a review from atetubou April 24, 2021 02:03
bytesTransferred += r.Digest.SizeBytes
digestsTransferred++
}
atomic.AddInt64(&u.stats.Batched.Bytes, bytesTransferred)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use atomic package to update this value, better to make this private and provide accessor using atomic.Load*?
Or use package like https://pkg.go.dev/go.uber.org/atomic ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field is private until it is done updating. We don't update it after we publish it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that won't be matter in practice, but golang/go#5045 (comment) says

you shouldn't mix atomic and non-atomic accesses for a given memory word.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it says that and we indeed must follow those rules for concurrent reads and writes. But this is not the case here.

To be be clear: uploader is a private struct. Other packages do not have access to it. Once we return the stat, we no longer write.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think copying value by return is not guaranteed to be valid operation too if atomic is used there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. Do you have links to some documentation to backup this up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is the only outstanding comment and I am pretty sure this code is correct. I am happy to fix it in another CL if it is not.

go/pkg/cas/upload.go Show resolved Hide resolved
go/pkg/cas/upload.go Outdated Show resolved Hide resolved
go/pkg/cas/upload.go Show resolved Hide resolved
go/pkg/cas/client.go Show resolved Hide resolved

// BatchUpdateBlobsBatchSize is the maximum number of blobs to upload in a
// single BatchUpdateBlobs RPC.
BatchUpdateBlobsBatchSize int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BatchBlobsMaxBatchSize? I think the RE specifies the same max for both upload and download.

Copy link
Contributor Author

@nodirg nodirg Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but should they have the same defaults though? The default value for this field was designed for writes. Now the same value is used for reads. This seems a bit arbitrary?

go/pkg/cas/upload.go Outdated Show resolved Hide resolved
go/pkg/cas/upload.go Outdated Show resolved Hide resolved
@nodirg nodirg requested review from rubensf and atetubou April 27, 2021 16:18
go/pkg/cas/client.go Show resolved Hide resolved
go/pkg/cas/upload_test.go Show resolved Hide resolved
go/pkg/cas/upload_test.go Show resolved Hide resolved
go/pkg/cas/client.go Show resolved Hide resolved
@nodirg nodirg requested a review from rubensf April 27, 2021 16:28
go/pkg/cas/upload_test.go Show resolved Hide resolved
go/pkg/cas/upload_test.go Show resolved Hide resolved
go/pkg/cas/client.go Show resolved Hide resolved
@nodirg nodirg changed the base branch from retrier to master April 27, 2021 17:08
Partially implement scheduleUpload - for small blobs.
Use bundler.Bundler to bundle blobs together.
@nodirg nodirg merged commit 4cfed65 into master Apr 27, 2021
@nodirg nodirg deleted the batch-upload branch May 18, 2021 04:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes The author signed a CLA
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants