Add support for "requester pays" operations #3474

Closed
tseaver opened this issue Jun 5, 2017 · 30 comments
Labels
api: storage Issues related to the Cloud Storage API. backend status: blocked Resolving the issue is dependent on other work. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

@tseaver
Contributor

tseaver commented Jun 5, 2017

The storage docs don't really explain this, but the JSON API methods have begun to accept a `userProject` query parameter (or is it a header? only its yak-shaver knows for sure). Its purpose appears to be billing any charges associated with the request to a project other than the one that owns the bucket in which it takes place. I can't tell how a bucket owner is supposed to configure things so that foreign projects can access objects only by passing this parameter/header.

See the g-c-dotnet PR implementing preliminary support.

And here is the g-c-php PR.

@tseaver tseaver added the api: storage Issues related to the Cloud Storage API. label Jun 5, 2017
@tseaver tseaver self-assigned this Jun 5, 2017
@jdpedrie

jdpedrie commented Jun 6, 2017

@tseaver, the bucket owner enables requester pays by setting billing.requesterPays to true in the bucket settings. Once this setting is enabled, any request to the bucket or to an object in it that originates from a non-bucket-owner will fail with an error unless userProject is sent.
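
A rough sketch of the two pieces described in this comment, using only the standard library: a metadata body that turns the setting on, and a request URL that carries `userProject`. The endpoint layout follows the public JSON API v1 convention; the helper names, bucket, object, and project values are placeholders invented for illustration, not part of any client library.

```python
import json
from urllib.parse import quote, urlencode

# Assumed base URL for the Cloud Storage JSON API (v1).
API_ROOT = "https://storage.googleapis.com/storage/v1"


def requester_pays_patch_body(enabled=True):
    """Return a JSON body that sets billing.requesterPays on a bucket."""
    return json.dumps({"billing": {"requesterPays": enabled}})


def object_url(bucket, object_name, user_project=None):
    """Build an object metadata URL, optionally naming the project to bill."""
    url = f"{API_ROOT}/b/{quote(bucket, safe='')}/o/{quote(object_name, safe='')}"
    if user_project is not None:
        # userProject travels in the query string, not in a header.
        url += "?" + urlencode({"userProject": user_project})
    return url


if __name__ == "__main__":
    print(requester_pays_patch_body())
    print(object_url("example-bucket", "data/file.txt", "example-billing-project"))
```

A non-owner caller would use the second form; leaving `user_project` unset yields the plain URL that the back-end is said to reject for requester-pays buckets.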

https://github.com/GoogleCloudPlatform/google-cloud-php/pull/527/files#diff-703c9fa16eb822e79062a7936fceffe5R103

@tseaver
Contributor Author

tseaver commented Jun 6, 2017

@jdpedrie Thanks. That seems less elegant than I had imagined (e.g., I expected some ACL / IAM tweak which would assign permissions to a "user-is-paying" role).

The whole feature needs both narrative and API docs.

@tseaver
Contributor Author

tseaver commented Jun 7, 2017

@jdpedrie Is there anywhere in the API docs which defines the billing field for buckets? Or how / when to populate the userProject header / query parameter / body field?

@tseaver tseaver added backend status: blocked Resolving the issue is dependent on other work. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. labels Jun 7, 2017
@jdpedrie

jdpedrie commented Jun 7, 2017

Not that I've seen in the public docs. I shared a google doc with more information. (Check your email).

The REST discovery document shows which resources accept the userProject parameter (it's in the query string). Our implementation simply adds this field to each request if the bucket is constructed with requesterPays toggled on.
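
One way to picture "adds this field to each request" is a bucket object that remembers the billing project and merges it into every set of query parameters. The class below is a hypothetical stand-in written for this thread, not the PHP or Python client's real class; only the `userProject` parameter name comes from the discussion.

```python
class BucketSketch:
    """Hypothetical stand-in for a client-library bucket object."""

    def __init__(self, name, user_project=None):
        self.name = name
        # Project to bill for requests; None means "do not send userProject".
        self.user_project = user_project

    def query_params(self, **extra):
        """Return the query parameters for a request against this bucket."""
        params = dict(extra)
        if self.user_project is not None:
            params["userProject"] = self.user_project
        return params


if __name__ == "__main__":
    bucket = BucketSketch("example-bucket", user_project="example-billing-project")
    print(bucket.query_params(projection="full"))
```

Centralizing the merge in one method means individual API calls do not each need to remember the parameter.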

@tseaver
Contributor Author

tseaver commented Jun 7, 2017

@jdpedrie Thanks muchly!

@tseaver
Contributor Author

tseaver commented Jun 8, 2017

Implementation plan:

@lukesneeringer does that seem correct to you?

@tseaver
Contributor Author

tseaver commented Jun 8, 2017

Note that system testing for this feature requires that a second, valid project be accessible to the account running the tests. @dhermes, @lukesneeringer how do you think we should arrange for that, both for local testing and for CI?

Update: we can test the creation of a bucket w/ requester_pays set without the extra project.

@tseaver
Contributor Author

tseaver commented Jun 8, 2017

Does this feature require whitelisting? I just tried adding a system test which passed the billing/requesterPays property along as true and it fails with an unhelpfully-terse 400.

@jdpedrie

jdpedrie commented Jun 8, 2017

@tseaver yes -- tested to verify. works on my whitelisted project, fails on personal project.

@dhermes
Contributor

dhermes commented Jun 8, 2017

@tseaver We should probably just allow the system test to be skipped locally. It's not too difficult to get a 2nd project for CI.

The plan seems OK 👍

tseaver added a commit that referenced this issue Jun 9, 2017
Also, add 'requester_pays' argument to 'Client.create_bucket'.

Add a system test which exercises the feature.

Note that the new system test is skipped, because 'Buckets.insert' fails
with the 'billing/requesterPays' field set, both in our system tests and
in the 'Try It!' form in the docs.


Toward #3474.
@tseaver
Contributor Author

tseaver commented Jun 28, 2017

Note that development of this feature is moved off onto the storage-requester-pays-feature branch until the feature becomes GA.

@frankyn
Member

frankyn commented Aug 4, 2017

Hi @tseaver, is it possible to merge this branch into the client library for sample support? Other client libraries have already merged this feature into master.

@lukesneeringer
Contributor

lukesneeringer commented Aug 8, 2017

@frankyn Is the feature public beta (or better) yet?

@lukesneeringer lukesneeringer removed the priority: p2 Moderately-important priority. Fix may not be included in next release. label Aug 9, 2017
landrito pushed a commit to landrito/google-cloud-python that referenced this issue Aug 21, 2017
landrito pushed a commit to landrito/google-cloud-python that referenced this issue Aug 22, 2017
landrito pushed a commit to landrito/google-cloud-python that referenced this issue Aug 22, 2017
tseaver added a commit that referenced this issue Sep 22, 2017
@tseaver
Contributor Author

tseaver commented Sep 22, 2017

@lukesneeringer, @dhermes I have just refreshed the storage-requester_pays-feature branch.

I updated the one system test, previously skipped because the back-end needed us to be whitelisted, to run unconditionally. We still need to figure out the CI system test story for testing that the user_project arguments work against a bucket created w/ requester_pays (requests passing user_project should be using a different project than the one owning the bucket).

@dhermes
Contributor

dhermes commented Sep 25, 2017

@tseaver Let's coordinate with google-cloud-node and use their project as the "other project"?

@tseaver
Contributor Author

tseaver commented Sep 25, 2017

We need access to run requests passing their project ID as userProject (and I assume we grant similar access for them).

@frankyn
Member

frankyn commented Sep 25, 2017

Chiming in: @tseaver, which operations does the system test use? The non-owner project doesn't require whitelisting for requester pays, only the bucket owner project.

@tseaver
Contributor Author

tseaver commented Sep 25, 2017

@frankyn The current system tests don't do anything more than create a bucket with requester_pays enabled. Apparently the project we use for CI (precise-truck-742) has been whitelisted, which allows them to pass.

However, to properly test the whole feature, we should add system tests which add / fetch / update object(s) in that bucket, passing user_project=<SOME-OTHER-PROJECT>. We can't do that without having access (both for the developers and for the CI account) to make those requests using the other project.

@frankyn
Member

frankyn commented Sep 26, 2017

Thanks for clarifying the state of tests.

Follow-up question: by "having access", do you mean granting roles/storage.objectAdmin for the various object-related permissions when using requester pays?

@tseaver
Contributor Author

tseaver commented Sep 26, 2017

System tests run under a service account, associated with a given project. In order to exercise the APIs which take user_project, that account needs to be "allowed" to pass in a different project than its "default" one. That doesn't seem like a grant of storage-specific roles/permissions on any given bucket / object: it is more something which would have to be set up at the "resource manager" layer (I think).

@frankyn
Member

frankyn commented Sep 27, 2017

IIUC, you're looking for more information on what is necessary for testing Requester Pays with multiple projects?

The secondary service account should have the Project Owner and Storage.Admin roles on its own Google project, so that requester pays can bill it when, for example, downloading an object from a requester-pays-enabled bucket. (Project Owner role assignment is done at the Resource Manager level or in the Cloud Console.)

A role for reading an object, such as roles/storage.objectViewer, must also be granted to the secondary service account before it can download an object, even when using the requester pays flag.

@tseaver
Contributor Author

tseaver commented Sep 27, 2017

@frankyn I think I have not communicated my issue clearly. IIUC, when making API requests to a bucket w/ requesterPays enabled (or an object in that bucket), the caller passing userProject needs two kinds of authorization:

  • Permissions/roles to make the request to the API endpoint.
  • Authorization to charge that request to a user-supplied project. Given the explicit userProject query parameter being passed, I'm assuming that the project-to-be-charged is not (at least necessarily) the one primarily associated with the caller's credentials: otherwise, the back-end could just charge the "primary" project without needing the userProject at all.

If my understanding is incorrect, please take that as a sign that there needs to be a "concepts" document available which lays the feature out clearly (AFAICT, the feature is still entirely undocumented, so my understanding is based on reverse engineering the concept from the discovery doc, plus what the other languages have done).
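
The two kinds of authorization listed in this comment can be modeled as two independent checks. This is a toy model of the understanding stated above, not of the actual back-end; the `Caller` class, its fields, and `may_read` are all invented for illustration, and it deliberately ignores the case where the caller belongs to the bucket-owning project.

```python
from dataclasses import dataclass, field


@dataclass
class Caller:
    """Hypothetical summary of what one set of credentials may do."""
    readable_buckets: set = field(default_factory=set)   # check 1: API permission
    billable_projects: set = field(default_factory=set)  # check 2: may charge these


def may_read(caller, bucket, requester_pays, user_project=None):
    """Apply both checks to a read request against `bucket`."""
    if bucket not in caller.readable_buckets:
        return False  # no permission to call the endpoint at all
    if not requester_pays:
        return True  # owner is billed, so no userProject is needed
    # Requester pays: a project must be named, and the caller must be
    # allowed to charge requests to it.
    return user_project is not None and user_project in caller.billable_projects


if __name__ == "__main__":
    caller = Caller({"blue-bucket"}, {"lab-a-project"})
    print(may_read(caller, "blue-bucket", True, "lab-a-project"))
```

Under this model, holding a read role on the bucket is necessary but not sufficient once requester pays is on.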

@frankyn
Member

frankyn commented Sep 27, 2017

Following-up through email.

tseaver added a commit that referenced this issue Oct 2, 2017
tseaver added a commit that referenced this issue Oct 5, 2017
tseaver added a commit that referenced this issue Oct 10, 2017
tseaver added a commit that referenced this issue Oct 10, 2017
tseaver added a commit that referenced this issue Oct 12, 2017
* Add '{Bucket,Blob}.user_project' properties, and pass the corresponding
  'userProject' query parameter in API requests.

Closes #3474.
@vsoch

vsoch commented Mar 18, 2018

Will this ever be extended to include storage of the objects themselves? Or (to branch out a little further) to the deployment of an instance on Google Compute Engine? The use case is a project space that offers a general service that brings up instances that then save results to storage. It would be amazing if a user could simply run the tool and provide their credentials without needing to worry about the details of setting up components of their own Google project. For academic groups this would be a game-changer because we could make tools for others to use and we wouldn't run out of money because the cost would be distributed.

@frankyn
Member

frankyn commented Mar 20, 2018

Hi @vsoch, thanks for reaching out! Could you provide more context around "ever be extended to include storage objects themselves"?

I didn't quite understand the use-case. If you provide more information that would be helpful as well.

Thank you!

@vsoch

vsoch commented Mar 20, 2018

Hey @frankyn sure thing! Right now with requester pays, let's say that I have a project space bucket called BlueBucket. BlueBucket is tied to a web service where people can browse and download their favorite color files. I put an awesome file there, RedFile, and then per the documentation if I turn on requester pays, some of the requesters pay for the charges to interact with it. But I'm still paying these things:

image
(and this is my reference to "include storage objects themselves")

Now let's say that many others contribute GreenFile, OrangeFile, and the way this is possible is via my application authenticating and handling the upload. This means that I am logically paying for storage, and the best I can do is have those that want to download / otherwise request GreenFile and OrangeFile pay for that! But this gets problematic (for me) over time because the costs of RedFile, GreenFile, and OrangeFile are mostly on me.

Instead I would want a distributed cost model. Since the user (let's call him OrangeMan) of OrangeFile is likely to also be the primary requester, I would want to provide him the service if he is responsible for payment of requests to use that file and for the storage itself. This way, we have a bunch of files in the same bucket still, available for the service, but the cost is distributed among the users.

The real world scenario is at Stanford (or other academic places) where we are building tools for scientific containers. The containers are huge, and the tools are basic things like a registry. I would want to provide the tool for Stanford, but have labs (each signed up with their own Google project space accounts) to take responsibility for charges for their images.

Does that make sense?

@frankyn
Member

frankyn commented Apr 3, 2018

IIUC, you want to distribute the costs of storing objects in a bucket, instead of having the project that owns the bucket incur all the storage costs.

Real use case: a lab will commonly use its own data, and it would also like to share that data with other labs in a "registry" where other labs can use it.

Follow-up:
There's no way to do this right now using only the GCS API, and I'll forward this feedback to the GCS team.

However, you could do this using GCP + GCS by writing registry software on top. The registry would maintain a list of available requester pays buckets shared by different labs. Effectively, each lab pays storage costs for its own data while other labs pay for request operations.
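
A minimal in-memory sketch of such a registry: it only records which requester-pays bucket each lab shares and resolves object locations from that. Every name here (`SharedBucket`, `BucketRegistry`, the lab and bucket values) is hypothetical; a real service would also need persistence and access control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedBucket:
    """One lab's requester-pays bucket, as listed in the registry."""
    lab: str
    bucket: str
    owner_project: str  # project that pays the storage costs


class BucketRegistry:
    """Maps each lab to the requester-pays bucket it shares."""

    def __init__(self):
        self._by_lab = {}

    def register(self, entry):
        self._by_lab[entry.lab] = entry

    def lookup(self, lab):
        return self._by_lab[lab]

    def labs(self):
        return sorted(self._by_lab)

    def object_uri(self, lab, object_name):
        """Return the gs:// location of an object shared by `lab`."""
        return f"gs://{self.lookup(lab).bucket}/{object_name}"


if __name__ == "__main__":
    registry = BucketRegistry()
    registry.register(SharedBucket("lab-a", "lab-a-containers", "lab-a-project"))
    print(registry.object_uri("lab-a", "image.sif"))
```

Because each bucket lives in its lab's own project, storage is billed to that lab, while readers from other labs pass their own project as `userProject` when fetching.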

@vsoch

vsoch commented Apr 14, 2018

Thanks, I'll come back here and check every so often for updates. I would want it scoped to a single bucket (for a consistently predictable namespace) so the second ideal would be a good workaround, but I'd much prefer the first (and will happily wait!)

@frankyn
Member

frankyn commented Apr 26, 2018

@vsoch, I spoke with the Product Manager, and there was mention of a project focused on this, but there's no set timeline for the foreseeable future. I'd suggest trying the workaround for now.

I may lose track of this thread in the future, so I'd suggest keeping an eye on the GCS Release Notes. Thanks for raising the question, cheers!

@vsoch

vsoch commented Apr 26, 2018

Will do! Thanks muchly @frankyn.

6 participants