-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add virtual hosted-style support for signurl gen. #6183
Add virtual hosted-style support for signurl gen. #6183
Conversation
@frankyn Here is the implementation for generating signed URLs using virtual-hosted style hostnames. I haven't added any tests to the conformance tests library yet; I wasn't sure what order I should add things in, given that the conformance test data is shared across client libraries, but none of the other client libraries have this functionality yet. Please advise :) I manually tested this using a separate project: ...specifically, this module shows the new methods being used: |
Adds new methods to produce a SignUrlOption that results in generated signed URLs using virtual-hosted-style URLs (i.e. mybucket.storage.googleapis.com instead of storage.googleapis.com/mybucket). One option allows specifying the virtual hostname explicitly (for the case where someone might have a custom subdomain, with a bucket of the same name, CNAME'd to c.storage.googleapis.com), while the other will implicitly construct the hostname using the bucket from the passed-in BlobInfo. Addresses part of https://issuetracker.google.com/issues/130190655.
8251362
to
e112b08
Compare
Codecov Report
@@ Coverage Diff @@
## master #6183 +/- ##
============================================
- Coverage 47.51% 46.85% -0.67%
- Complexity 27460 28130 +670
============================================
Files 2526 2601 +75
Lines 274681 287320 +12639
Branches 31402 33375 +1973
============================================
+ Hits 130522 134629 +4107
- Misses 134527 142447 +7920
- Partials 9632 10244 +612
Continue to review full report at Codecov.
|
Friendly ping! Do either of you have any objections (usability, style, etc.) to the new methods? I've already gotten some user feedback after a previous draft, and as a result, I added in the method that allows creating the URL implicitly (not having to specify the full base URL as an argument). The only thing I thought about changing here was removing support for this for use with V2 signing, since it's not actually enforced for the signature (you always have to sign using the resource string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting discussion on design choice.
@@ -1035,7 +1035,8 @@ public static BlobListOption fields(BlobField... fields) { | |||
EXT_HEADERS, | |||
SERVICE_ACCOUNT_CRED, | |||
SIGNATURE_VERSION, | |||
HOST_NAME | |||
HOST_NAME, | |||
VIRTUAL_HOST_NAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@houglum, instead of creating a new option. Could you update the generated URL instead to use the virtual host name?
If it's considered better practice to use bucket.storage.googleapis.com/
, I'd push for implementations to use this format instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think changing the resulting URLs might break some users, e.g. those who have special networking rules set up? What if a user is utilizing VPC and has a rule to allow requests with the host set to storage.googleapis.com
? Suddenly, their requests to signed URLs begin to fail because they're using the host somebucket.storage.googleapis.com
.
I do agree that it's better practice to use the virtual-hosted style of URL though. If you think the potential to break existing users is minimal, I'd be fine with changing the default URL format to virtual-hosted style. If we do that, do we want to add an option to allow people to use the old path-style format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI for those following along: We've written up a doc to summarize our options and get input on what's the most desirable approach here.
This allows creating a signed URL in path-style (bucket in the path component of the URI instead of in the hostname, i.e. virtual hosted-style). Note that this was already the default behavior, but adding this method allows us to more easily switch the default behavior in the future; virtual hosted-style is considered best practice.
@frankyn - I added the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mainly have concerns with the naming of virtual host and path style. Otherwise it's looking solid to me.
PTAL and let me know what you think.
HOST_NAME | ||
HOST_NAME, | ||
PATH_STYLE, | ||
VIRTUAL_HOST_NAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the names, are these expected from product?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Followed up on Hangouts; pending feedback in the design doc. However, I've gone ahead and renamed these methods/options to be VIRTUAL_HOSTED_STYLE (and withVirtualHostedStyle) and PATH_STYLE (and withPathStyle).
* | ||
* @see <a href="https://cloud.google.com/storage/docs/request-endpoints">Request Endpoints</a> | ||
*/ | ||
public static SignUrlOption withVirtualHostName(String virtualHostName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced that this overload is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't need it unless someone asks; I was mostly trying to plan ahead in case anyone needed support for CNAME-based signed URLs. I went ahead and removed it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah gotcha, what about withHostName()
? Doesn't this cover this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I thought withHostName()
only was useful for devs, e.g. sending requests to staging/test endpoints... meaning this would be interchangeable with anywhere you'd use storage.googleapis.com
in a URL. But looking at #3140, it seems this was written to be used with CDNs.
But, what's confusing is... from all the info I can find (our docs, along with external CDN docs) on CNAME'ing and using CDNs with GCS, you're supposed to make the CDN or CNAME record to point at <bucket_name>.storage.googleapis.com
(virtual hosted-style URL)... but from reading the code, it's producing path-style URLs (https://myurl.somealias.com/this-bucket-name-is-in-the-path/object-name
).
From a semantics standpoint, this should be used in the manner I mentioned at the top. However, I'm not actually sure how it's being used by the user(s) that requested this functionality. I'll comment and ask on the issue they filed, but I'm not sure how likely I am to get a reply 🙃
Note that if they are using it in the manner that's being tested (bucket-in-path), then whenever we switch the default URL style to virtual hosted-style (in the future), it will break them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From looking at the tests again, when using withHostName(String)
, we're definitely expecting to replace the normal hostname and for the bucket to be in the path portion of the URL; this applies for scenarios where we're using withHostName()
and generating a path-style, V2-signed URL. None of the other scenarios are currently tested. That being said, I've done manual testing with other hostnames (i.e. against our test environment) and am confident that what I've got coded up is correct behavior. If users need a different host header to be used (but note that from all cases I can find, this would cause the signature to become invalid), they can override it using the withExtHeaders()
method.
- For V2 and V4 path-style URLs, the hostname should change, and the bucket should be in the path.
- It looks like we don't even bother signing the "host" header for V2 URLs. But for V4 URLs, the "host" header should be set to the new hostname. We haven't been doing that, but I just added a fix for that in another commit. This is another option combination that we need a test for.
- For V2 and V4 virtual hosted-style URLs, the hostname should have the bucket name (plus a period), as supplied in the
BlobInfo
, prepended to it, and the bucket should not appear in the path. We currently have no tests for virtual hosted-style URLs, but I've manually tested them with V2 and V4 signing, both with and without an alternative hostname (I used our test environment and its hostname for testing).- Again, V2 URLs don't care about the host since we don't sign it. But V4 URLs should set the host to the virtual hosted-style hostname; I've added a fix to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for finding the bug with cname usage with v4 signature and the host header.
IIUC, with the overloads you suggested the following would be simplified with the example below.
Without overloads
Storage storage = StorageOptions.getDefaultInstance().getService();
String bucketName = "my-bucket";
String objectName = "my-object";
// Define resource
BlobInfo blobinfo = BlobInfo.newBuilder(BlobId.of(bucketName, objectName)).build();
// Generate Signed URL
URL url =
storage.signUrl(blobinfo, 15, TimeUnit.MINUTES,
Storage.SignUrlOption.withV4Signature(),
Storage.SignUrlOption.withHostName("storage.example.com"),
Storage.SignUrlOption.withVirtualHostName());
// Effectively being:
// <bucket_name>.storage.example.com
With Overloads
Storage storage = StorageOptions.getDefaultInstance().getService();
String bucketName = "my-bucket";
String objectName = "my-object";
// Define resource
BlobInfo blobinfo = BlobInfo.newBuilder(BlobId.of(bucketName, objectName)).build();
// Generate Signed URL
URL url =
storage.signUrl(blobinfo, 15, TimeUnit.MINUTES,
Storage.SignUrlOption.withV4Signature(),
Storage.SignUrlOption.withVirtualHostName("storage.example.com"));
// Effectively using:
// <bucket_name>.storage.example.com
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "with overloads" part is currently incorrect; what I had before (supplying a hostname string arg to the withVirtualHostedStyle
method) would be the correct way to do the "with overloads" part, since the bucket name is the hostname, and you wouldn't want to prepend the bucket name again, e.g. ending up with subdomain.domain.tld.subdomain.domain.tld
instead of just subdomain.domain.tld
for your hostname.
I took out the overload in the last couple commits; the virtual hosted style currently assumes you won't be using a CNAME hostname. But, we can add the overload in a subsequent commit. Seems worth breaking that functionality out into its own PR. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acking we discussed this yesterday and splitting off overloads into a separate request is a follow-up action.
Per code review comments.
The Signature V2 doesn't require us to sign the "host" header, but the Signature V4 process requires it. We need to ensure we're populating and signing the same host header value as is used in the request URL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-- Additionally add an integration test.
* Generate a path-style URL, which places the bucket name in the path portion of the URL | ||
* instead of in the hostname. Note that this cannot be used alongside {@code | ||
* withVirtualHostName()}. Virtual hosted-style URLs, which can be used via the {@code | ||
* withVirtualHostName()} method, should generally be preferred instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an example of using withPathStyle()
in these docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added it in the signUrl
method's docs, since there were already examples of other options there.
* obtained from the resource passed in. Note that this cannot be used alongside {@code | ||
* withHostName()}. For V4 signing, this also sets the "host" header in the canonicalized | ||
* extension headers to the virtual hosted-style host, unless that header is supplied via the | ||
* {@code withExtHeaders()} method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an example of using withPathStyle()
in these docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added it in the signUrl
method's docs, since there were already examples of other options there.
I discovered a bug with one of my previous commits, where I was incorrectly determining whether to use Signature Version 2. This commit makes two new (private) methods that determine the correct default option to use, taking default values into account.
I amended the existing integration tests to loop over all the URL styles (path, virtual hosted) and test using each of them. The tests uncovered a small bug, which I added fixes for, encapsulated in a couple new methods with TODOs for issue #6362. |
Thanks @houglum, we'd need PM checkoff when v4 should be default and work backwards with the documented internal migration plan. |
Yep, I was just saying that for whenever we end up doing the switch, I've isolated a lot of the default-based logic into methods, so it should be easier down the line :) I think this is ready to be merged, unless you have any more questions/suggestions. Looks like the only presubmit failure is from the Cloud Spanner tests, which seem unrelated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One requested change, otherwise it lgtm.
* <p>Example of creating a signed URL that is valid for 2 weeks, using the default credentials | ||
* for signing the URL. | ||
* <p>Example of creating a signed URL that is valid for 1 week, using the default credentials for | ||
* signing the URL, the default signing method (V2), and the default URL style (path-style): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
V2 can go longer than 1 week. (also supported no expiration too)
* BlobInfo.newBuilder(bucketName, blobName).build(), | ||
* 1, TimeUnit.DAYS, | ||
* Storage.SignUrlOption.withVirtualHostedStyle()); | ||
* }</pre> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cold you provide an example of a result so people could distinguish easier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added example URLs in the other docstrings (for withPathStyle()
and withVirtualHostedStyle()
). It seems like the specifics belong there, rather than in the signUrl()
docstring. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @houglum for your patience.
googleapis/google-cloud-java#6183 was merged into google-cloud-java, and it changed some of the syntax (e.g. the method name for virtual hosted urls changed, and explicitly specifying the virtual hostname is no longer possible with that method). The syntax updates have been reflected in the main-module examples. This also updates the google-cloud-java submodule to use the merged "master" branch ("master" of my fork is up-to-date with "master" of the googleapis upstream as of today), along with changing main module to depend on the newly-updated version google-cloud-storage (1.93.1-SNAPSHOT).
Adds new methods to produce a
SignUrlOption
that results in generatedsigned URLs using virtual-hosted-style URLs (i.e.
mybucket.storage.googleapis.com/myobject
instead ofstorage.googleapis.com/mybucket/myobject
).One option allows specifying the virtual hostname explicitly (for the
case where someone might have a custom subdomain, with a bucket of the
same name, CNAME'd to c.storage.googleapis.com), while the other will
implicitly construct the hostname using the bucket from the passed-in
BlobInfo
.Addresses part of https://issuetracker.google.com/issues/130190655.