Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create signed URL V4 conformance tests #2250

Merged
merged 25 commits into from
Mar 28, 2019
Merged

Conversation

rsimion
Copy link
Contributor

@rsimion rsimion commented Mar 18, 2019

Fixes #2173 and fixes #2350.


This change is Reviewable

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Mar 18, 2019
Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is headed in the right direction, but I think could be simplified a bit.

Reviewed 6 of 6 files at r1.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @rsimion)


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 37 at r1 (raw file):

/// Store the file names captured from the command-line arguments.
class ObjectTestEnvironment : public ::testing::Environment {

We recently changed the style of passing command-line arguments, it is even less code than before, please look at the other integration tests for examples.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 76 at r1 (raw file):

    return;
  }
  nl::json jsonObj;

The naming convention for local variables is snake_case. Please fix this one and the ones below.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 83 at r1 (raw file):

  std::string kJsonKeyfileContentsForV4 = file_contents;

  auto creds = oauth2::CreateServiceAccountCredentialsFromJsonContents(

I think there is one of these functions that creates directly from a file:

https://github.com/coryan/google-cloud-cpp/blob/1d5d58d223793e460bb0010f4a012ecec7ba0c4e/google/cloud/storage/oauth2/google_credentials.h#L77

You can save yourself some code I believe.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 99 at r1 (raw file):

  nl::json jArray;
  nl::json jObj;

Why declare this variable outside the loop?


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 102 at r1 (raw file):

  jArray = nl::json::parse(ifstr);

  for (nl::json::iterator it = jArray.begin(); it != jArray.end(); ++it) {

I think you can say for (auto const& jObj : jArray) ?


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 104 at r1 (raw file):

  for (nl::json::iterator it = jArray.begin(); it != jArray.end(); ++it) {
    jObj = *it;
    if (jObj["method"] == "GET") {

Is there any reason to only test one verb per-function? Seems like the code below does not care about the verb at all.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 108 at r1 (raw file):

      std::string const bucket_name = jObj["bucket"];
      std::string const object_name = jObj["object"];
      std::string const date = jObj["timestamp"];

You are ignoring the headers field on jObj. Do the tests even pass? In any case, we should use the headers via AddExtensionHeader... which I just realized might be hard.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 123 at r1 (raw file):

      std::string const expected = jObj["expectedUrl"];
      EXPECT_EQ(expected, *actual);

It would be useful to print the number or even the contents of jObj on failure.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 129 at r1 (raw file):

}

TEST_F(ObjectIntegrationTest, V4SignPut) {

Why is this called V4SignPut?


google/cloud/storage/tests/run_integration_tests_testbench.sh, line 104 at r1 (raw file):

echo "Running Storage integration tests against local servers."
./client_sign_url_integration_test "${TEST_ACCOUNT_FILE}" "${TEST_DATA_FILE}"
# The tests were successful, so disable dumping of test bench log during

Leave a blank line here please.

@codecov
Copy link

codecov bot commented Mar 18, 2019

Codecov Report

Merging #2250 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2250   +/-   ##
=======================================
  Coverage   92.57%   92.57%           
=======================================
  Files         307      307           
  Lines       19240    19240           
=======================================
  Hits        17811    17811           
  Misses       1429     1429

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bf6e35f...307a7a2. Read the comment docs.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 18, 2019

@coryan

Is there any reason to only test one verb per-function? Seems like the code below does not care about the verb at all.
If it finds the method, then it breaks out of the for loop via return. We should probably use a different way to differentiate between all those objects

You are ignoring the headers field on jObj. Do the tests even pass? In any case, we should use the headers via AddExtensionHeader... which I just realized might be hard.
V4SignPost uses headers

Why is this called V4SignPut?
I meant V4SignPost. On my local I also added a V4SignPut but only on my local.

It would be useful to print the number or even the contents of jObj on failure.
Something like
if(expected != *actual){
std::cout << j_obj.dump(4) << '\n';
}
?

or is there any example to follow?

@rsimion
Copy link
Contributor Author

rsimion commented Mar 18, 2019

@coryan

Note:
There are 2 issues with this json library

  1. it does not parse comments
    Strip comments / Minify nlohmann/json#294

  2. it does not parse the last comma(if any)
    https://github.com/googleapis/google-cloud-dotnet/blob/master/apis/Google.Cloud.Storage.V1/Google.Cloud.Storage.V1.Tests/UrlSignerV4TestData.json

I removed all the comments and the last comma from that file.
Also for the timestamp I used
"timestamp": "2019-02-01T09:00:00Z"

instead of
"timestamp": "20190201T090000Z"

@coryan
Copy link
Contributor

coryan commented Mar 18, 2019

Can you script the work to remove the comments, and trailing commas? Or you can load the file into a string and do it all in C++, both are fine, but manually changing the file gets boring pretty fast.

Same thing with the date format. Can you either script the change or modify the program to convert the format to something we know how to parse?

@coryan
Copy link
Contributor

coryan commented Mar 18, 2019


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 87 at r2 (raw file):

      std::string const expected = j_obj["expectedUrl"];
      EXPECT_EQ(expected, *actual);
      return;

Oh, I just noticed that. We want to run all the tests in the file, not just the first one. Please remove this line.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 18, 2019

I will try to load the file into a string and do it all in C++.
Same for the date format.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 19, 2019

@coryan
Is that json file reliable?
https://github.com/googleapis/google-cloud-dotnet/blob/master/apis/Google.Cloud.Storage.V1/Google.Cloud.Storage.V1.Tests/UrlSignerV4TestData.json

because for the first two GET objects with only the expiration and description being different, the expectedUrl is different from the actual(the test fails for the second GET), that is why I added the description as a filter for now.
POST is fine.

Also some of the GET objects have headers(the test fails) so we have to account for them. Working on it...

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that json file reliable?

It is what the C# folks are using in their tests.

because for the first two GET objects with only the expiration and description being different, the expectedUrl is different from the actual(the test fails for the second GET), that is why I added the description as a filter for now.

The second GET:

https://github.com/googleapis/google-cloud-dotnet/blob/e918df5a988f53ed71cebf708a0dd06bed8bef43/apis/Google.Cloud.Storage.V1/Google.Cloud.Storage.V1.Tests/UrlSignerV4TestData.json#L42

Uses 20190301T090000Z as the date (note the 03 for the month), but your JSON file has 2019-02. I think you should revert all the manual changes to that file.

Also some of the GET objects have headers(the test fails) so we have to account for them. Working on it...

Yes.

Reviewed 1 of 2 files at r2, 2 of 2 files at r4.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @rsimion)


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 63 at r4 (raw file):

  }

  nl::json json_array;

Why not just:

nl::json json_array = nl::json::parse(ifstr);

google/cloud/storage/tests/client_sign_url_integration_test.cc, line 91 at r4 (raw file):

}

TEST_F(ObjectIntegrationTest, V4SignPost) {

You should be able to merge these two tests into a single one.


google/cloud/storage/tests/UrlSignerV4TestData.json, line 28 at r4 (raw file):

        "method": "POST",
        "expiration": 10,
        "headers": {

Note that some of the tests include "headers". You are not processing the headers in the loops above.


google/cloud/storage/tests/UrlSignerV4TestData.json, line 132 at r4 (raw file):

        "method": "GET",
        "expiration": 10,
        "timestamp": "2019-02-01T09:00:00Z",

I thought you were going to

@rsimion
Copy link
Contributor Author

rsimion commented Mar 19, 2019

@coryan

I rushed. My mistake.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 19, 2019

@coryan

I broke it up into 3 functions . I can combine them later.
Functions with headers
Functions without headers
Special function with headers : "Customer-supplied encryption key"

I put comments all over the places.
There is also a function that does not work and is commented out : "List Objects"
but there is an explanation for that.

There are 2 functions there are not supposed to work, or I do not know how to pass the arguments
"Multi-value headers"
"Trimming of multiple header values"

https://github.com/googleapis/google-cloud-dotnet/blob/e918df5a988f53ed71cebf708a0dd06bed8bef43/apis/Google.Cloud.Storage.V1/Google.Cloud.Storage.V1.Tests/UrlSignerV4TestData.json#L42

I left a std::cout in each function to make it easy to understand when checking the json file.
Now I am working on loading the file into a string then turn it into a json object and change the date format.

@coryan
Copy link
Contributor

coryan commented Mar 19, 2019


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 321 at r5 (raw file):

/*
//This does not work the object is empty

What exactly does not work? Does our code return an error? The generated URL does not match?

@rsimion
Copy link
Contributor Author

rsimion commented Mar 19, 2019

yes, the generated URL does not match the expected one for that object.

@coryan
Copy link
Contributor

coryan commented Mar 19, 2019

Okay, let's consolidate the code to a single loop, and skip the List Objects case for now.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 20, 2019

@coryan
Left std::cout in for now.

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 6 files reviewed, 17 unresolved discussions (waiting on @coryan and @rsimion)


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 41 at r6 (raw file):

    : public google::cloud::storage::testing::StorageIntegrationTest {
 protected:
  std::string trimmedString(std::string data_file) {

I think this is a good effort, but removing comments and trailing commas will quickly become a big effort. Essentially you need a JSON5 parser, which is not something that can be done in a small function. I think we should consider doing the python script approach, download the file and apply a JSON5 to JSON conversion using something like:

https://pypi.org/project/json5/

This one-liner does the trick (it assumes you did pip install json5):

python -c 'import json5; import json; d = open("UrlSignerV4TestData.json"); js5 = json5.load(d); print(json.dumps(js5, indent=4));'

it should be relatively easy to download the file, apply this JSON5 -> JSON conversion before committing it to the repository.

The rest of the review is superficial because I think we should change the approach.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 46 at r6 (raw file):

      // If the file does not exist, or
      // if we were unable to open it for some other reason.
      std::cout << "Cannot open credentials file " + data_file + '\n';

Is this the right error message?


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 60 at r6 (raw file):

    bool check_comment = false;

    for (int i = 0; i < from_file_len; i++) {

Why not use the existing functions in C++:

std::string::size_type start = 0;
auto pos = from_file.find('//', 0);

That would avoid many of the problems below:


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 60 at r6 (raw file):

    bool check_comment = false;

    for (int i = 0; i < from_file_len; i++) {

The type to iterate over a string is std::string::size_type.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 62 at r6 (raw file):

    for (int i = 0; i < from_file_len; i++) {
      // Check for //
      if (i == 0 && from_file[i] == '/' && from_file[i + 1] == '/') {

This is undefined behavior (UB) if i == from_file_len-1


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 76 at r6 (raw file):

      // Do not want to remove https://
      if (from_file[i] == '/' && from_file[i + 1] == '/' &&
          from_file[i - 1] != ':') {

This is UB is i == 0


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 92 at r6 (raw file):

    int final_comma_position = -1;
    char square_bracket = ']';

These should be char const, and they are a good idea, so why not char const slash = '/' earlier?


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 98 at r6 (raw file):

    for (int i = trimmed_len - 1; i > 0; i--) {
      if (isspace(trimmed[i])) {

I think that should be std::isspace(). And you need to #include <cctype>.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 128 at r6 (raw file):

    // int0 "2019-02-01T09:00:00Z"
    std::string timestamp_string;
    int len = original_string.length();

I think you can change this to something like:

std::string TimestampToRfc3339(std::string ts) {
    if (ts.size() != 16) { throw ...; }
    return ts.substr(0, 4) + '-' + ts.substr(5, 2) + '-' + ts.substr(7, 2) + 'T' + ...;
}

google/cloud/storage/tests/client_sign_url_integration_test.cc, line 187 at r6 (raw file):

  for (auto const& j_obj : json_array) {
    std::string const method_name = j_obj["method"];  // GET

Could be more than just GET right?


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 199 at r6 (raw file):

    // Check for the 'headers' field in each j_obj
    for (auto& header : j_obj.items()) {

Consider refactoring all of this to a function that computes all the headers as a vector. Then the code becomes something like:

  std::vector<std::pair<std::string, std::string>> headers = ExtractHeaders(j_obj); 
  StatusOr<std::string> actual;
  if (headers.size() == 0) {
    actual =client.CreateV4SignedUrl(
                method_name, bucket_name, object_name,
                SignedUrlTimestamp(internal::ParseRfc3339(date)),
                SignedUrlDuration(valid_for),
                AddExtensionHeader("host", "storage.googleapis.com"));
  } else if (headers.size() == 1) {
   actual =client.CreateV4SignedUrl(
                method_name, bucket_name, object_name,
                SignedUrlTimestamp(internal::ParseRfc3339(date)),
                SignedUrlDuration(valid_for),
                AddExtensionHeader("host", "storage.googleapis.com"),
                AddExtensionHeader(headers[0].first, headers[0].second));
  }
  ... other cases here... fail the test if the number of headers is too large.

google/cloud/storage/tests/client_sign_url_integration_test.cc, line 204 at r6 (raw file):

      if (header_key == "headers") {
        has_headers = "yes";

If this is a bool why do you need a string for it?


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 257 at r6 (raw file):

          }

          // 2     key : [value] 	pairs

Remote the inline tab, spaces only please.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 25, 2019

@coryan
Maybe on Wednesday we can talk for about 5 minutes after the meeting is over.

I replaced the json file with

https://github.com/googleapis/google-cloud-dotnet/blob/master/apis/Google.Cloud.Storage.V1/Google.Cloud.Storage.V1.Tests/UrlSignerV4TestData.json

Right now everything works(at least on my local) for headers like

"Headers should be trimmed"
        "headers": {
            "leading": [ "    xyz" ],
            "trailing": [ "abc    " ],
            "collapsed": [ "abc    def" ]
        },

with only one value per key. I added another function to check for headers only, using a different approach.

There are several objects for example that have more than one value per key:

"Multi-value headers"

        "headers": {
            "foo": [ "foo-value1", "foo-value2" ],
            "bar": [ "bar-value1", "bar-value2" ]
        },

and

"Trimming of multiple header values"
      "headers": {
            "foo": [ "  abc  ", "  def  ", "  ghi  jkl  " ]
        },

I am not sure how they are supposed to work for us. I tried concatenating their values and it did not work.

That is why I used

std::vector<std::pair<std::string, std::vector<std::string>>>

instead of

std::vector<std::pair<std::string, std::string>>

because
headers.front().second.size() == 1 allows me to filter out the all the headers with more than one value per key(see above)

Maybe on Wednesday we can talk for about 5 minutes after the meeting is over.

@frankyn
Copy link
Member

frankyn commented Mar 25, 2019

@rsimion short update:
We (me and @jskeet) discussed the multi-valued headers and @jskeet will send out a PR tomorrow to update them to simple key/value pairs (String/String).

Thank you for your patience.

@frankyn
Copy link
Member

frankyn commented Mar 26, 2019

@rsimion changes were made to conformance tests and are now merged into master. Please move forward.

Thank you for your patience.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 27, 2019

@frankyn
thanks

@rsimion
Copy link
Contributor Author

rsimion commented Mar 27, 2019

@coryan
you were right the order does not matter, my mistake.

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r8, 2 of 2 files at r10, 1 of 1 files at r11.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @rsimion)


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 68 at r10 (raw file):

          // not in the order they are in the file

          //  Inside the file

We can remove the examples, maybe just say:

// The keys are returned in alphabetical order by nlohmann::json, but the order does not matter when creating signed urls.

google/cloud/storage/tests/client_sign_url_integration_test.cc, line 100 at r10 (raw file):

    // If the file does not exist, or
    // if we were unable to open it for some other reason.
    std::cout << "Cannot open credentials file " + data_file + '\n';

This is not a credentials file, is it? In any case, use something like ASSERT_TRUE(ifstr.is_open()) because we want the test to fail in this case.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 118 at r10 (raw file):

    std::string const date = TimestampToRfc3339(j_obj["timestamp"]);

    int validInt = j_obj["expiration"];

Please remember the naming conventions, that should be valid_int, or better yet: expiration_seconds or something like that. You can also merge this line and the next I think.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 132 at r10 (raw file):

          SignedUrlDuration(valid_for),
          AddExtensionHeader("host", "storage.googleapis.com"));
      if (object_name != "") {

Let's create a separate bug for why the case with object_name == "" does not work and put a TODO(#new-bug-number) here.


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 140 at r10 (raw file):

        EXPECT_EQ(expected, *actual);
      }
    } else if (headers.size() > 0) {

Why not simply:

if (headers.size() == 0) {
// stuff ...
} else if (headers.size() == 1) {
// stuff for 1 headers
} else if (headers.size() == 2) {
// stuff for 2 headers
} else if (headers.size() == 3) {
// stuff for 3 headers
} else {
// error, do not know how to deal with 4 headers.
}

EXPECT_STATUS_OK(actual);
// other tests about actual here.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 27, 2019

@coryan
Created a new issue #2350
to address the bug when the object_name in CreateV4SignedURL() is empty.

@rsimion
Copy link
Contributor Author

rsimion commented Mar 27, 2019

@coryan
After you review my commit, please take a look at what @remyabel found in #2350.

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last nit. I will take a look at 2350, but we can commit this as is.

Reviewed 1 of 1 files at r12.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @rsimion)


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 64 at r12 (raw file):

          // the order does not matter when creating signed urls. in
          // alphabetical order

The "in alphabetical order" is now wrong?

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed a few things in the last review. I checked twice, I think these are it.

Reviewed 1 of 1 files at r13.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @rsimion)


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 54 at r13 (raw file):

    std::vector<std::pair<std::string, std::string>> headers;

    for (auto& header : j_obj.items()) {

Why do we need this for loop? We only use the j_obj["headers"] item, so we can just skip the loop altogether, right?


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 66 at r13 (raw file):

          key_name = x.key();
          value_name = x.value();
          headers.emplace_back(key_name, value_name);

Argh, really sorry I missed this earlier. Can you just write: headers.emplace_back(x.key(), x.value()) and we can remove the key_name and value_name variables?


google/cloud/storage/tests/client_sign_url_integration_test.cc, line 110 at r13 (raw file):

          AddExtensionHeader("host", "storage.googleapis.com"));

      // TODO(#2350)

Say what action to take // TODO(#2350) - when that bug is fixed we can remove this if() block.

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r14.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@ghost
Copy link

ghost commented Mar 28, 2019

@rsimion
Copy link
Contributor Author

rsimion commented Mar 28, 2019

@coryan
#2350 is fixed now thanks to @remyabel
I have updated both files. I am testing now and I will push my changes soon. Everything works on my local.

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r15.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@coryan coryan merged commit 88fefae into googleapis:master Mar 28, 2019
@rsimion rsimion deleted the storage_branch branch March 28, 2019 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement. 🚨 This issue needs some love.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

client.CreateV4SignedUrl() does not work when object_name == "" Create signed URL V4 conformance tests.
5 participants