Ticket 269: SignatureDoesNotMatch when getting logs from ceph bucket. #714

Closed
jhamilton1 opened this issue Jul 30, 2018 · 14 comments
Assignees
Labels
Area/Documentation, Enhancement/User (End-User Enhancement to Velero), Needs Product (Blocked needing input or feedback from Product)

Comments

@jhamilton1
Contributor

What steps did you take and what happened:
I set up Ark to use the Ceph object gateway. I created a "backups" bucket and an access_key_id and secret_access_key, which I plugged into the Ark config map along with the s3Url for the Ceph gateway.

I can create a backup without any problem. I verified that the backup completed with the "ark get backup" command. I also verified the backup by sending a GET to the Ceph bucket. However, when I run "ark backup logs", I get an error stating that the signature did not match.

What did you expect to happen:
When executing "ark backup logs", I expected to see the logs for the backup I created.

The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)

  • kubectl logs deployment/ark -n heptio-ark

level=debug msg="Running processDownloadRequest" key=heptio-ark/backup5-20180730111323 logSource="pkg/controller/download_request_controller.go:190"

  • ark backup describe <backupname> or kubectl get backup/<backupname> -n heptio-ark -o yaml

backup5 Completed 2018-07-28 21:43:22 -0500 CDT 28d

  • ark backup logs <backupname>

An error occurred: request failed: SignatureDoesNotMatch tx00000000000000000017a-005b5f34fb-10eb-default 10eb-default-default

  • ark restore describe <restorename> or kubectl get restore/<restorename> -n heptio-ark -o yaml
  • ark restore logs <restorename>

Anything else you would like to add:
I have a test cluster built with ark already deployed to the customer specs. I can also provide the ceph object gateway endpoint for testing purposes.

I am currently using ceph v10.2.9

It looks like v10.2.11 has some fixes for the signatureDoesNotMatch error. I can upgrade ceph to the latest version if needed.

Environment:

  • Ark version (use ark version): v0.8.1
  • Kubernetes version (use kubectl version): v1.7.5
  • Kubernetes installer & version: Kops v1.9.1
  • Cloud provider or hardware configuration: aws
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 8 (jessie)
@ncdc
Contributor

ncdc commented Jul 30, 2018

We had a report of this in Slack a few months ago. The user reported being able to fix it:

This issue was the v4 signatures that ark is using. While the version of ceph we are using supports v4, the Keystone integration didn’t (https://github.com/ceph/ceph/blob/master/src/rgw/rgw_auth_s3.cc#L1080). We created a local non-keystone S3 account for ceph and ark is purring like a kitten

(https://kubernetes.slack.com/archives/C6VCGP4MT/p1526335304000425)

@jhamilton1 could you please see if this information is helpful?

@jhamilton1
Contributor Author

@ncdc I will take a look and see. Thanks!

@nrb
Contributor

nrb commented Jul 30, 2018

Thanks for this report!

After some digging, the error for logs commands is coming from https://github.com/heptio/ark/blob/master/pkg/cmd/util/downloadrequest/downloadrequest.go#L128. Dropping some debugging code in here and working with an affected cluster, I see we're getting 403 for the resp.StatusCode.

I think the error message could be made more friendly, such that the 403 is recognized and the user is informed that access to the requested file was denied. I don't know if we want to be too detailed on failures, given this codepath is meant to apply to multiple providers, not just S3-compatible APIs.
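As a rough illustration of the friendlier message, here is a minimal Go sketch that maps the response status to a provider-neutral error. The function name `describeDownloadFailure` and the message wording are hypothetical and are not taken from the Ark codebase; it only assumes the caller already has the status code and response body in hand.

```go
package main

import (
	"fmt"
	"net/http"
)

// describeDownloadFailure is a hypothetical helper (not part of Ark).
// It turns the HTTP status of a failed download into a message that
// does not assume any particular object storage provider.
func describeDownloadFailure(statusCode int, body string) error {
	switch statusCode {
	case http.StatusForbidden:
		// 403 covers signature mismatches as well as plain permission errors.
		return fmt.Errorf("access to the requested file was denied (HTTP 403): %s", body)
	case http.StatusNotFound:
		return fmt.Errorf("the requested file was not found (HTTP 404): %s", body)
	default:
		return fmt.Errorf("request failed with HTTP %d: %s", statusCode, body)
	}
}

func main() {
	// Example: the status and body reported in this issue.
	fmt.Println(describeDownloadFailure(http.StatusForbidden, "SignatureDoesNotMatch"))
}
```

Keeping the provider's own response body at the end of the message preserves details such as `SignatureDoesNotMatch` for debugging without tying the wording to S3.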

In terms of making it work, I'm not sure how much we can do on the Ark side. I'll investigate the AWS SDK to see if we can possibly negotiate the signature version, but I'm not very optimistic.

In the meantime, would you mind trying with Ceph v10.2.11, @jhamilton1 ?

@jhamilton1
Contributor Author

Not a problem @nrb, I will try with both upgrading ceph and trying the fix listed by ncdc

@rosskukulinski rosskukulinski added Area/Documentation Enhancement/User End-User Enhancement to Velero labels Jul 31, 2018
@rosskukulinski
Contributor

Related to #549 (review all CLI error handling)

@nrb
Contributor

nrb commented Jul 31, 2018

Looking at some of the AWS docs:

  • Some libraries can specify a signature version to use, largely for supporting older AWS regions.
  • The Go library's AWS client configuration doesn't provide a property to change the signature algorithm version.

So I think the best course of action may be the local, non-Keystone Ceph user and/or upgrading.
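For context on why the signature version matters: with Signature Version 4, the client and the server each derive a signing key from the secret key, the date, the region, and the service name, and the server rejects the request with SignatureDoesNotMatch when the signatures computed on the two sides differ. Below is a minimal Go sketch of that key derivation using only the standard library. The secret, date, region, and service values are placeholders, and this is an illustration of the general scheme, not code from Ark, the AWS SDK, or Ceph.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hmacSHA256 returns HMAC-SHA256(key, data).
func hmacSHA256(key []byte, data string) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(data))
	return mac.Sum(nil)
}

// deriveSigningKey follows the Signature Version 4 derivation chain:
// "AWS4"+secret -> date -> region -> service -> "aws4_request".
func deriveSigningKey(secret, date, region, service string) []byte {
	kDate := hmacSHA256([]byte("AWS4"+secret), date)
	kRegion := hmacSHA256(kDate, region)
	kService := hmacSHA256(kRegion, service)
	return hmacSHA256(kService, "aws4_request")
}

func main() {
	// Placeholder inputs; a real client uses its own secret and request scope.
	key := deriveSigningKey("EXAMPLE-SECRET", "20180730", "us-east-1", "s3")
	fmt.Println(hex.EncodeToString(key))
}
```

Because every input feeds into the final key, any disagreement between client and server about the secret, the scope, or how the request is canonicalized yields a different signature, which the server can only report as a mismatch.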

@jhamilton1 Were you using Keystone in your test cluster?

@jhamilton1
Contributor Author

@nrb I did not implement Keystone in the ceph cluster.

@jhamilton1
Contributor Author

@nrb I went back through the ceph docs. The first attempt at implementing the v4 signature functionality was in the "Jewel" release and still had some bugs. This would explain why I still was getting the "Signature" errors after I upgraded to the latest Jewel stable. I read through the Luminous release notes and decided to upgrade to the latest stable for that release. There were some issues with some of the Luminous minors. The signature error has been resolved after this upgrade and I am able to get the logs as expected.

@nrb
Contributor

nrb commented Aug 1, 2018

Thanks a ton for this investigative work, @jhamilton1! When you have time, could you get the version number you were successful with, so we can document it?

@jhamilton1
Contributor Author

No problem @nrb I upgraded to ceph v12.2.7

@jhamilton1
Contributor Author

jhamilton1 commented Aug 1, 2018

@nrb do we also want to include using a non-Keystone account in the Ceph doc PR?

I also had to add this parameter to the ark config manifest.
s3Url: http://10.0.0.x:7480/backups

@rosskukulinski rosskukulinski added the Needs Product Blocked needing input or feedback from Product label Aug 6, 2018
@nrb nrb self-assigned this Sep 6, 2018
nrb pushed a commit to nrb/velero that referenced this issue Sep 6, 2018
@nrb
Contributor

nrb commented Sep 6, 2018

@jhamilton1 I've created #823 to address the error specifically, and Ceph v12.2.7 was added to our support matrix.

Let me know on the PR if anything should change.

nrb pushed a commit to nrb/velero that referenced this issue Sep 7, 2018
@nrb
Contributor

nrb commented Sep 7, 2018

@jhamilton1 The updated docs are now live at https://heptio.github.io/ark/v0.9.0/troubleshooting

@jhamilton1
Contributor Author

Awesome, thanks @nrb!!

4 participants