Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Snapshot support #25

Closed
3 of 4 tasks
leakingtapan opened this issue Sep 28, 2018 · 15 comments
Closed
3 of 4 tasks

Implement Snapshot support #25

leakingtapan opened this issue Sep 28, 2018 · 15 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@leakingtapan
Copy link
Contributor

leakingtapan commented Sep 28, 2018

Driver implementation should support following operations:

References:

@leakingtapan leakingtapan added the feature New feature to implement label Sep 28, 2018
@leakingtapan leakingtapan added this to the 0.3-alpha milestone Sep 28, 2018
@leakingtapan leakingtapan changed the title Add Snapshot support Implement Snapshot support Oct 3, 2018
@bertinatto
Copy link
Member

CC @tsmetana

@bertinatto
Copy link
Member

/assign @tsmetana

@k8s-ci-robot
Copy link
Contributor

@bertinatto: GitHub didn't allow me to assign the following users: tsmetana.

Note that only kubernetes-sigs members and repo collaborators can be assigned.
For more information please see the contributor guide

In response to this:

/assign @tsmetana

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@leakingtapan
Copy link
Contributor Author

Punt the feature to beta release

@leakingtapan leakingtapan modified the milestones: alpha, beta Nov 9, 2018
@leakingtapan leakingtapan added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 16, 2018
@leakingtapan
Copy link
Contributor Author

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 19, 2018
@leakingtapan leakingtapan removed the feature New feature to implement label Nov 19, 2018
@tsmetana
Copy link
Contributor

First version (does not implement snapshot listing yet): https://github.com/tsmetana/aws-ebs-csi-driver/tree/snapshots-updated

It's based on PR#122 (CSI-1.0) so I will wait with my PR.

@bertinatto
Copy link
Member

@tsmetana: we just merged my PR into next branch. If you want to proceed with you PR you can point it against next as well.

@tsmetana
Copy link
Contributor

Thanks. Created the promised PR.

@leakingtapan
Copy link
Contributor Author

/assign @dkoshkin
Dimitri will help debugging issues related to snapshot

@tsmetana
Copy link
Contributor

tsmetana commented Feb 12, 2019

OK. I'm not sure which is the right place to discuss this... What is happening with the snapshots: in the waitForSnapshotCreate there is an truncated exponential backoff with a condition function that basically polls for the snapshot unitl it's "completed" or an error occurs. The polling loop uses ec2.DescribeSnapshotsWithContext(ctx, ...). After few iterations this call fails with GRPC error ("DeadlineExceeded" iirc).

Looks like the issue is that the CreateSnapshot CSI call should be blocking (by specification) but the GRPC connection times out prematurely -- it's unknown how long time a snapshot creation might take. The controllers (external snapshotter in our case) should be able to handle that -- for example external attacher distinguishes the fatal and transient errors like this: https://github.com/kubernetes-csi/external-attacher/blob/master/pkg/connection/connection.go#L256

I guess similar mechanism needs to be implemented also in the external snapshotter, otherwise this patch will not work...

Edit: In case of AWS it might be possible to avoid the error by simply not waiting for the snapshot to complete and return immediately with a snapshot in not ReadyToUse state. The snapshotter should then re-try until the snapshot is ready.

@tsmetana
Copy link
Contributor

The issues I mentioned in the previous comment were worked around in the patch that's been recently merged.

@leakingtapan
Copy link
Contributor Author

leakingtapan commented Feb 21, 2019

@tsmetana your snapshot change is merged into master branch

just to confirm that is ListSnapshot not used by k8s yet, so it is not implemented?

@tsmetana
Copy link
Contributor

No, ListSnapshot is not implemented yet. I'll add it to my TODO.

@leakingtapan
Copy link
Contributor Author

@tsmetana I created a follow up issue for the left task: #233

I am going to close this one since the major tasks are done.

/close

@k8s-ci-robot
Copy link
Contributor

@leakingtapan: Closing this issue.

In response to this:

@tsmetana I created a follow up issue for the left task: #233

I am going to close this one since the major tasks are done.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

No branches or pull requests

5 participants