Random failure when manually run e2e test for [Snapshot] #4159

Closed
reasonerjt opened this issue Sep 18, 2021 · 4 comments
Labels
E2E Tests End to end test

Comments

@reasonerjt
Contributor

reasonerjt commented Sep 18, 2021

What steps did you take and what happened:
When I run the e2e tests manually, the following case fails intermittently:

[Snapshot] Velero tests on cluster using the plugin provider for object storage and snapshots for volume backups when kibishii is the sample workload
  should successfully back up and restore to an additional BackupStorageLocation with unique credentials

The failure is not consistent between runs. The restore ends with status PartiallyFailed, but I don't see any errors in any of the logs.

The command I used:

CLOUD_PROVIDER=aws \
    VSL_CONFIG=region=us-east-2 \
    BSL_CONFIG=region=us-east-2 \
    CREDS_FILE=REDACTED \
    BSL_BUCKET=REDACTED \
    ADDITIONAL_OBJECT_STORE_PROVIDER=aws \
    ADDITIONAL_BSL_CONFIG=region=us-east-2 \
    ADDITIONAL_BSL_BUCKET=REDACTED \
    ADDITIONAL_CREDS_FILE=REDACTED \
    GINKGO_FOCUS='\[Snapshot\] Velero tests on cluster' \
    VELERO_IMAGE=velero/velero:v1.7.0-rc.1 \
    REGISTRY_CREDENTIAL_FILE=REDACTED \
    VERSION=v1.7.0-rc.1 \
    GOPATH=/Users/jiangd/go/ make test-e2e

The output for the command:
output.txt

The complete debug bundle:
debug-bundle-1631946757127472000.tar.gz

Environment:

  • Velero version (use velero version): 1.7.0-rc.1
  • Kubernetes version (use kubectl version):

Vote on this issue!

This is an invitation to the Velero community to vote on issues. You can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
@reasonerjt reasonerjt added the E2E Tests End to end test label Sep 18, 2021
@reasonerjt
Contributor Author

I can't always reproduce it, but we need to double-check whether it is caused by instability in the e2e test or by a bug in Velero.

@reasonerjt
Contributor Author

The error shows up in the output of velero restore describe:

Name:         restore-default-86660c20-3a15-496f-80a9-d23c55ff6e03
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:                       PartiallyFailed (run 'velero restore logs restore-default-86660c20-3a15-496f-80a9-d23c55ff6e03' for more information)
Total items to be restored:  35
Items restored:              35

Started:    2021-09-18 21:55:59 +0800 CST
Completed:  2021-09-18 21:56:04 +0800 CST

Errors:
  Velero:     <none>
  Cluster:  error executing PVAction for persistentvolumes/pvc-269b3585-f53a-4208-b968-ec456c25c90e: rpc error: code = Unknown desc = IncorrectState: Snapshot is in invalid state - pending
  status code: 400, request id: ea25a4ac-ab88-47f2-840c-2f8bb784a2ef
    error executing PVAction for persistentvolumes/pvc-ee426267-81b5-4c00-a4fd-e54971cf7e19: rpc error: code = Unknown desc = IncorrectState: Snapshot is in invalid state - pending
  status code: 400, request id: 782f9c68-638a-4f72-957d-6513c20b5d66

@ywk253100
Contributor

The failure was introduced by #4058, which removed the waiting logic for AWS snapshots.
@danfengliu has submitted PR #4160 to restore it.

@reasonerjt
Contributor Author

This can be considered fixed now that #4160 and #4161 are merged.
