Bug 1812237: libvirt installer template: increase teardown timeout, add setup-fail marker #7568

Merged 8 commits on Apr 9, 2020


sallyom
Contributor

@sallyom sallyom commented Mar 10, 2020

I've updated CI scripts and images:

I've pulled in this, too: #7572

@smarterclayton @ironcladlou As more teams depend on libvirt CI, it makes sense to reconsider pulling the openshift4-libvirt-gcp repo into openshift so this doesn't happen again. The scripts are badly outdated, and there's no chance of libvirt CI passing with them as is. Once they are updated, that repo is very useful for anyone who wants to spin up a nested libvirt dev cluster on a single GCP instance. It solves the problem of not being able to, or not wanting to, run a libvirt cluster on your own system.

/cc @praveenkumar

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 10, 2020
@patrickdillon
Contributor

In general LGTM. I will wait for tests/give others a chance to review.

Is the problem that, in the case of setup failure, the pods were hitting the 4-hour time limit and teardown was never invoked?

@sallyom
Contributor Author

sallyom commented Mar 11, 2020

> In general LGTM. I will wait for tests/give others a chance to review.
>
> Is the problem that, in the case of setup failure, the pods were hitting the 4-hour time limit and teardown was never invoked?

Yes, that's the issue. I'm not sure this PR fixes it yet: the scripts were very outdated, and I have to update them before I can test this (ironcladlou/openshift4-libvirt-gcp#25).
Until that merges, I'll hold this.

/hold
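
The failure mode discussed above (setup fails, teardown is never invoked, and the pod sits until the 4-hour limit) can be illustrated with marker files. This is only a minimal sketch, not the template's actual code: the marker names `setup-success` and `setup-failed`, the shared directory, and both helper functions are hypothetical.

```shell
# Hypothetical sketch: setup writes a success or failure marker into a shared
# directory, and teardown waits for either marker with an upper bound.

run_setup() {
    # $1: shared directory; remaining args: the setup command to run.
    shared_dir="$1"; shift
    if "$@"; then
        touch "${shared_dir}/setup-success"
    else
        touch "${shared_dir}/setup-failed"
        return 1
    fi
}

wait_for_setup() {
    # $1: shared directory; $2: maximum number of one-second polls.
    shared_dir="$1"; max_polls="$2"; i=0
    while [ "$i" -lt "$max_polls" ]; do
        if [ -f "${shared_dir}/setup-success" ]; then echo "success"; return 0; fi
        if [ -f "${shared_dir}/setup-failed" ]; then echo "failed"; return 0; fi
        i=$((i + 1))
        sleep 1
    done
    echo "timeout"
    return 1
}
```

With a failure marker in place, a teardown step could call `wait_for_setup` and proceed to cleanup on either `success` or `failed`, rather than blocking until the pod's time limit.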

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 11, 2020
@praveenkumar
Contributor

> As more teams are depending on libvirt CI, it makes sense to reconsider pulling the openshift4-libvirt-gcp repo into openshift, so this doesn't happen again - the scripts are way outdated

I agree. If we make those scripts part of the openshift repo, they will get a bit more attention and be managed properly.

@sallyom sallyom force-pushed the bz1812237 branch 5 times, most recently from da0516f to f5a6964 Compare March 12, 2020 17:33
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 12, 2020
@sallyom sallyom force-pushed the bz1812237 branch 2 times, most recently from 8c188ad to f07c7f2 Compare March 13, 2020 01:47
@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 13, 2020
@sallyom
Contributor Author

sallyom commented Mar 13, 2020

/hold cancel

@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 13, 2020
@jstuever
Contributor

/lgtm
I recommend working with the test platform team to ensure these resources are cleaned up by the periodic-ipi-deprovision job.

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 13, 2020
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Mar 16, 2020
@sallyom
Contributor Author

sallyom commented Mar 17, 2020

So close to a complete install:
"100% complete, waiting on kube-storage-version-migrator"
EDIT: It turns out installs require 2 workers, so I'm bumping to 2 workers and giving the install extra time to finish beyond the default installer timeout.
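
One way to give an install extra time beyond a command's built-in timeout is to re-run the waiting step a bounded number of times. The helper below is a hypothetical sketch of that idea, not the change made in this PR; the function name `retry` and its arguments are assumptions.

```shell
# Hypothetical helper: run a command up to N times, stopping at the first success.
retry() {
    # $1: maximum attempts; remaining args: the command to run.
    attempts="$1"; shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt ${n}/${attempts} failed" >&2
        n=$((n + 1))
    done
    return 1
}
```

For example, `retry 3 openshift-install wait-for install-complete --dir "$dir"` would re-run the wait up to three times. The retry count and the `$dir` variable are illustrative, not values taken from this PR.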

@sallyom
Contributor Author

sallyom commented Apr 6, 2020

/retest

Setting --subnet-mode=auto is wasteful in this case because
auto mode creates a subnet in every region. Setting --subnet-mode=custom
creates no subnets automatically, so a single subnet can be created in one region.
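
As a rough illustration of the custom-mode setup described in this commit message, the sketch below prints the two `gcloud` commands involved: one to create the network and one to create its single subnet. The function name, network name, subnet name, region, and CIDR range are all placeholders, not values from this PR.

```shell
# Print (rather than run) the gcloud commands for a custom-mode network with
# one subnet. Pipe the output to `sh` to execute them against a real project.
custom_network_cmds() {
    # $1: network name, $2: subnet name, $3: region, $4: CIDR range (all illustrative)
    echo "gcloud compute networks create $1 --subnet-mode=custom"
    echo "gcloud compute networks subnets create $2 --network=$1 --region=$3 --range=$4"
}

custom_network_cmds example-net example-subnet us-east1 10.0.0.0/24
```

Printing the commands first makes it easy to review them before anything is created.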
@sallyom sallyom force-pushed the bz1812237 branch 2 times, most recently from f18912f to c9a0798 Compare April 7, 2020 18:51
@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 7, 2020

@sallyom: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/rehearse/openshift/installer/master/e2e-steps | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-gcp-upgrade | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-aws-upgrade | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/images-build01 | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-azure-upi | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-ovirt | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-aws-disruptive | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-vsphere | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-metal | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-gcp | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-aws-scaleup-rhel7 | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |
| ci/rehearse/openshift/installer/master/e2e-azure-shared-vpc | 8c188ad45ccdb6e61e73552e604e4cebd0809dd1 | link | /test pj-rehearse |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sallyom
Contributor Author

sallyom commented Apr 8, 2020

/retest

@sallyom
Contributor Author

sallyom commented Apr 8, 2020

/retest

@zeenix
Contributor

zeenix commented Apr 9, 2020

@sallyom Awesome work!

/lgtm
/approve

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 9, 2020
@zeenix
Contributor

zeenix commented Apr 9, 2020

/assign @smarterclayton

@crawford
Contributor

crawford commented Apr 9, 2020

/approve

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crawford, jstuever, praveenkumar, sallyom, zeenix

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 9, 2020
@openshift-merge-robot openshift-merge-robot merged commit a80ed8d into openshift:master Apr 9, 2020
@openshift-ci-robot
Contributor

@sallyom: All pull requests linked via external trackers have merged: openshift/release#7568. Bugzilla bug 1812237 has been moved to the MODIFIED state.

@openshift-ci-robot
Contributor

@sallyom: Updated the following 7 configmaps:

  • prow-job-cluster-launch-installer-libvirt-e2e configmap in namespace ci at cluster app.ci using the following files:
    • key cluster-launch-installer-libvirt-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-libvirt-e2e.yaml
  • prow-job-cluster-launch-installer-libvirt-e2e configmap in namespace ci-stg at cluster app.ci using the following files:
    • key cluster-launch-installer-libvirt-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-libvirt-e2e.yaml
  • prow-job-cluster-launch-installer-libvirt-e2e configmap in namespace ci at cluster ci/api-build01-ci-devcluster-openshift-com:6443 using the following files:
    • key cluster-launch-installer-libvirt-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-libvirt-e2e.yaml
  • job-config-master configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-installer-master-presubmits.yaml using file ci-operator/jobs/openshift/installer/openshift-installer-master-presubmits.yaml
  • job-config-master configmap in namespace ci at cluster api.ci using the following files:
    • key openshift-installer-master-presubmits.yaml using file ci-operator/jobs/openshift/installer/openshift-installer-master-presubmits.yaml
  • prow-job-cluster-launch-installer-libvirt-e2e configmap in namespace ci at cluster api.ci using the following files:
    • key cluster-launch-installer-libvirt-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-libvirt-e2e.yaml
  • prow-job-cluster-launch-installer-libvirt-e2e configmap in namespace ci-stg at cluster api.ci using the following files:
    • key cluster-launch-installer-libvirt-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-libvirt-e2e.yaml

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
lgtm: Indicates that a PR is ready to be merged.
size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.
9 participants