Automated VCR re-recording #75

Merged
merged 17 commits into from
Dec 31, 2017

cben
Contributor

@cben cben commented Dec 10, 2017

Something I've long wanted, a script for hands-off VCR re-recording 🎉

Can use minishift or (TODO: untested) bring your own OpenShift by setting the OPENSHIFT_MASTER_HOST env var.

Steps I took for new VCR recording:

  1. Installed minishift v1.9.0, with manageiq plugin added according to https://github.com/ManageIQ/guides/blob/master/providers/openshift.md
  2. Ran:
    minishift start --vm-driver virtualbox --openshift-version v3.6.1 --metrics --memory 5G
    
    (Metrics are needed to use the same VCR in manageiq-providers-kubernetes, at least as that spec is currently written. Metrics need lots of RAM. --metrics doesn't work yet with v3.7.0.)
  3. Then waited for metrics to come up, as confirmed by:
    oc get pods -n openshift-infra
    curl --insecure "https://$(oc get route -n openshift-infra hawkular-metrics --template '{{.spec.host}}')"
    
  4. Ran the new script:
    ./spec/vcr_cassettes/manageiq/providers/openshift/container_manager/test_objects_record.sh
    
  5. git commit, profit :-)
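
Step 3 above was a manual check. A minimal sketch of how that wait could be polled instead, assuming a hypothetical `wait_until` helper and a `metrics_up` check built from the two commands in step 3 (the `--fail` flag and the timeout values are assumptions, not something this PR uses):

```shell
#!/bin/bash
# Hypothetical helper: poll a command until it succeeds or a timeout expires.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Assumed check: succeeds once the hawkular-metrics route answers over HTTPS.
# --fail makes curl return non-zero on HTTP errors (an assumption about what
# "up" should mean here).
metrics_up() {
  local host
  host="$(oc get route -n openshift-infra hawkular-metrics --template '{{.spec.host}}')" || return 1
  curl --insecure --silent --fail --output /dev/null "https://$host"
}

# Example (not run here): wait up to 10 minutes, polling every 10 seconds.
# wait_until 600 10 metrics_up
```

The helper only wraps the existing manual check; whether the root URL of the route is a good readiness signal is left open.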

Issues I didn't fix

  • Caught a difference in the number of images between old and new refresh! 🐛
    We already had a not-fully-explained off-by-one there.
    I would like to merge this as a pending test and debug later (will open an issue).
  • TODO: Object counts are not yet reproducible between runs.
    However, they're mostly stable.
  • TODO: API_TOKEN gets stored in the VCR cassette (it was in the previous cassette, too).
    This may be a problem if somebody points the script at a long-lived env.
  • Sometimes project creation errors out because the cluster is still cleaning up from project deletion, despite the script waiting. I'll try to report an OpenShift bug, but just re-running this script helps. I decided not to introduce retrying for this; it would make the script flow much more complicated.
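
On the API_TOKEN TODO: one possible mitigation, not part of this PR, is to scrub the recorded file after the fact. A minimal sketch, assuming the cassette is a plain-text file and that swapping the real token for the spec's fallback value `theToken` does not affect playback; `scrub_token` is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical post-processing step: replace every literal occurrence of a
# token in a recorded file with a placeholder (default: theToken).
# Usage: scrub_token FILE TOKEN [PLACEHOLDER]
scrub_token() {
  local file="$1" token="$2" placeholder="${3:-theToken}" line tmp
  [ -n "$token" ] || return 0   # nothing to scrub
  tmp="$(mktemp)"
  # Read line by line; the quoted pattern is matched literally, not as a glob.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%s\n' "${line//"$token"/$placeholder}"
  done < "$file" > "$tmp"
  mv "$tmp" "$file"
}

# Example (not run here):
# scrub_token "$VCR_DIR"/refresher_after_deletions.yml "$API_TOKEN"
```

This only hides the token in the committed file; it does not revoke it, so a short-lived token is still the safer choice.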

cc @yaacov @enoodle @Ladas

Need this gaprindashvili/yes for improving quotas test
https://bugzilla.redhat.com/show_bug.cgi?id=1504560

@miq-bot miq-bot added the wip label Dec 10, 2017
Member

@yaacov yaacov left a comment

Nice, LGTM 👍

@miq-bot
Member

miq-bot commented Dec 10, 2017

This pull request is not mergeable. Please rebase and repush.

@enoodle enoodle left a comment

LGTM pending green tests

@Ladas
Contributor

Ladas commented Dec 11, 2017

@cben looking e.g. at the ContainerReplicator.count failure:

(byebug) pp ContainerReplicator.pluck(:ems_ref, :name)
[["7b36741b-dd83-11e7-8319-169d0f7fea7d", "docker-registry-1"],
 ["7beb37af-dd83-11e7-8319-169d0f7fea7d", "router-1"],
 ["bf9b8058-dd84-11e7-8319-169d0f7fea7d", "my-replicationcontroller-2"],
 ["ea539a31-dd83-11e7-8319-169d0f7fea7d", "hawkular-cassandra-1"],
 ["e518dc66-dd83-11e7-8319-169d0f7fea7d", "hawkular-metrics"],
 ["e61f1832-dd83-11e7-8319-169d0f7fea7d", "heapster"]]

this looks correct, but spec thinks we deleted 2, so there should be 3 total. Is it possible the env. is either not deterministic, or there are more changes happening?

because

expect(ContainerReplicator.find_by(:name => "my-replicationcontroller-0")).to be_nil
expect(ContainerReplicator.find_by(:name => "my-replicationcontroller-1")).to be_nil

would pass, it's just that something/somebody created plenty more ContainerReplicator records in the meantime :-) I expect that is the reason why the other counts do not match.

cben added 10 commits December 25, 2017 13:01
- compare together to get all differences printed together
- express counts after deletions as subtractions
Prevents ContainerProject.count "expected: 10 got: 11" failure.
Seems to avoid error that I saw in most runs with 3.7.0 (minishift):

Error from server (Forbidden): persistentvolumeclaims "my-persistentvolumeclaim-1" is forbidden: status unknown for quota: my-resource-quota-1

I guess if quota is created last, it's not really enforced,
but we don't care.
Env created with minishift v1.9.0:

minishift start --vm-driver virtualbox --openshift-version v3.6.1 --metrics --memory 5G

Then waited for metrics to come up, as confirmed by:

oc get pods -n openshift-infra
curl --insecure "https://$(oc get route -n openshift-infra hawkular-metrics --template '{{.spec.host}}')"
@cben cben changed the title [WIP] Automated VCR re-recording Automated VCR re-recording Dec 25, 2017
@miq-bot miq-bot removed the wip label Dec 25, 2017
@cben
Contributor Author

cben commented Dec 25, 2017

PTAL. Still getting small count difference between recordings, but it's reasonably stable and I want to move on. Added text files which will make "what changed between recordings" much easier to track with future changes...

  • I still need to look into Ladislav's comment above. I've since re-recorded multiple times, but I probably have that recording in many-vcrs branch.
    => Obsolete, sounds like metrics was still coming up during that recording.
  • TODO: I'm not sure I have image example I promised Ladislav.
    => I don't; it's hard to capture a pod with VCR, so instead I added refresh_parser_spec captured by watch
  • Added quotas with scopes to template. May need more changes later but don't know which yet.
  • I again see an image count difference between old and graph refresh. Marked pending. I'm happy I have it caught; debugging and fixing are out of scope for this PR.

expect(ContainerProject.count).to eq(object_counts['ContainerProject'])
expect(ContainerProject.active.count).to eq(object_counts['ContainerProject'] - 1)

pending("why graph refresh DELETES 1 image from DB? why old refresh archives 2?")

👍

token = 'theToken'
# env vars for easier VCR recording, see test_objects_record.sh
hostname = ENV["API_HOST"] || "host.example.com"
token = ENV["API_TOKEN"] || "theToken"

We have some automation around cluster creation and there these are named:

OPENSHIFT_MASTER_HOST
OPENSHIFT_MANAGEMENT_ADMIN_TOKEN

It could prove useful to have the same names, up to you.
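
One way to support both naming schemes at once, sketched under the assumption that the automation's names should win when set; `resolve_api_env` is a hypothetical function name, and the fallback values are the placeholders from the spec hunk above:

```shell
#!/bin/bash
# Sketch: prefer the automation's variable names, then the names used in this
# PR, then the placeholder defaults from the spec.
resolve_api_env() {
  API_HOST="${OPENSHIFT_MASTER_HOST:-${API_HOST:-host.example.com}}"
  API_TOKEN="${OPENSHIFT_MANAGEMENT_ADMIN_TOKEN:-${API_TOKEN:-theToken}}"
  export API_HOST API_TOKEN
}
```

Calling `resolve_api_env` near the top of the script would let the rest of it keep using API_HOST and API_TOKEN unchanged.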

#!/bin/bash

set -e # abort on error

Since this isn't using functions, can we have section headers for readability? E.g.

#
# Ensure API Server
#

Contributor Author

Added an echo, I'm using these as both headings in code and headings in script output.
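
A sketch of what such a dual-purpose heading could look like if wrapped in a function. The PR itself uses plain echo lines; `section` and its output format are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: the call site reads as a heading in the source, and the
# printed line acts as a heading in the script's output.
section() {
  printf '\n=== %s ===\n' "$*"
}

section "Ensure API server"
```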

echo 'or have minishift in $PATH and already running, e.g.'
echo ' minishift addons enable manageiq'
echo ' minishift start --vm-driver virtualbox --openshift-version v3.7.0'
exit 1

you could add a link to this page explaining deployment options:
http://manageiq.org/docs/guides/providers/openshift

if which minishift && minishift status | grep -i 'openshift:.*running'; then
export API_HOST="$(minishift ip)"
eval $(minishift oc-env --shell bash) # Ensure oc in PATH
oc login -u system:admin # With minishift, we know we can just do this

Where do we log in to a non-minishift cluster?
Or do we assume we are already logged in?
I'd say running on the OpenShift master is a good assumption (API_HOST=localhost).

Contributor Author

For non-minishift BYO, the script expects you to already be logged in.

Running on the OpenShift master is inconvenient: you need manageiq checked out & working (at least the test DB) to run this script, and the script leaves updated files you'll want to git commit.
I should document how to run oc adm policy add-cluster-role-to-user cluster-admin <you> to allow remote login.

I've never actually tested the script in BYO mode. Let me try...


VCR_DIR="$(dirname "$(realpath "$0")")"

cd "$(realpath "$0" | sed 's:\(manageiq-providers-openshift/\).*$:\1:')" # repo root

cd "$(git rev-parse --show-toplevel)"

oc start-build my-build-config-$ind
done

while out="$(oc get build --all-namespaces)"; echo "$out"; [ "$(echo "$out" | egrep --count --word 'Complete|Failed|Error')" -ne 3 ]; do

Please upcase all the variables
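
For illustration, the loop above with upper-case variable names might look like the following sketch. The function names `count_finished_builds` and `wait_for_builds` are hypothetical; the status pattern, the expected count of 3 and the 3-second sleep are taken from the hunks in this PR:

```shell
#!/bin/bash
# Count input lines containing a terminal build status as a whole word.
# grep exits non-zero when the count is 0, so `|| true` keeps `set -e` happy.
count_finished_builds() {
  grep -E -c -w 'Complete|Failed|Error' || true
}

# Poll until the expected number of builds has finished (needs `oc` on PATH).
wait_for_builds() {
  local EXPECTED_BUILDS="${1:-3}" OUT
  while OUT="$(oc get build --all-namespaces)"; echo "$OUT"; \
        [ "$(echo "$OUT" | count_finished_builds)" -ne "$EXPECTED_BUILDS" ]; do
    sleep 3
  done
}

# Example (not run here):
# wait_for_builds 3
```

Splitting the counting into its own function also makes that part checkable without a cluster.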

@moolitayer

Looks great, and takes us a long way.
This is meant to be run from a developer machine, right (not on an automation server)?

@cben
Contributor Author

cben commented Dec 26, 2017

This is meant to be run from a developer machine, right (not on an automation server)?

Right, however it's very close to what we'd need for an actually frequently run integration test, if we'll want that.

@miq-bot
Member

miq-bot commented Dec 26, 2017

Checked commits cben/manageiq-providers-openshift@d80e778~...466a85c with ruby 2.3.3, rubocop 0.47.1, haml-lint 0.20.0, and yamllint 1.10.0
6 files checked, 1 offense detected

  • 💣 💥 🔥 🚒 - Linter/Yaml - missing config files

@cben
Contributor Author

cben commented Dec 26, 2017

Oops, I was pushing to a different branch, so you didn't see the changes I was claiming, sorry :-(
Pushed now.
AFAICT, the last PR state you saw was cben/manageiq-providers-openshift@bf456cf...ab7707b, or, rebased onto the current base, 89a8936...cben:vcr-script-old-review-rebased
I can't give you a GitHub link that shows only the new parts, but if you want, you can diff locally (includes some rebase noise):

git remote add cben https://github.com/cben/manageiq-providers-openshift
git fetch cben
git diff vcr-script-old-review-rebased..cben/vcr-script

I believe all comments addressed (see recent commits).

@miq-bot add-label gaprindashvili/yes
will need new vcr for improving quotas test

sleep 3
done

for ind in 0 1 2; do

Upcase var please

Contributor Author

done


rm -v "$VCR_DIR"/refresher_after_deletions.{yml,txt} || true
describe_vcr > "$VCR_DIR"/refresher_after_deletions.txt
env RECORD_VCR=after_deletions bundle exec rspec "$SPEC" || echo "^^ FAILURES ARE POSSIBLE, YOU'LL HAVE TO EDIT THE SPEC"

Are FAILURES expected due to the status issue above? something else?

Contributor Author

Due to recordings not being 100% reproducible. It's quite stable now, but usually at least something is still off by one. "An ideal world is left as an exercise to the reader."

Additionally, if you switch OpenShift versions, or change the template deliberately, you'll get certain differences.

In any case, this script should not abort on test failure. It should let you get both recordings, and then you can edit & run the spec again and again...

Better wording welcome.
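
The "report but don't abort" behaviour relies on `cmd || echo ...` not tripping `set -e`. A minimal sketch of that pattern as a reusable step, assuming a hypothetical `record_step` wrapper (the message text is the one from the hunk above):

```shell
#!/bin/bash
# Sketch: run one recording step; on failure print a warning but return
# success, so a caller running under `set -e` continues to the next step.
record_step() {
  local LABEL="$1"
  shift
  "$@" || echo "^^ $LABEL: FAILURES ARE POSSIBLE, YOU'LL HAVE TO EDIT THE SPEC"
}

# Example (not run here):
# set -e
# record_step after_deletions env RECORD_VCR=after_deletions bundle exec rspec "$SPEC"
```

Because the failing command sits on the left of `||`, the shell does not exit even with `set -e` active, and later steps still run.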

@cben
Contributor Author

cben commented Dec 31, 2017

ping @moolitayer anything else needed here?

@moolitayer moolitayer left a comment

LGTM 👍

@moolitayer moolitayer merged commit 6e8842e into ManageIQ:master Dec 31, 2017
@moolitayer moolitayer added this to the Sprint 76 Ending Jan 1, 2018 milestone Dec 31, 2017
@moolitayer

Awesome 💪

simaishi pushed a commit that referenced this pull request Jan 3, 2018
Automated VCR re-recording
(cherry picked from commit 6e8842e)
@simaishi

simaishi commented Jan 3, 2018

Gaprindashvili backport details:

$ git log -1
commit 3731370ee732f07861b6d4598ff0befcbace3dbf
Author: Mooli Tayer <[email protected]>
Date:   Sun Dec 31 19:56:57 2017 +0200

    Merge pull request #75 from cben/vcr-script
    
    Automated VCR re-recording
    (cherry picked from commit 6e8842e4592433ea2ce7ade2fb2d8806da9b8053)


7 participants