Improve test runner to re-use instances and make provisioners pluggable (elastic#3136)

Refactors the way integration test runner works to provide the following
benefits:

* Instance re-use - Call to any `mage integration:*` target that
requires the usage of a deployed instance will re-use an already
deployed instance. A specific call to `mage integration:clean` is
required to bring down created instances and stacks.
* Copy builds only if required - With instance re-use support the code
has been changed to only copy the Elastic Agent builds to an instance in
the case that the build has changed.
* Add `InstanceProvisioner` interface - This interface defines the contract
between the runner and the instance provisioner, which allows other
instance provisioners to be added to the test runner.
* Add `StackProvisioner` interface - This interface defines the contract
between the runner and the stack provisioner, which allows other stack
provisioners to be added to the test runner.
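
The "copy builds only if required" behaviour can be pictured with a small sketch. The commit message does not say how the runner detects a changed build, so the checksum comparison below is an assumption, and `buildCache`, `checksum`, and `needsCopy` are hypothetical names used only for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// buildCache remembers the checksum of the build last copied to each instance.
// Hypothetical type for illustration; it is not part of the test runner.
type buildCache map[string]string

// checksum returns the hex-encoded SHA-256 digest of a build artifact.
func checksum(build []byte) string {
	sum := sha256.Sum256(build)
	return hex.EncodeToString(sum[:])
}

// needsCopy reports whether the build differs from what the instance already
// has, and records the new checksum when it does.
func (c buildCache) needsCopy(instanceID string, build []byte) bool {
	sum := checksum(build)
	if c[instanceID] == sum {
		return false
	}
	c[instanceID] = sum
	return true
}

func main() {
	cache := buildCache{}
	fmt.Println(cache.needsCopy("linux-amd64", []byte("build-1"))) // first run: copy
	fmt.Println(cache.needsCopy("linux-amd64", []byte("build-1"))) // unchanged: skip
	fmt.Println(cache.needsCopy("linux-amd64", []byte("build-2"))) // changed: copy
}
```

Keying the cache by instance means a new instance always receives the build, while a re-used instance only receives it after a rebuild.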
blakerouse authored Aug 1, 2023
1 parent b60b8b0 commit 110a7fe
Showing 21 changed files with 1,361 additions and 749 deletions.
1 change: 1 addition & 0 deletions .buildkite/scripts/steps/integration_tests.sh
@@ -10,6 +10,7 @@ DEV=true EXTERNAL=true SNAPSHOT=true PLATFORMS=linux/amd64,linux/arm64 PACKAGES=
set +e
SNAPSHOT=true mage integration:test
TESTS_EXIT_STATUS=$?
mage integration:clean
set -e

# HTML report
1 change: 1 addition & 0 deletions .gitignore
@@ -1,5 +1,6 @@
# Directories
/.agent-testing
/.integration-cache
/.ogc-cache
/.vagrant
/.idea
29 changes: 0 additions & 29 deletions NOTICE.txt
@@ -4687,35 +4687,6 @@ OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


--------------------------------------------------------------------------------
Dependency : github.com/rs/xid
Version: v1.3.0
Licence type (autodetected): MIT
--------------------------------------------------------------------------------

Contents of probable licence file $GOMODCACHE/github.com/rs/xid@v1.3.0/LICENSE:

Copyright (c) 2015 Olivier Poitrey <[email protected]>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is furnished
to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.


--------------------------------------------------------------------------------
Dependency : github.com/rs/zerolog
Version: v1.27.0
29 changes: 22 additions & 7 deletions docs/test-framework-dev-guide.md
@@ -49,15 +49,12 @@ pass `[testName]` to `go test` as `--run=[testName]`.

- `mage integration:matrix` to run all tests on the complete matrix of supported operating systems and architectures of the Elastic Agent.

### Manually running the tests
#### Cleaning up resources

If you want to run the tests manually, skipping the test runner, set the
`TEST_DEFINE_PREFIX` environment variable to any value and run your tests normally
with `go test`. E.g.:
The test run keeps provisioned resources (instances and stacks) around after the tests have run. This allows
subsequent `mage integration:*` commands to re-use the already provisioned resources.

```shell
TEST_DEFINE_PREFIX=gambiarra go test -v -tags integration -run TestProxyURL ./testing/integration/
```
- `mage integration:clean` will de-provision the allocated resources and clean up any local state.

Tests with external dependencies might need more environment variables to be set
when running them manually, such as `ELASTICSEARCH_HOST`, `ELASTICSEARCH_USERNAME`,
@@ -99,6 +96,16 @@ We pass a `-test.run` flag along with the names of the tests we want to run in O
Due to the way the parameters are passed to `devtools.GoTest` the value of the environment variable
is split on space, so not all combinations of flags and their values may be correctly split.
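
As a rough illustration of the splitting caveat, the sketch below applies a plain split on the space character to two flag strings. `splitFlags` is a hypothetical helper, not the code used by `devtools.GoTest`; it only assumes that the value is split on single spaces, as described above.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFlags mimics a plain split on the space character.
// Hypothetical helper for illustration; quoting is not interpreted.
func splitFlags(value string) []string {
	return strings.Split(value, " ")
}

func main() {
	// Flags without embedded spaces split into one element per flag.
	fmt.Printf("%q\n", splitFlags("-test.v -test.count=1"))

	// A quoted value that contains a space is broken into two elements,
	// and the quote characters stay attached to the fragments.
	fmt.Printf("%q\n", splitFlags(`-test.run "TestA TestB"`))
}
```

The second call yields three elements instead of two, which is why flag values that contain spaces are best avoided.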

## Manually running the tests

If you want to run the tests manually, skipping the test runner, set the
`TEST_DEFINE_PREFIX` environment variable to any value and run your tests normally
with `go test`. E.g.:

```shell
TEST_DEFINE_PREFIX=gambiarra go test -v -tags integration -run TestProxyURL ./testing/integration/
```

## Writing tests

Write integration and E2E tests by adding them to the `testing/integration`
@@ -154,6 +161,14 @@ the `mage package` command OR set the `AGENT_VERSION` environment variable to a
that includes the `-SNAPSHOT` suffix when running `mage integration:test` or
`mage integration:local`.

### Failures on reused resources
The integration framework tries to re-use resources when it can. This improves the speed at
which the tests can run, but also means it's possible for a failed test to leave state behind
that can break future runs.

Run `mage integration:clean` before running `mage integration:test` to ensure the tests are
being run with fresh instances and stacks.

### OGC-related errors
If you encounter any errors mentioning `ogc`, try running `mage integration:clean` and then
re-running whatever `mage integration:*` target you were trying to run originally when you
1 change: 0 additions & 1 deletion go.mod
@@ -42,7 +42,6 @@ require (
github.com/otiai10/copy v1.11.0
github.com/pierrre/gotestcover v0.0.0-20160517101806-924dca7d15f0
github.com/pkg/errors v0.9.1
github.com/rs/xid v1.3.0
github.com/rs/zerolog v1.27.0
github.com/shirou/gopsutil/v3 v3.21.12
github.com/sirupsen/logrus v1.9.0
1 change: 0 additions & 1 deletion go.sum
@@ -1104,7 +1104,6 @@ github.com/rogpeppe/go-internal v1.5.2/go.mod h1:xXDCJY+GAPziupqXw64V24skbSoqbTE
github.com/rogpeppe/go-internal v1.6.1/go.mod h1:xXDCJY+GAPziupqXw64V24skbSoqbTEfhy4qGm1nDQc=
github.com/rogpeppe/go-internal v1.8.1 h1:geMPLpDpQOgVyCg5z5GoRwLHepNdb71NXb67XFkP+Eg=
github.com/rogpeppe/go-internal v1.8.1/go.mod h1:JeRgkft04UBgHMgCIwADu4Pn6Mtm5d4nPKWu0nJ5d+o=
github.com/rs/xid v1.3.0 h1:6NjYksEUlhurdVehpc7S7dk6DAmcKv8V9gG0FsVN2U4=
github.com/rs/xid v1.3.0/go.mod h1:trrq9SKmegXys3aeAKXMUTdJsYXVwGY3RLcfgqegfbg=
github.com/rs/zerolog v1.27.0 h1:1T7qCieN22GVc8S4Q2yuexzBb1EqjbgjSH9RohbMjKs=
github.com/rs/zerolog v1.27.0/go.mod h1:7frBqO0oezxmnO7GF86FY++uy8I0Tk/If5ni1G9Qc0U=
56 changes: 35 additions & 21 deletions magefile.go
@@ -41,6 +41,7 @@ import (

"github.com/elastic/elastic-agent/pkg/testing/define"
"github.com/elastic/elastic-agent/pkg/testing/ess"
"github.com/elastic/elastic-agent/pkg/testing/ogc"
"github.com/elastic/elastic-agent/pkg/testing/runner"
bversion "github.com/elastic/elastic-agent/version"

@@ -1436,12 +1437,13 @@ func majorMinor() string {
func (Integration) Clean() error {
_ = os.RemoveAll(".agent-testing")

// Clean out .ogc-cache always
// Clean out .integration-cache/.ogc-cache always
defer os.RemoveAll(".integration-cache")
defer os.RemoveAll(".ogc-cache")

_, err := os.Stat(".ogc-cache")
_, err := os.Stat(".integration-cache")
if err == nil {
// .ogc-cache exists; need to run `Clean` from the runner
// .integration-cache exists; need to run `Clean` from the runner
r, err := createTestRunner(false, "", "")
if err != nil {
return fmt.Errorf("error creating test runner: %w", err)
@@ -1511,19 +1513,16 @@ func (Integration) Auth(ctx context.Context) error {

// Test runs integration tests on remote hosts
func (Integration) Test(ctx context.Context) error {
mg.CtxDeps(ctx, Integration.Clean)
return integRunner(ctx, false, "")
}

// Matrix runs integration tests on a matrix of all supported remote hosts
func (Integration) Matrix(ctx context.Context) error {
mg.CtxDeps(ctx, Integration.Clean)
return integRunner(ctx, true, "")
}

// Single runs single integration test on remote host
func (Integration) Single(ctx context.Context, testName string) error {
mg.CtxDeps(ctx, Integration.Clean)
return integRunner(ctx, false, testName)
}

@@ -1701,26 +1700,41 @@ func createTestRunner(matrix bool, singleTest string, goTestFlags string, batche
essRegion = "gcp-us-central1"
}
timestamp := timestampEnabled()
r, err := runner.NewRunner(runner.Config{

cfg := runner.Config{
AgentVersion: agentVersion,
AgentStackVersion: agentStackVersion,
BuildDir: agentBuildDir,
GOVersion: goVersion,
RepoDir: ".",
ESS: &runner.ESSConfig{
APIKey: essToken,
Region: essRegion,
},
GCE: &runner.GCEConfig{
ServiceTokenPath: serviceTokenPath,
Datacenter: datacenter,
},
Matrix: matrix,
SingleTest: singleTest,
VerboseMode: mg.Verbose(),
Timestamp: timestamp,
TestFlags: goTestFlags,
}, batches...)
Matrix: matrix,
SingleTest: singleTest,
VerboseMode: mg.Verbose(),
Timestamp: timestamp,
TestFlags: goTestFlags,
}
ogcCfg := ogc.Config{
ServiceTokenPath: serviceTokenPath,
Datacenter: datacenter,
}
ogcProvisioner, err := ogc.NewProvisioner(ogcCfg)
if err != nil {
return nil, err
}
email, err := ogcCfg.ClientEmail()
if err != nil {
return nil, err
}
essProvisioner, err := ess.NewProvisioner(ess.ProvisionerConfig{
Identifier: fmt.Sprintf("at-%s", strings.Replace(strings.Split(email, "@")[0], ".", "-", -1)),
APIKey: essToken,
Region: essRegion,
})
if err != nil {
return nil, err
}

r, err := runner.NewRunner(cfg, ogcProvisioner, essProvisioner, batches...)
if err != nil {
return nil, fmt.Errorf("failed to create runner: %w", err)
}
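
The `Identifier` passed to the ESS provisioner in the hunk above is derived from the service account email. A minimal sketch of that derivation follows; `stackIdentifier` is a hypothetical helper name and the email address is made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// stackIdentifier mirrors the expression used in createTestRunner: take the
// part of the email before the "@", replace dots with dashes, and prefix "at-".
// Hypothetical helper name; the magefile inlines this expression.
func stackIdentifier(email string) string {
	local := strings.Split(email, "@")[0]
	return fmt.Sprintf("at-%s", strings.ReplaceAll(local, ".", "-"))
}

func main() {
	// The address below is an illustrative value, not a real account.
	fmt.Println(stackIdentifier("jane.doe@example.com")) // prints "at-jane-doe"
}
```

`strings.ReplaceAll` is equivalent to the `strings.Replace(..., -1)` call in the diff.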
170 changes: 170 additions & 0 deletions pkg/testing/ess/provisioner.go
@@ -0,0 +1,170 @@
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package ess

import (
"context"
"errors"
"fmt"
"strings"
"time"

"golang.org/x/sync/errgroup"

"github.com/elastic/elastic-agent/pkg/testing/runner"
)

// ProvisionerConfig is the configuration for the ESS provisioner.
type ProvisionerConfig struct {
Identifier string
APIKey string
Region string
}

// Validate returns an error if the information is invalid.
func (c *ProvisionerConfig) Validate() error {
if c.Identifier == "" {
return errors.New("field Identifier must be set")
}
if c.APIKey == "" {
return errors.New("field APIKey must be set")
}
if c.Region == "" {
return errors.New("field Region must be set")
}
return nil
}

type provisioner struct {
logger runner.Logger
cfg ProvisionerConfig
client *Client
}

// NewProvisioner creates the ESS provisioner
func NewProvisioner(cfg ProvisionerConfig) (runner.StackProvisioner, error) {
err := cfg.Validate()
if err != nil {
return nil, err
}
essClient := NewClient(Config{
ApiKey: cfg.APIKey,
})
return &provisioner{
cfg: cfg,
client: essClient,
}, nil
}

func (p *provisioner) SetLogger(l runner.Logger) {
p.logger = l
}

func (p *provisioner) Provision(ctx context.Context, requests []runner.StackRequest) ([]runner.Stack, error) {
results := make(map[runner.StackRequest]*CreateDeploymentResponse)
for _, r := range requests {
// allow up to 2 minutes for each create request
createCtx, createCancel := context.WithTimeout(ctx, 2*time.Minute)
resp, err := p.createDeployment(createCtx, r)
createCancel()
if err != nil {
return nil, err
}
results[r] = resp
}

// wait 15 minutes for all stacks to be ready
readyCtx, readyCancel := context.WithTimeout(ctx, 15*time.Minute)
defer readyCancel()

g, gCtx := errgroup.WithContext(readyCtx)
for req, resp := range results {
g.Go(func(req runner.StackRequest, resp *CreateDeploymentResponse) func() error {
return func() error {
ready, err := p.client.DeploymentIsReady(gCtx, resp.ID, 30*time.Second)
if err != nil {
return fmt.Errorf("failed to check for cloud %s to be ready: %w", req.Version, err)
}
if !ready {
return fmt.Errorf("cloud %s never became ready", req.Version)
}
return nil
}
}(req, resp))
}
err := g.Wait()
if err != nil {
return nil, err
}

var stacks []runner.Stack
for req, resp := range results {
stacks = append(stacks, runner.Stack{
ID: req.ID,
Version: req.Version,
Elasticsearch: resp.ElasticsearchEndpoint,
Kibana: resp.KibanaEndpoint,
Username: resp.Username,
Password: resp.Password,
Internal: map[string]interface{}{
"deployment_id": resp.ID,
},
})
}
return stacks, nil
}

// Clean cleans up all provisioned resources.
func (p *provisioner) Clean(ctx context.Context, stacks []runner.Stack) error {
var errs []error
for _, s := range stacks {
err := p.destroyDeployment(ctx, s)
if err != nil {
errs = append(errs, fmt.Errorf("failed to destroy stack %s (%s): %w", s.Version, s.ID, err))
}
}
if len(errs) > 0 {
return errors.Join(errs...)
}
return nil
}

func (p *provisioner) createDeployment(ctx context.Context, r runner.StackRequest) (*CreateDeploymentResponse, error) {
ctx, cancel := context.WithTimeout(ctx, 1*time.Minute)
defer cancel()

p.logger.Logf("Creating stack %s (%s)", r.Version, r.ID)
name := fmt.Sprintf("%s-%s", strings.Replace(p.cfg.Identifier, ".", "-", -1), r.ID)
resp, err := p.client.CreateDeployment(ctx, CreateDeploymentRequest{
Name: name,
Region: p.cfg.Region,
Version: r.Version,
})
if err != nil {
p.logger.Logf("Failed to create ESS cloud %s: %s", r.Version, err)
return nil, fmt.Errorf("failed to create ESS cloud for version %s: %w", r.Version, err)
}
return resp, nil
}

func (p *provisioner) destroyDeployment(ctx context.Context, s runner.Stack) error {
if s.Internal == nil {
return fmt.Errorf("missing internal information")
}
deploymentIDRaw, ok := s.Internal["deployment_id"]
if !ok {
return fmt.Errorf("missing internal deployment_id")
}
deploymentID, ok := deploymentIDRaw.(string)
if !ok {
return fmt.Errorf("internal deployment_id not a string")
}

ctx, cancel := context.WithTimeout(ctx, 1*time.Minute)
defer cancel()

p.logger.Logf("Destroying stack %s (%s)", s.Version, s.ID)
return p.client.ShutdownDeployment(ctx, deploymentID)
}