Skip to content

Commit

Permalink
Merge #40997 #41029
Browse files Browse the repository at this point in the history
40997: roachtest: deflake bank/{node-restart,cluster-recovery} r=irfansharif a=irfansharif

Fixes #38785.
Fixes #35326.

Because everything roachprod does, it does through SSH, we're
particularly susceptible to network delays, packet drops, etc. We've
seen this before, or at least pointed to this being a problem before,
over at #37001. Setting timeouts around our calls to roachprod helps to
better surface these kind of errors.

The underlying issue in #38785 and in #35326 is the fact that we're
running roachprod commands that may (reasonably) fail due to connection
issues, and we're unable to retry them safely (the underlying commands
are non-idempotent). Presently we simply fail the entire test, when
really we should be able to retry the commands. This is left unaddressed.

Release justification: Category 1: Non-production code changes
Release note: None

41029: cli: fix the demo licensing code r=rohany a=knz

Fixes #40734.
Fixes #41024.

Release justification: fixes a flaky test, fixes UX of main new feature

Before this patch, there were multiple problems with the code:

- if the license acquisition was disabled by the env var config,
  the error message would not be clear.
- the licensing code would deadlock silently on OSS-only
  builds (because the license failure channel was not written in that
  control branch).
- the error/warning messages would be interleaved on the same line as
  the input line (missing newline at start of message).
- the test code would fail when the license server is not available.
- the set up of the example database and workload would be performed
  asynchronously, with unclear signalling of when the user
  can expect to use them interactively.

After this patch:
- it's possible to override the license acquisition URL with
  COCKROACH_DEMO_LICENSE_URL, this is used in tests.
- setting up the example database, partitioning and workload is done
  before presenting the interactive prompt.
- partitioning the example database, if requested by
  --geo-partitioned-replicas, waits for license acquisition to
  complete (license acquisition remains asynchronous otherwise).
- impossible configurations are reported early(earlier).

For example:

- OSS-only builds:

```
kena@kenax ~/cockroach % ./cockroach demo --geo-partitioned-replicas
*
* ERROR: enterprise features are required for this demo, cannot run from OSS-only binary
*
Failed running "demo"
```

For license acquisition failures:

```
kena@kenax ~/cockroach % ./cockroach demo --geo-partitioned-replicas
error while contacting licensing server:
Get https://192.168.2.170/api/license?clusterid=5548b310-14b7-46de-8c92-30605bfe95c4&kind=demo&version=v19.2: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
*
* ERROR: license acquisition was unsuccessful.
* Note: enterprise features are needed for --geo-partitioned-replicas.
*
Failed running "demo"
```

Additionally, this change fixes test flakiness that arises from an
unavailable license server.

Release note (cli change): To enable uses of `cockroach demo` with
enterprise features in firewalled network environments, it is now
possible to redirect the license acquisition with the environment
variable COCKROACH_DEMO_LICENSE_URL to a replacement server (for
example a suitably configured HTTP proxy).

Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
  • Loading branch information
3 people committed Sep 24, 2019
3 parents 5a73aa5 + 5296a27 + 66b9550 commit 4242276
Show file tree
Hide file tree
Showing 8 changed files with 235 additions and 154 deletions.
8 changes: 6 additions & 2 deletions pkg/ccl/cliccl/demo.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,13 +17,17 @@ import (

"github.com/cockroachdb/cockroach/pkg/build"
"github.com/cockroachdb/cockroach/pkg/cli"
"github.com/cockroachdb/cockroach/pkg/util/envutil"
"github.com/cockroachdb/cockroach/pkg/util/log"
"github.com/cockroachdb/cockroach/pkg/util/uuid"
"github.com/cockroachdb/errors"
)

// This URL grants a license that is valid for 24 hours.
const licenseURL = "https://register.cockroachdb.com/api/license"
const licenseDefaultURL = "https://register.cockroachdb.com/api/license"

// We make licenseURL configurable for use in tests.
var licenseURL = envutil.EnvOrDefaultString("COCKROACH_DEMO_LICENSE_URL", licenseDefaultURL)

func getLicense(clusterID uuid.UUID) (string, error) {
client := &http.Client{
Expand Down Expand Up @@ -60,7 +64,7 @@ func getLicense(clusterID uuid.UUID) (string, error) {
func getAndApplyLicense(db *gosql.DB, clusterID uuid.UUID, org string) (bool, error) {
license, err := getLicense(clusterID)
if err != nil {
fmt.Fprintf(log.OrigStderr, "error when contacting licensing server: %+v\n", err)
fmt.Fprintf(log.OrigStderr, "\nerror while contacting licensing server:\n%+v\n", err)
return false, nil
}
if _, err := db.Exec(`SET CLUSTER SETTING cluster.organization = $1`, org); err != nil {
Expand Down
14 changes: 9 additions & 5 deletions pkg/cli/context.go
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,8 @@ func initCLIDefaults() {
demoCtx.runWorkload = false
demoCtx.localities = nil
demoCtx.geoPartitionedReplicas = false
demoCtx.disableTelemetry = false
demoCtx.disableLicenseAcquisition = false

initPreFlagsDefaults()

Expand Down Expand Up @@ -343,9 +345,11 @@ var sqlfmtCtx struct {
// demoCtx captures the command-line parameters of the `demo` command.
// Defaults set by InitCLIDefaults() above.
var demoCtx struct {
nodes int
useEmptyDatabase bool
runWorkload bool
localities demoLocalityList
geoPartitionedReplicas bool
nodes int
disableTelemetry bool
disableLicenseAcquisition bool
useEmptyDatabase bool
runWorkload bool
localities demoLocalityList
geoPartitionedReplicas bool
}
Loading

0 comments on commit 4242276

Please sign in to comment.