Merge branch 'main' into f-consul-fp
shoenig authored Jun 3, 2021
2 parents b35fde4 + cfaf6a3 commit 7c6c23d
Showing 16 changed files with 189 additions and 15,668 deletions.
13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -4,11 +4,15 @@ IMPROVEMENTS:
* cli: Added success confirmation message for `nomad volume delete` and `nomad volume deregister`. [[GH-10591](https://github.com/hashicorp/nomad/issues/10591)]
* cli: Cross-namespace `nomad job` commands will now select exact matches if the selection is unambiguous. [[GH-10648](https://github.com/hashicorp/nomad/issues/10648)]
* client/fingerprint: Consul fingerprinter probes for additional enterprise and connect related attributes [[GH-10699](https://github.com/hashicorp/nomad/pull/10699)]
* csi: Validate that `volume` blocks for CSI volumes include the required `attachment_mode` and `access_mode` fields. [[GH-10651](https://github.com/hashicorp/nomad/issues/10651)]

BUG FIXES:
* api: Fixed event stream connection initialization when there are no events to send [[GH-10637](https://github.com/hashicorp/nomad/issues/10637)]
* cli: Fixed a bug where `quota status` and `namespace status` commands may panic if the CLI targets a pre-1.1.0 cluster [[GH-10620](https://github.com/hashicorp/nomad/pull/10620)]
* cli: Fixed a bug where `alloc exec` may fail with "unexpected EOF" without returning the exit code after a command [[GH-10657](https://github.com/hashicorp/nomad/issues/10657)]
* csi: Fixed a bug where `mount_options` were not passed to CSI controller plugins for validation during volume creation and mounting. [[GH-10643](https://github.com/hashicorp/nomad/issues/10643)]
* drivers/exec: Fixed a bug where `exec` and `java` tasks inherit the Nomad agent's `oom_score_adj` value [[GH-10698](https://github.com/hashicorp/nomad/issues/10698)]
* quotas (Enterprise): Fixed a bug where stopped allocations for a failed deployment can be double-credited to quota limits, resulting in a quota limit bypass. [[GH-10694](https://github.com/hashicorp/nomad/issues/10694)]
* ui: Fixed a bug where exec would not work across regions. [[GH-10539](https://github.com/hashicorp/nomad/issues/10539)]

## 1.1.0 (May 18, 2021)
@@ -92,6 +96,15 @@ BUG FIXES:
* server: Fixed a panic that may arise on submission of jobs containing invalid service checks [[GH-10154](https://github.com/hashicorp/nomad/issues/10154)]
* ui: Fixed the rendering of interstitial components shown after processing a dynamic application sizing recommendation. [[GH-10094](https://github.com/hashicorp/nomad/pull/10094)]

## 1.0.7 (Unreleased)

BUG FIXES:
* api: Fixed event stream connection initialization when there are no events to send [[GH-10637](https://github.com/hashicorp/nomad/issues/10637)]
* cli: Fixed a bug where `alloc exec` may fail with "unexpected EOF" without returning the exit code after a command [[GH-10657](https://github.com/hashicorp/nomad/issues/10657)]
* quotas (Enterprise): Fixed a bug where stopped allocations for a failed deployment can be double-credited to quota limits, resulting in a quota limit bypass. [[GH-10694](https://github.com/hashicorp/nomad/issues/10694)]
* drivers/exec: Fixed a bug where `exec` and `java` tasks inherit the Nomad agent's `oom_score_adj` value [[GH-10698](https://github.com/hashicorp/nomad/issues/10698)]
* ui: Fixed a bug where exec would not work across regions. [[GH-10539](https://github.com/hashicorp/nomad/issues/10539)]

## 1.0.6 (May 18, 2021)

BUG FIXES:
4 changes: 4 additions & 0 deletions drivers/shared/executor/executor_linux.go
@@ -764,6 +764,10 @@ func newLibcontainerConfig(command *ExecCommand) (*lconfigs.Config, error) {

	configureCapabilities(cfg, command)

	// children should not inherit Nomad agent oom_score_adj value
	oomScoreAdj := 0
	cfg.OomScoreAdj = &oomScoreAdj

	if err := configureIsolation(cfg, command); err != nil {
		return nil, err
	}
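For context on the value this hunk resets: on Linux a child process inherits its parent's `oom_score_adj`, so a task started by an agent running at `-100` would also start at `-100` unless the value is set explicitly. The sketch below only parses and range-checks such a value. The helper name `parseOomScoreAdj` is hypothetical and not part of Nomad, and the inputs are simulated file contents rather than reads from `/proc`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOomScoreAdj parses the contents of an oom_score_adj file such as
// /proc/self/oom_score_adj. Hypothetical helper, not part of Nomad.
func parseOomScoreAdj(raw []byte) (int, error) {
	v, err := strconv.Atoi(strings.TrimSpace(string(raw)))
	if err != nil {
		return 0, fmt.Errorf("parsing oom_score_adj: %w", err)
	}
	// The kernel accepts values from -1000 to 1000.
	if v < -1000 || v > 1000 {
		return 0, fmt.Errorf("oom_score_adj %d out of range [-1000, 1000]", v)
	}
	return v, nil
}

func main() {
	// Simulated file contents; on Linux these would come from
	// os.ReadFile("/proc/self/oom_score_adj").
	agent, _ := parseOomScoreAdj([]byte("-100\n"))
	task, _ := parseOomScoreAdj([]byte("0\n"))
	fmt.Printf("agent=%d task=%d\n", agent, task)
}
```

With the change in this hunk, the task side is expected to read `0` regardless of the agent's own setting.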
56 changes: 56 additions & 0 deletions drivers/shared/executor/executor_linux_test.go
@@ -465,6 +465,62 @@ func TestExecutor_EscapeContainer(t *testing.T) {
	require.NoError(err)
}

// TestExecutor_DoesNotInheritOomScoreAdj asserts that the exec processes do not
// inherit the oom_score_adj value of Nomad agent/executor process
func TestExecutor_DoesNotInheritOomScoreAdj(t *testing.T) {
	t.Parallel()
	testutil.ExecCompatible(t)

	oomPath := "/proc/self/oom_score_adj"
	origValue, err := os.ReadFile(oomPath)
	require.NoError(t, err, "reading oom_score_adj")

	err = os.WriteFile(oomPath, []byte("-100"), 0644)
	require.NoError(t, err, "setting temporary oom_score_adj")

	defer func() {
		err := os.WriteFile(oomPath, origValue, 0644)
		require.NoError(t, err, "restoring oom_score_adj")
	}()

	testExecCmd := testExecutorCommandWithChroot(t)
	execCmd, allocDir := testExecCmd.command, testExecCmd.allocDir
	defer allocDir.Destroy()

	execCmd.ResourceLimits = true
	execCmd.Cmd = "/bin/bash"
	execCmd.Args = []string{"-c", "cat /proc/self/oom_score_adj"}

	executor := NewExecutorWithIsolation(testlog.HCLogger(t))
	defer executor.Shutdown("SIGKILL", 0)

	_, err = executor.Launch(execCmd)
	require.NoError(t, err)

	ch := make(chan interface{})
	go func() {
		executor.Wait(context.Background())
		close(ch)
	}()

	select {
	case <-ch:
		// all good
	case <-time.After(5 * time.Second):
		require.Fail(t, "timeout waiting for exec to shutdown")
	}

	expected := "0"
	tu.WaitForResult(func() (bool, error) {
		output := strings.TrimSpace(testExecCmd.stdout.String())
		if output != expected {
			return false, fmt.Errorf("oom_score_adj didn't match: want\n%v\n; got:\n%v\n", expected, output)
		}
		return true, nil
	}, func(err error) { require.NoError(t, err) })

}

func TestExecutor_Capabilities(t *testing.T) {
	t.Parallel()
	testutil.ExecCompatible(t)
6 changes: 4 additions & 2 deletions nomad/job_endpoint_test.go
@@ -811,8 +811,10 @@ func TestJobEndpoint_Register_ACL(t *testing.T) {
 			ReadOnly: readonlyVolume,
 		},
 		"csi": {
-			Type:   structs.VolumeTypeCSI,
-			Source: "prod-db",
+			Type:           structs.VolumeTypeCSI,
+			Source:         "prod-db",
+			AttachmentMode: structs.CSIVolumeAttachmentModeBlockDevice,
+			AccessMode:     structs.CSIVolumeAccessModeSingleNodeWriter,
 		},
 	}

2 changes: 2 additions & 0 deletions nomad/structs/structs_test.go
@@ -1127,6 +1127,8 @@ func TestTaskGroup_Validate(t *testing.T) {
	err = tg.Validate(&Job{})
	require.Contains(t, err.Error(), `volume has an empty source`)
	require.Contains(t, err.Error(), `volume cannot be per_alloc when canaries are in use`)
	require.Contains(t, err.Error(), `CSI volumes must have an attachment mode`)
	require.Contains(t, err.Error(), `CSI volumes must have an access mode`)

	tg = &TaskGroup{
		Volumes: map[string]*VolumeRequest{
12 changes: 12 additions & 0 deletions nomad/structs/volumes.go
@@ -117,6 +117,18 @@ func (v *VolumeRequest) Validate(canaries int) error {
		mErr.Errors = append(mErr.Errors,
			fmt.Errorf("host volumes cannot have an access mode"))
	}
	if v.Type == VolumeTypeHost && v.MountOptions != nil {
		mErr.Errors = append(mErr.Errors,
			fmt.Errorf("host volumes cannot have mount options"))
	}
	if v.Type == VolumeTypeCSI && v.AttachmentMode == CSIVolumeAttachmentModeUnknown {
		mErr.Errors = append(mErr.Errors,
			fmt.Errorf("CSI volumes must have an attachment mode"))
	}
	if v.Type == VolumeTypeCSI && v.AccessMode == CSIVolumeAccessModeUnknown {
		mErr.Errors = append(mErr.Errors,
			fmt.Errorf("CSI volumes must have an access mode"))
	}

	if v.AccessMode == CSIVolumeAccessModeSingleNodeReader || v.AccessMode == CSIVolumeAccessModeMultiNodeReader {
		if !v.ReadOnly {
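
The two CSI checks added in this hunk can be tried in isolation with a reduced model. The sketch below is not Nomad's code: `volumeRequest` and `validateModes` are hypothetical stand-ins, plain strings replace the typed mode constants, and an empty string models the `Unknown` value. Only the error messages are taken from the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// volumeRequest is a hypothetical, trimmed-down stand-in for Nomad's
// structs.VolumeRequest; only the fields these checks read are kept.
type volumeRequest struct {
	Type           string // "host" or "csi"
	AttachmentMode string // "" models CSIVolumeAttachmentModeUnknown
	AccessMode     string // "" models CSIVolumeAccessModeUnknown
}

// validateModes mirrors the two CSI checks from the diff: a CSI volume
// request must name both an attachment mode and an access mode.
func validateModes(v volumeRequest) []error {
	var errs []error
	if v.Type == "csi" && v.AttachmentMode == "" {
		errs = append(errs, errors.New("CSI volumes must have an attachment mode"))
	}
	if v.Type == "csi" && v.AccessMode == "" {
		errs = append(errs, errors.New("CSI volumes must have an access mode"))
	}
	return errs
}

func main() {
	// Both modes missing: both errors are collected, not just the first.
	for _, err := range validateModes(volumeRequest{Type: "csi"}) {
		fmt.Println(err)
	}
	// Mode strings here are illustrative values only.
	ok := volumeRequest{Type: "csi", AttachmentMode: "block-device", AccessMode: "single-node-writer"}
	fmt.Println("errors for complete request:", len(validateModes(ok)))
}
```

Collecting every failure before returning matches the `mErr.Errors = append(...)` style in the hunk, which is why the test in `structs_test.go` can assert on several messages from one `Validate` call.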
68 changes: 68 additions & 0 deletions scheduler/reconcile_test.go
@@ -578,6 +578,74 @@ func TestReconciler_Inplace_ScaleDown(t *testing.T) {
	assertNamesHaveIndexes(t, intRange(5, 9), stopResultsToNames(r.stop))
}

// TestReconciler_Inplace_Rollback tests that a rollback to a previous version
// generates the expected placements for any already-running allocations of
// that version.
func TestReconciler_Inplace_Rollback(t *testing.T) {
	job := mock.Job()
	job.TaskGroups[0].Count = 4
	job.TaskGroups[0].ReschedulePolicy = &structs.ReschedulePolicy{
		DelayFunction: "exponential",
		Interval:      time.Second * 30,
		Delay:         time.Hour * 1,
		Attempts:      3,
		Unlimited:     true,
	}

	// Create 3 existing allocations
	var allocs []*structs.Allocation
	for i := 0; i < 3; i++ {
		alloc := mock.Alloc()
		alloc.Job = job
		alloc.JobID = job.ID
		alloc.NodeID = uuid.Generate()
		alloc.Name = structs.AllocName(job.ID, job.TaskGroups[0].Name, uint(i))
		allocs = append(allocs, alloc)
	}
	// allocs[0] is an allocation from version 0
	allocs[0].ClientStatus = structs.AllocClientStatusRunning

	// allocs[1] and allocs[2] are failed allocations for version 1 with
	// different rescheduling states
	allocs[1].ClientStatus = structs.AllocClientStatusFailed
	allocs[1].TaskStates = map[string]*structs.TaskState{
		"web": &structs.TaskState{FinishedAt: time.Now().Add(-10 * time.Minute)}}
	allocs[2].ClientStatus = structs.AllocClientStatusFailed

	// job is rolled back, we expect allocs[0] to be updated in-place
	allocUpdateFn := allocUpdateFnMock(map[string]allocUpdateType{
		allocs[0].ID: allocUpdateFnInplace,
	}, allocUpdateFnDestructive)

	reconciler := NewAllocReconciler(testlog.HCLogger(t), allocUpdateFn,
		false, job.ID, job, nil, allocs, nil, uuid.Generate())
	r := reconciler.Compute()

	// Assert the correct results
	assertResults(t, r, &resultExpectation{
		createDeployment:  nil,
		deploymentUpdates: nil,
		place:             2,
		inplace:           1,
		stop:              1,
		destructive:       1,
		attributeUpdates:  1,
		desiredTGUpdates: map[string]*structs.DesiredUpdates{
			job.TaskGroups[0].Name: {
				Place:             2,
				Stop:              1,
				InPlaceUpdate:     1,
				DestructiveUpdate: 1,
			},
		},
	})

	assert.Len(t, r.desiredFollowupEvals, 1, "expected 1 follow-up eval")
	assertNamesHaveIndexes(t, intRange(0, 0), allocsToNames(r.inplaceUpdate))
	assertNamesHaveIndexes(t, intRange(2, 2), stopResultsToNames(r.stop))
	assertNamesHaveIndexes(t, intRange(2, 3), placeResultsToNames(r.place))
}

// Tests the reconciler properly handles destructive upgrading allocations
func TestReconciler_Destructive(t *testing.T) {
	job := mock.Job()
2 changes: 1 addition & 1 deletion website/.env
@@ -1,3 +1,3 @@
 NEXT_PUBLIC_ALGOLIA_APP_ID=YY0FFNI7MF
 NEXT_PUBLIC_ALGOLIA_INDEX=product_NOMAD
-NEXT_PUBLIC_ALGOLIA_SEARCH_ONLY_API_KEY=5037da4824714676226913c65e961ca0
+NEXT_PUBLIC_ALGOLIA_SEARCH_ONLY_API_KEY=9bfec34ea54e56a11bd50d6bfedc5e71
1 change: 1 addition & 0 deletions website/content/docs/commands/volume/snapshot-create.mdx
@@ -54,3 +54,4 @@ Completed snapshot of volume ebs_prod_db1 with snapshot ID snap-12345.
[csi]: https://github.com/container-storage-interface/spec
[csi_plugin]: /docs/job-specification/csi_plugin
[registered]: /docs/commands/volume/register
[csi_plugins_internals]: /docs/internals/plugins/csi#csi-plugins
1 change: 1 addition & 0 deletions website/content/docs/commands/volume/snapshot-delete.mdx
@@ -41,3 +41,4 @@ Deleted snapshot snap-12345.
[csi]: https://github.com/container-storage-interface/spec
[csi_plugin]: /docs/job-specification/csi_plugin
[registered]: /docs/commands/volume/register
[csi_plugins_internals]: /docs/internals/plugins/csi#csi-plugins
1 change: 1 addition & 0 deletions website/content/docs/commands/volume/snapshot-list.mdx
@@ -53,3 +53,4 @@ snap-67890 vol-fedcba 50GiB 2021-01-04T15:45:00Z true
[csi]: https://github.com/container-storage-interface/spec
[csi_plugin]: /docs/job-specification/csi_plugin
[registered]: /docs/commands/volume/register
[csi_plugins_internals]: /docs/internals/plugins/csi#csi-plugins
7 changes: 4 additions & 3 deletions website/content/docs/job-specification/service.mdx
@@ -229,9 +229,10 @@ scripts.
 - `grpc_use_tls` `(bool: false)` - Use TLS to perform a gRPC health check. May
   be used with `tls_skip_verify` to use TLS but skip certificate verification.
 
-- `initial_status` `(string: <enum>)` - Specifies the originating status of the
-  service. Valid options are the empty string, `passing`, `warning`, and
-  `critical`.
+- `initial_status` `(string: <enum>)` - Specifies the starting status of the
+  service. Valid options are `passing`, `warning`, and `critical`. Omitting
+  this field (or submitting an empty string) will result in the Consul default
+  behavior, which is `critical`.
 
 - `success_before_passing` `(int:0)` - The number of consecutive successful checks
   required before Consul will transition the service status to [`passing`][consul_passfail].
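
The revised `initial_status` wording amounts to a small mapping: three accepted values, plus an empty value that ends up as `critical` because Consul applies its default. The sketch below models only that documented outcome. The function `effectiveInitialStatus` is hypothetical; it is not how Nomad or Consul implement the behavior, and rejecting unknown strings is an assumption made for the example.

```go
package main

import "fmt"

// effectiveInitialStatus models the documented outcome of the initial_status
// field. Hypothetical helper, not part of Nomad or Consul.
func effectiveInitialStatus(s string) (string, error) {
	switch s {
	case "":
		// Omitted or empty: the documentation states Consul defaults to critical.
		return "critical", nil
	case "passing", "warning", "critical":
		return s, nil
	default:
		// Assumption for this sketch: anything else is treated as invalid.
		return "", fmt.Errorf("invalid initial_status %q", s)
	}
}

func main() {
	for _, in := range []string{"", "passing", "warning", "critical"} {
		out, _ := effectiveInitialStatus(in)
		fmt.Printf("%q -> %s\n", in, out)
	}
}
```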