Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: GCE/AWS nightly and release roachtest for FIPS #99224

Merged
merged 1 commit into from
Mar 30, 2023

Conversation

rail
Copy link
Member

@rail rail commented Mar 22, 2023

Previously, roachtest nightly tests were limited to run on regular VMs. In order to check FIPS functionality, we need to run the tests on FIPS-enabled VMs.

This PR adds a flag in order to select FIPS-enabled GCE or AWS VMs.

  • Added --fips flag.
  • Added wrapper scripts to selectively run FIPS-enabled tests.
  • Updated the AWS configs to have an alternative AMIs for FIPS.

Epic: DEVINF-478
Release note: None

@rail rail added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-testing Testing tools and infrastructure T-dev-inf labels Mar 22, 2023
@rail rail requested review from a team as code owners March 22, 2023 14:26
@rail rail self-assigned this Mar 22, 2023
@rail rail requested review from herkolategan and smg260 and removed request for a team March 22, 2023 14:26
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rail rail requested review from healthy-pod and renatolabs March 23, 2023 18:20
@rail rail added the backport-23.1.x Flags PRs that need to be backported to 23.1 label Mar 27, 2023
Copy link
Contributor

@smg260 smg260 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 18 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @healthy-pod, @herkolategan, @rail, and @renatolabs)


pkg/cmd/roachtest/spec/cluster_spec.go line 177 at r1 (raw file):

	}
	opts.TerminateOnMigration = terminateOnMigration
	if enableFIPS {

For aws, we set the image in aws.go. Could this be moved to gcloud.go for consistency?

Copy link
Member Author

@rail rail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @healthy-pod, @herkolategan, @renatolabs, and @smg260)


pkg/cmd/roachtest/spec/cluster_spec.go line 177 at r1 (raw file):

Previously, smg260 (Miral Gadani) wrote…

For aws, we set the image in aws.go. Could this be moved to gcloud.go for consistency?

Good point. Let me move it there.
Maybe we could add a constructor, similar toDefaultProviderOpts() that accepts options to make it clearer. 🤔

Copy link
Member Author

@rail rail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @healthy-pod, @herkolategan, @renatolabs, and @smg260)


pkg/cmd/roachtest/spec/cluster_spec.go line 177 at r1 (raw file):

Previously, rail (Rail Aliiev) wrote…

Good point. Let me move it there.
Maybe we could add a constructor, similar toDefaultProviderOpts() that accepts options to make it clearer. 🤔

Done

Copy link
Contributor

@smg260 smg260 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Minor comments

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @healthy-pod, @herkolategan, @rail, and @renatolabs)


pkg/roachprod/vm/gce/gcloud.go line 698 at r2 (raw file):

	imageProject := defaultImageProject
	if opts.EnableFIPS {
		// NB: if FISP is enabled, it overrides the image passed via CLI (--gce-image)

nit: typo s/FISP/FIPS

I would also repeat the comment in roachtest/main.go where the flag is defined.

Copy link
Member Author

@rail rail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @healthy-pod, @herkolategan, @renatolabs, and @smg260)


pkg/roachprod/vm/gce/gcloud.go line 698 at r2 (raw file):

Previously, smg260 (Miral Gadani) wrote…

nit: typo s/FISP/FIPS

I would also repeat the comment in roachtest/main.go where the flag is defined.

Doh.
I updated the help message.

smg260

This comment was marked as outdated.

Copy link
Contributor

@smg260 smg260 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @healthy-pod, @herkolategan, and @renatolabs)

@rail
Copy link
Member Author

rail commented Mar 29, 2023

TFTR!

pkg/cmd/roachtest/spec/cluster_spec.go Outdated Show resolved Hide resolved
pkg/roachprod/vm/gce/utils.go Show resolved Hide resolved
@rail rail force-pushed the fips-nightly branch 2 times, most recently from 2fe8358 to c9485e8 Compare March 30, 2023 18:40
Previously, roachtest nightly tests were limited to run on regular VMs.
In order to check FIPS functionality, we need to run the tests on
FIPS-enabled VMs.

This PR adds a flag in order to select FIPS-enabled GCE or AWS VMs.

* Added `--fips` flag.
* Added wrapper scripts to selectively run FIPS-enabled tests.
* Updated the AWS configs to have an alternative AMIs for FIPS.

Epic: DEVINF-478
Release note: None
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 25, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 30, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 31, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 31, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 1, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 1, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 1, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
craig bot pushed a commit that referenced this pull request Jun 2, 2023
103710: roachtest: metamorphic ARM64 and FIPS clusters r=smg260,herkolategan a=srosenberg

Previously, all roachtests used (cloud) machine types with the AMD64 (cpu) architecture. Recently [1], new CI infrastructure was added to run a clone of all the nightly roachtests, configured with FIPS; i.e., same AMD64 machine types, different AMI and crdb binary, patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled via the two CLI args: `metamorphic-arm64-probability` and `metamorphic-fips-probability`. The former denotes the probability (over the uniform distribution) of a new cluster provisioned using ARM64 VMs. The latter denotes the probability of a new AMD64 cluster provisioned with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1. E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64 clusters are chosen ~40% of the time, whereas of the remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both 
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters 
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which 
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning for AWS in roachprod. We add ARM64 provisioning for GCE, i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter isn't formally a CPU architecture; however, it simplifies provisioning and binary staging.
We also modify roachprod.List to display CPU architecture, other than AMD64, with the machine type; this should make it easier to see which clusters are running ARM64 and FIPS configurations, as we ramp up their testing.

Epic: none
Release note: None

Resolves: #94957
Informs: #94986

[1] #99224
[2] #103243

Co-authored-by: Stan Rosenberg <[email protected]>
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 9, 2023
As of [1], terraform is used in conjunction with roachprod in AWS.
In addition to defining VPCs (and their pair-wise peerings), the generated TF scripts
assign `ami_id` for every region. This information is made available to roachprod
via config.json, generated from TF and embedded into roachprod binary.

Recently, as of [2], `ami_id_fips` was added. This PR adds `ami_id_arm64` in
preparation for arm64-based nightly roachtests. Furthermore, TF is refactored
to make it compatible with versions 0.14 and onwards. Both AWS resources and
the resulting config.json were generated using TF 1.4.6 (latest at this time).

[1] cockroachdb#36418
[2] cockroachdb#99224

Epic: none
Release note: None
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 9, 2023
As of [1], terraform is used in conjunction with roachprod in AWS.
In addition to defining VPCs (and their pair-wise peerings), the generated TF scripts
assign `ami_id` for every region. This information is made available to roachprod
via config.json, generated from TF and embedded into roachprod binary.

Recently, as of [2], `ami_id_fips` was added. This PR adds `ami_id_arm64` in
preparation for arm64-based nightly roachtests. Furthermore, TF is refactored
to make it compatible with versions 0.14 and onwards. Both AWS resources and
the resulting config.json were generated using TF 1.4.6 (latest at this time).

[1] cockroachdb#36418
[2] cockroachdb#99224

Epic: none
Release note: None
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 10, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 13, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testing Testing tools and infrastructure backport-23.1.x Flags PRs that need to be backported to 23.1 C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-dev-inf
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants