roachtest: use c5, not c5d, for restore 8tb test #98767
Conversation
This is a WIP because the behavior when machine types with local SSD are used is unclear. For example, on AWS, roachtest prefers the c5d family, which all come with local SSD storage. But looking into `awsStartupScriptTemplate`, it is unclear how to make sure that the EBS disk(s) get mounted as /mnt/data1 (which is probably what the default should be).

We could also entertain straight-up preventing combinations that would lead to an inhomogeneous RAID0. I imagine we'd have to take a round of failures to find all of the places in which it happens, but perhaps a "snitch" can be inserted instead so that we can detect all such callers and fix them up before arming the check.

By the way, EBS disks on AWS come with a default of 125 MB/s, which is less than this RAID0 gets "most of the time", so we can expect some tests to behave differently after this change. I still believe this is worth it: debugging is so much harder when you're on top of storage that's hard to predict and doesn't resemble any production deployment.

----

I wasted weeks of my life on this before, and it almost happened again!

When you run a roachtest that asks for an AWS cXd machine (i.e. compute optimized with NVMe local disk) and you also specify a VolumeSize, you additionally get an EBS volume. Prior to this commit, the two would be RAID0'ed together. This isn't sane: the resulting gp3 EBS volume is very different from the local NVMe volume in every way, and it led to hard-to-understand write throughput behavior.

This commit defaults to *not* using RAID0.

Touches cockroachdb#98767.
Touches cockroachdb#98576.
Touches cockroachdb#97019.

Epic: none
Release note: None
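For anyone who wants to check whether an existing cluster is sitting on one of these inhomogeneous RAID0s, here is a minimal sketch using only generic Linux tooling (findmnt, /proc/mdstat, lsblk), not anything from roachprod itself; it assumes the data directory is mounted at /mnt/data1, and the device names in the comments are illustrative.

```
# Sketch: check whether /mnt/data1 is backed by a software RAID (md) device,
# and if so, list its members so a mix of local NVMe instance storage and
# EBS volumes is easy to spot. Device names below are examples.
dev=$(findmnt -n -o SOURCE /mnt/data1)   # e.g. /dev/md0 or /dev/nvme1n1
echo "backing device: ${dev}"

if [[ "${dev}" == /dev/md* ]]; then
  cat /proc/mdstat                       # shows the md array and its member devices
  # EBS volumes report the model "Amazon Elastic Block Store"; local
  # instance-store NVMe reports an "Instance Storage" model string.
  lsblk -o NAME,MODEL,SIZE,TYPE
else
  echo "/mnt/data1 is not on an md RAID"
fi
```

On a machine affected by the mix described above, the lsblk output would show both an Elastic Block Store device and an instance-store device underneath the same md array.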
Huh, given this finding, do you think we should recommend that customers running on AWS with EBS and more than a couple of TBs of data use more than 8 vCPUs per node?
I'm not sure what to recommend yet, since the test was also running on a zombie RAID0 that striped over EBS and local NVMe; see #98782.
Force-pushed from 13ca12f to e8d7477.
This works around cockroachdb#98783:

```
Instance type c5.2xlarge
```

Now the roachtest runs on standard EBS volumes (provisioned to 125 MB/s, i.e. pretty weak ones):

```
$ df -h /mnt/data1/
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme1n1    2.0T  4.0G  2.0T   1% /mnt/data1

$ sudo nvme list | grep nvme1n1
/dev/nvme1n1   vol065ed9110066bb362   Amazon Elastic Block Store   1   2.15 TB / 2.15 TB   512 B + 0 B   1.0
```

Let's see how this fares. The theory is that the test previously failed due to RAID0 because some nodes would unpredictably be slower than others (depending on the striping, etc., across the raided inhomogeneous volumes), which we don't handle well. Now there's symmetry, and hopefully things will be slower (since we only have 125 MB/s per volume now) but functional, i.e. no more OOMs.

I verified this via

```
./pkg/cmd/roachtest/roachstress.sh -c 10 restore/tpce/8TB/aws/nodes=10/cpus=8 -- --cloud aws --parallelism 1
```

Closes #97019.

Epic: CRDB-25503
Release note: None
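As a side note on that 125 MB/s figure: gp3 volumes default to 125 MiB/s of provisioned throughput, and the AWS CLI can confirm (or raise) what a given volume is provisioned for. A rough sketch, assuming the volume ID from the `nvme list` output above (the NVMe serial is the EBS volume ID with the hyphen dropped); the throughput number in the second command is an example, not a value from this test.

```
# Inspect provisioned IOPS/throughput for the gp3 volume backing /mnt/data1.
aws ec2 describe-volumes \
  --volume-ids vol-065ed9110066bb362 \
  --query 'Volumes[].{Type:VolumeType,Iops:Iops,ThroughputMiBps:Throughput}'

# For an experiment, throughput can be raised in place (gp3 defaults to 125 MiB/s).
aws ec2 modify-volume --volume-id vol-065ed9110066bb362 --throughput 250
```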
Force-pushed from e8d7477 to abd6420.
Updated this PR @msbutler; it's different from what it was before (the machine type bump wasn't actually changing anything, since the problem was RAID0 and not VM-to-EBS bandwidth). I'm happy to merge this if the DR team would like me to. It could take a little while until @srosenberg has the proper fix in (i.e. #98783 is closed), but then again the test doesn't fail too often, so you might just want to sit it out.