Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update CockroachDB to v21.2.9 #988

Merged
merged 2 commits into from
Apr 28, 2022
Merged

update CockroachDB to v21.2.9 #988

merged 2 commits into from
Apr 28, 2022

Conversation

davepacheco
Copy link
Collaborator

This change updates Omicron from CockroachDB v21.1.10 to v21.2.9. Fixes #980, which should unblock #955. There are a few pieces:

  • Changing the CockroachDB version and SHAs
  • As @bnaecker mentioned on #980, CockroachDB now includes the default database name in the listen URL that it reports to us. This PR relaxes the code that parses this to be fine with that as long as the database name is "defaultdb". (The only reason this code is so paranoid in the first place is that I didn't want to accidentally throw away something important that was being provided in the listening URL.)
  • I also changed the check in has_omicron_schema as mentioned here. From @mgartner's reply, I gather that we should always expect the undefined database error for this case now, and that does appear to be true.

Is there anything else we need to do before updating CockroachDB used by Omicron?

Then there's this last bit. In #980 @bnaecker mentioned:

Another possible oddity. I've noticed that the /tmp directory fills up extremely quickly. Like within two runs of the full test suite, whereas I'm not sure I'd ever hit an out-of-space error before. Something about the on-disk format may have changed that warrants further investigation.

This makes me a little nervous. I did some cursory digging, didn't find an issue, and I'm still thinking about how best to go ahead with this.

First of all, I think /tmp here is probably specific to @bnaecker's system (and probably the default behavior). I don't see any omicron-related files in there at all on my system. That's probably because I set TMPDIR=/dangerzone/omicron_tmp in my environment.

As far as I know, what goes in there during a clean cargo test is:

  • whatever Cargo, rustc, ld, and other build tools put into temporary directories (which is unchanged by this change)
  • log files from tests in progress (unchanged by this change)
  • CockroachDB databases, copied from the seed in "target/debug" (could be changed by this change)

On success, I believe these should all be cleaned up -- there should normally be nothing left over. We've definitely had bugs where things forgot to clean up. On failures, this is all left around.

So I looked at the seed directory. After this change, mine is 2.71 MiB:

$ find target/debug -name crdb-base | xargs du -sh
2.71M	target/debug/build/nexus-test-utils-b8b82f169af47163/out/crdb-base

In other workspaces I have lying around (prior to this change), it's usually under 1 MiB. But that's small enough that I can't imagine it's filling up your /tmp.

Another possibility is that CockroachDB itself is using more memory (which comes from the same pool as /tmp, and so I believe could shrink the total size of /tmp) -- but again, that should get cleaned up after a successful run.

My inclination is to proceed with this change provided it passes CI, then separately check that we're cleaning up $TMPDIR properly after test runs. @bnaecker, I'm also happy to help dig into the /tmp problem when you have time. Let me know if you think it should block this, though.

@davepacheco davepacheco requested a review from smklein April 28, 2022 19:15
Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PSA on the potential /tmp issue - we can keep an eye on it as we roll forward, and see if other devs / actions notice failures.

@davepacheco davepacheco enabled auto-merge (squash) April 28, 2022 20:19
@davepacheco davepacheco merged commit bfd825d into main Apr 28, 2022
@davepacheco davepacheco deleted the cockroachdb-upgrade branch April 28, 2022 22:25
@bnaecker
Copy link
Collaborator

bnaecker commented Apr 28, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

apparent cockroachdb data corruption due to cockroachdb issue 74475
3 participants