Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: Use CPut when writing to system.descriptor table #88844

Conversation

Xiang-Gu
Copy link
Contributor

@Xiang-Gu Xiang-Gu commented Sep 27, 2022

Each individual commit has a rather thorough message, so I encourage
reviewers to read them as well.

Commit 1: code pattern, non-functional changes
Commit 2: Added a []byte field in descriptor builder
and immutable struct. We also modified existing functions
to carry over this field between descriptor and Mutable/immutable
descriptor, and vice versa.

Commit 3: Upgrade WriteDescToBatch to use CPut rather than
Put to prevent clobbering of the system.descriptor table.
We get the expected value for the CPut when first reading the
descriptor into collection from storage, as well as keep booking
the descriptor's latest change inside the uncommitted descriptor
set, in case more we write to the same descriptor more than once
in the same transaction.

I didn't add any tests in this PR since this changed function
WriteDescToBatch is heavily invoked for many sql stmts,
so I think we should be pretty confident about its correctness
if it passes all existing tests (unit, logic, and roachtest).

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@@ -49,7 +49,7 @@ import (
// Mutable is a custom type for TableDescriptors
// going through schema mutations.
type Mutable struct {
wrapper
immutable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change in this whole commit should, in my opinion. go completely in the other way. We shouldn't name things as immutable when, in fact, they're not immutable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! For now, I've reverted this commit.

While I'm on this, I think I can have a PR that clears this part of code a bit, maybe after your #88786 is merged where we use the following pattern for all types of descriptors

type wrapper struct {
    descpb.(Table|Schema|Database|Type|Function)Descriptor
    other_fields
}

type immutable struct {
   wrapper
   // no other fields should be here!
}

type Mutable struct {
  wrapper
  original *immutable
}

And, wrapper will be the base implementation for catalog.Descriptor, which means we should only have functions whose receivers are *wrapper and *Mutable but not *immutable.

@@ -88,6 +88,8 @@ type DescriptorBuilder interface {
// upgrade migrations
RunRestoreChanges(descLookupFn func(id descpb.ID) Descriptor) error

SetRawBytesInDescriptor(rawBytes []byte)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pass the raw bytes to the DescriptorBuilder constructors instead? Either you create the builder from another descriptor and that byte slice is available there, or you create it from a protobuf and the bytes are usually not far. In any case, we have to pass them along.

I think this PR might benefit from being rebased on #88786 which tightens how these builders can be created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difficulty would be when we upsert a descriptor to the uncommitted descriptor set, we will need to update this rawBytesInStorage field, instead of passing it along. This is the main reason I choose to have a setter on this field.

pkg/sql/catalog/descs/uncommitted_descriptors.go Outdated Show resolved Hide resolved
@Xiang-Gu Xiang-Gu force-pushed the use_cput_when_writing_system_descriptor_table branch 4 times, most recently from a7a01d0 to df0d869 Compare September 29, 2022 17:05
This will be needed when I later added a field to the function
builder which we wish to be able to set it.
@Xiang-Gu Xiang-Gu force-pushed the use_cput_when_writing_system_descriptor_table branch from df0d869 to 77932b0 Compare October 3, 2022 02:21
@Xiang-Gu
Copy link
Contributor Author

Xiang-Gu commented Oct 3, 2022

@postamar I rebased and polished a bit more on your recent merge. I fixed a few test failures discovered from CI -- they're because I forgot to properly initialize this field for unsafe_upsert_descriptor.

I think this is RFAL :)

@Xiang-Gu Xiang-Gu marked this pull request as ready for review October 3, 2022 02:27
@Xiang-Gu Xiang-Gu requested a review from a team October 3, 2022 02:27
@Xiang-Gu Xiang-Gu force-pushed the use_cput_when_writing_system_descriptor_table branch from 77932b0 to 8f1bea1 Compare October 3, 2022 13:19
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM conditional on addressing the minor comments. Nicely done.

return nil
}
return desc.rawBytesInStorage
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider deep-copying the byte slice, either here or in the builder. Notice that the builder already deep-copies the protobuf in almost all cases except for an edge case on the hot path. The memory overhead is not a concern outside of this edge case.

// which means it has existed in `tc.uncommitted`, we will retrieve the expected
// bytes from there.
var expected []byte
if exist := tc.uncommitted.uncommitted.GetByID(desc.GetID()); exist == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use getUncommittedByID here instead

uber nit: perhaps it's just me but evaluating the nil case first in this if-block feels weird to me.

@@ -284,7 +284,7 @@ func TestAddUncommittedDescriptorAndMutableResolution(t *testing.T) {
immByNameAfter, err := descriptors.GetImmutableDatabaseByName(ctx, txn, "new_name", flags)
require.NoError(t, err)
require.Equal(t, db.GetVersion(), immByNameAfter.GetVersion())
require.Equal(t, mut.ImmutableCopy(), immByNameAfter)
require.Equal(t, mut.ImmutableCopy().(catalog.DatabaseDescriptor).DatabaseDesc(), immByNameAfter.DatabaseDesc())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if you want to avoid the cast here, you can also compare the results of .DescriptorProto() to the same effect.

if err = val.SetProto(mut.DescriptorProto()); err != nil {
return err
}
rawBytesInStorage := val.TagAndDataBytes()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rename this to rawBytesInStorageAfterPendingWrite perhaps?

// compute what the raw bytes of this descriptor in storage is going to
// look like when that write happens, so that any further update to this
// descriptor, which will be retrieved from the "uncommitted" set, will
// carry the correct raw bytes in storage with it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this, it's rather elegant.

func (desc *Mutable) GetRawBytesInStorage() []byte {
if desc == nil {
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

desc can never be nil here, or in other Descriptor implementationgs

This PR added a `[]byte` filed (called `rawBytesInStorage`) in the
[table|schema|database|type|function] immutable descriptor struct and
the [table|schema|database|type|function] descriptor builder struct.

The purpose is to faciliate the next commit where we want to do book
keeping of what's actually in storage of each descriptor, so that we can
use a CPut when updating a descriptor in storage. See next commit for
more details.

Accompanying this added field is two methods added:
 - `GetRawBytesInStorage() []byte` in `catalog.Descriptor`, and
 - `SetRawBytesInStorage([]byte)` in `catalog.DescriptorBuilder`
They serve as the getter and setter of this field.

We also modified method that gives a builder from a descriptor to carry
over this field.

The way this byte slice field will be initialized and carried over in
a builder and a descriptor will be something like the following graph,
where `XXX` denotes an entity, `xxx` denotes an action, and `==>`
denotes the direction of action.

      DESCPB.DESCRIPTOR
            | |
            | |
     construct a builder
            | |
            | |
             V
       CATALOG.BUILDER
            | |
            | |
      set byte slice field
            | |
            | |
             V
       CATALOG.BUILDER (with the byte slice field set)
            | |
            | |
        build a desc
            | |
            | |
             V
  MUTABLEDESCRIPTOR or DESCRIPTOR
  (with the byte slice field carried over from the builder)
            | |
            | |
         new builder
            | |
            | |
             V
      CATALOG.BUILDER
  (with the byte slice field carried over from the descriptor)
            | |
            | |
            ...
Previously, `WriteDescToBatch`, which is called to
update a descriptor in storage by a `desc.Collection`, uses `Put`
primitive to directly modify the storage layer. We would like to tighten
it to use a `CPut` to prevent (in)advertent clobbering of that table.

This PR does that by book-keeping the raw bytes of the to-be-updated
descriptor in the descriptor, acquired when we first read it into the
`desc.Collection` from the storage layer. Then, the infrastructural
work done in the previous commit allows us to carry over these raw bytes
as we are preparing the `MutableDescriptor` that we pass into this
`WriteDescToBatch` method.

One additional difficulty is that what if we will need to call
`WriteDescToBatch` more than once in one transaction. For example,
in the new schema changer, statement phase and precommit phase will both
be in one transaction, but we call `WriteDescToBatch` at the end of
each stage. Hence, for some DDL stmts (e.g. `DROP COLUMN`), we will
call `WriteDescToBatch` twice in one transaction. The first call will
modify the descriptor in storage and also added this descriptor to
`desc.Collection.uncommitted` set, so, the second call will get it from
there. To make `CPut` work correctly for the second call, we will need
to get the expected byte slice from the `uncommitted` descriptor set.
This motivates the change to update a descriptor's byte slice field when
it's added to the `uncommitted` descriptor sett.
@Xiang-Gu Xiang-Gu force-pushed the use_cput_when_writing_system_descriptor_table branch from 8f1bea1 to aa3c59c Compare October 3, 2022 18:17
Copy link
Contributor Author

@Xiang-Gu Xiang-Gu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! I've addressed the feedback.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar)


pkg/sql/catalog/dbdesc/database_desc.go line 352 at r6 (raw file):

Previously, postamar (Marius Posta) wrote…

Consider deep-copying the byte slice, either here or in the builder. Notice that the builder already deep-copies the protobuf in almost all cases except for an edge case on the hot path. The memory overhead is not a concern outside of this edge case.

done.
I've made it such that we deep copy this field when we

  • construct the builder from protobuf,
  • build a descriptor from the builder, and
  • construct a new builder from a descriptor

pkg/sql/catalog/descs/collection.go line 280 at r6 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: use getUncommittedByID here instead

uber nit: perhaps it's just me but evaluating the nil case first in this if-block feels weird to me.

done for both nits!

Code quote:

if exist := tc.uncommitted.uncommitted.GetByID(desc.GetID())

pkg/sql/catalog/descs/uncommitted_descriptors.go line 196 at r3 (raw file):

Previously, postamar (Marius Posta) wrote…

Turns out I got it wrong a bit, we discussed this offline.

resolved!


pkg/sql/catalog/descs/uncommitted_descriptors.go line 195 at r6 (raw file):

Previously, postamar (Marius Posta) wrote…

I'm fine with this, it's rather elegant.

thx


pkg/sql/catalog/descs/uncommitted_descriptors.go line 200 at r6 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: rename this to rawBytesInStorageAfterPendingWrite perhaps?

good suggestion! Done!


pkg/sql/catalog/tabledesc/structured.go line 1018 at r6 (raw file):

Previously, postamar (Marius Posta) wrote…

desc can never be nil here, or in other Descriptor implementationgs

I guess I was just paranoid when writing this -- I've removed the if block.

Code quote:

(desc *Mutable) GetRawBytesInStorage() []byte {

pkg/sql/catalog/descs/collection_test.go line 287 at r6 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: if you want to avoid the cast here, you can also compare the results of .DescriptorProto() to the same effect.

done, and I've also changed two other places where I did this.

Code quote:

require.Equal(t, mut.ImmutableCopy().(catalog.DatabaseDescriptor).DatabaseDesc()

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! My nits can be ignored or done later if you agree with them

Reviewed 1 of 21 files at r4, 1 of 8 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @Xiang-Gu)


pkg/sql/catalog/dbdesc/database_desc_builder.go line 151 at r8 (raw file):

// SetRawBytesInStorage implements the catalog.DescriptorBuilder interface.
func (ddb *databaseDescriptorBuilder) SetRawBytesInStorage(rawBytes []byte) {
	ddb.rawBytesInStorage = append([]byte(nil), rawBytes...) // deep-copy

nit: you can do make([]byte, 0, len(rawBytes)) to allocate the right size array underneath here and elsewhere.


pkg/sql/catalog/descs/uncommitted_descriptors.go line 201 at r8 (raw file):

	}
	rawBytesInStorageAfterPendingWrite := val.TagAndDataBytes()
	uncommittedBuilder := mut.NewBuilder()

another option instead of the setter is a NewBuilderWithRawBytes method. I think I prefer that if you're open to it.

Copy link
Contributor Author

@Xiang-Gu Xiang-Gu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)


pkg/sql/catalog/dbdesc/database_desc_builder.go line 151 at r8 (raw file):

Previously, ajwerner wrote…

nit: you can do make([]byte, 0, len(rawBytes)) to allocate the right size array underneath here and elsewhere.

the reason I didn't do this is if rawBytes is nil, then we will end up with a 0-length slice, rather than also a nil.


pkg/sql/catalog/descs/uncommitted_descriptors.go line 201 at r8 (raw file):

Previously, ajwerner wrote…

another option instead of the setter is a NewBuilderWithRawBytes method. I think I prefer that if you're open to it.

If we want to completely get rid of this setter, we will need to change two places:

  1. building from a protobuf: the current method is NewBuilderWithMVCCTimestamp. To replace it, we will need to plumb in the raw bytes there to construct a builder from (descpb.descriptor, mvccTimestamp, rawBytes);
  2. building from a catalog.Descriptor (which is what we need here specifically): as you suggested, we can create a NewBuilderWithRawBytes(rawBytes) to replace the old NewBuilder() with a default value nil.

I'm open to it, in a different PR, if you prefer this.

@craig
Copy link
Contributor

craig bot commented Oct 3, 2022

Build failed (retrying...):

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @Xiang-Gu)


pkg/sql/catalog/descs/uncommitted_descriptors.go line 201 at r8 (raw file):

Previously, Xiang-Gu (Xiang Gu) wrote…

If we want to completely get rid of this setter, we will need to change two places:

  1. building from a protobuf: the current method is NewBuilderWithMVCCTimestamp. To replace it, we will need to plumb in the raw bytes there to construct a builder from (descpb.descriptor, mvccTimestamp, rawBytes);
  2. building from a catalog.Descriptor (which is what we need here specifically): as you suggested, we can create a NewBuilderWithRawBytes(rawBytes) to replace the old NewBuilder() with a default value nil.

I'm open to it, in a different PR, if you prefer this.

Certainly no need to do it in this PR. I don't hate the setter, it's just that it always seems paired with construction.

@craig craig bot merged commit fca6c42 into cockroachdb:master Oct 3, 2022
@craig
Copy link
Contributor

craig bot commented Oct 3, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants