- Feature Name: User IDs For Internal User Information
- Status: draft
- Start Date: 2022-03-07
- Authors: Fenil Patel
- RFC PR: #77453
- Cockroach Issue: #76079
A user and all of their privileges in CockroachDB are identified by their username. To implement ALTER ROLE ... RENAME TO ...
statement is difficult as you need to create a new role with the same privileges and then drop the old one
(mentioned in #50821).
We will support stable IDs to key internal user information instead of usernames. It requires a long-running migration to give existing users IDs. The type of the user IDs is yet to be decided.
Currently, all internal per-user information is keyed by the username. It's not ideal because of the complications to
implement ALTER ROLE ... RENAME TO ...
statement and usernames can be considered personal data, but we need to use
them in many places in logging. Implementing this in CockroachDB will enhance compatibility with the privilege system of
Postgres, where pg_user
has users have IDs in usesysid
column.
The idea is to convert code dependencies of username and lookups to use the IDs instead in 3 steps. First adding the ID columns to system tables and update related syntax statements. Second, updating the user privileges objects and lookup based on usernames. Finally, migrating the existing instances of Cockroach.
The ids will be randomly generated UUID when creating a new user during insert into system.users. SQL statements like insert, lookups and delete for all other tables will be updated such that the IDs from the users table are used.
[email protected]:26257/defaultdb> \d system.users
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
-----------------+-----------+-------------+----------------+-----------------------+-----------+------------
username | STRING | false | NULL | | {primary} | false
hashedPassword | BYTES | true | NULL | | {primary} | false
isRole | BOOL | false | false | | {primary} | false
(3 rows)
[email protected]:26257/defaultdb> \d system.role_options
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
username | STRING | false | NULL | | {primary} | false
option | STRING | false | NULL | | {primary} | false
value | STRING | true | NULL | | {primary} | false
(3 rows)
[email protected]:26257/defaultdb> \d system.role_members
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+---------------------------------------------------------+------------
role | STRING | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
member | STRING | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
isAdmin | BOOL | false | NULL | | {primary} | false
(3 rows)
[email protected]:26257/defaultdb> \d system.database_role_settings
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
database_id | OID | false | NULL | | {primary} | false
role_name | STRING | false | NULL | | {primary} | false
settings | STRING[] | false | NULL | | {primary} | false
(3 rows)
[email protected]:26257/movr> \d system.users
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
-----------------+-----------+-------------+-------------------+-----------------------+---------------------------------+------------
username | STRING | false | NULL | | {primary,users_username_id_idx} | false
hashedPassword | BYTES | true | NULL | | {primary} | false
isRole | BOOL | false | false | | {primary} | false
user_id | UUID | false | gen_random_uuid() | | {primary,users_username_id_idx} | false
(4 rows)
[email protected]:26257/movr> \d system.role_options
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
username | STRING | false | NULL | | {primary} | false
option | STRING | false | NULL | | {primary} | false
value | STRING | true | NULL | | {primary} | false
user_id | UUID | false | NULL | | {primary} | false
(4 rows)
[email protected]:26257/movr> \d system.role_members
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+---------------------------------------------------------+------------
role | STRING | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
member | STRING | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
isAdmin | BOOL | false | NULL | | {primary} | false
role_id | UUID | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
member_id | UUID | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
(5 rows)
[email protected]:26257/movr> \d system.database_role_settings
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
database_id | OID | false | NULL | | {primary} | false
role_name | STRING | false | NULL | | {primary} | false
settings | STRING[] | false | NULL | | {primary} | false
role_id | UUID | false | NULL | | {primary} | false
(4 rows)
-
UUID allows for randomly generated IDs during insert using
gen_random_uuid()
as default value for the column.-
- It requires no collision handling as it is handled by
gen_random_uuid()
. - Adding a new column with a default value makes it easy to migrate older versions.
- It requires no collision handling as it is handled by
-
- Users can't make use of ids themselves in other statements as they are hard to remember. Ideally, they should access it programmatically from the pg_user table but postgres allows users to specify uid when creating new users
-
-
OID type that CRDB already has and mimics what postgres does with uid when creating a user (the type of
usesysid
is OID). This is useful when allowing a user to specify the uid in statements for future updates. For existing users, a hash function could be used to generate their ID. We can also take an approach similar to postgres and have the new ID be the highest ID + 1.-
- Matches postgres use of UID (see example below) and migration from postrges to CRDB will be easier by generating a
pg-compatible
pg_user
virtual table, with a usefulusesysid
column. - User can refer to them and possibly use them in other statements when specifying uid.
- Currently, OID is uint64 and a column can be added easily during long running migration:
alter table t1 add column y oid default oid(unique_rowid());
- Matches postgres use of UID (see example below) and migration from postrges to CRDB will be easier by generating a
pg-compatible
-
- Due to #41904, OID might be converted to uint32 to match
postgres and this means
unique_rowid()
can no longer be used as it generates uint64 values. So without resolving this issue, we can migrate from postgres to CRDB, but it may not be possible vice versa. - To atomically add a NOT NULL column during long running migration after issue is resolved, we have to come up
with another function to generate the ID here instead of
unique_rowid
. This will most likely be less efficient.
- Due to #41904, OID might be converted to uint32 to match
postgres and this means
-
-
INT - it would be similar to OID type, and we can migrate from postgres to CRDB but not vice versa. Adding columns with a default value of
unique_rowid()
also simplifies long running migration between node versions. The only difference is that there will be no changes to INT in the near future.
postgres=# create user a;
CREATE ROLE
postgres=# SELECT * FROM pg_user;
usename | usesysid | usecreatedb | usesuper | userepl | usebypassrls | passwd | valuntil | useconfig
----------+----------+-------------+----------+---------+--------------+----------+----------+-----------
postgres | 10 | t | t | t | t | ******** | |
a | 16385 | f | f | f | f | ******** | |
(2 rows)
postgres=# create user b;
CREATE ROLE
postgres=# create user aa;
CREATE ROLE
postgres=# create user ba;
CREATE ROLE
postgres=# create user ab;
CREATE ROLE
postgres=# SELECT * FROM pg_user;
usename | usesysid | usecreatedb | usesuper | userepl | usebypassrls | passwd | valuntil | useconfig
----------+----------+-------------+----------+---------+--------------+----------+----------+-----------
postgres | 10 | t | t | t | t | ******** | |
a | 16385 | f | f | f | f | ******** | |
b | 16386 | f | f | f | f | ******** | |
aa | 16387 | f | f | f | f | ******** | |
ba | 16388 | f | f | f | f | ******** | |
ab | 16389 | f | f | f | f | ******** | |
(6 rows)
*Require more discussion on this
An ID field will be added to UserPrivileges and any objects that contain a username. The Privilege Descriptors store user privilege objects in an array in sorted order based on usernames currently.
- Refactor the code such that the privileges array is sorted by IDs instead. Or have an additional array in privilege descriptors sorted by IDs.
- Use a map to lookup user privileges by IDs. Not sure why this wasn't used for lookups by username. Possibly because it may not have been supported by protobuf initially.
Code in privilege.pb.go
:
// PrivilegeDescriptor describes a list of users and attached
// privileges. The list should be sorted by user for fast access.
type PrivilegeDescriptor struct {
Users []UserPrivileges `protobuf:"bytes,1,rep,name=users" json:"users"`
OwnerProto github_com_cockroachdb_cockroach_pkg_security.SQLUsernameProto `protobuf:"bytes,2,opt,name=owner_proto,json=ownerProto,casttype=github.com/cockroachdb/cockroach/pkg/security.SQLUsernameProto" json:"owner_proto"`
Version PrivilegeDescVersion `protobuf:"varint,3,opt,name=version,casttype=PrivilegeDescVersion" json:"version"`
}
Since most SQL syntax takes in username/role, we also require a lookup function to get the user ID from system.users by a username. Additionally, add caching the lookups to reduce the overall runtime of statements and tests.
We need to add the new system table columns to existing instances and migrate all the existing users and generate their corresponding IDs. The migration will need to update the UserPriviliges to contain the new user IDs.
During the migration, we will need to update every field where a username is stored to also store an ID. The only places right now are the various system tables, user privileges and default privileges. Update the existing Privilege descriptors using one of the lookup implementation options mentioned above in order to support the lookups by user IDs.
We’ll need to version gate all lookups to ensure the use of username until all the nodes have been migrated to 22.1. It ensures that if the ID field does not exist in UserPrivileges of all nodes, it doesn’t result in conflicts during lookups. Once all nodes are upgraded from 21.2 to 22.1, the version gate will allow privilege lookups using IDs.
22.1 & Mixed Cluster | 22.2 |
---|---|
User privilege lookups are done using username. ID field in 22.2 version nodes is ignored. | ID is acquired from system.user and user privilege lookups are done using IDs instead. |