Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc: User IDs For Internal User Information #77453

Merged
merged 1 commit into from
Apr 25, 2022

Conversation

Fenil-P
Copy link
Contributor

@Fenil-P Fenil-P commented Mar 7, 2022

Release note: None

Release justification: non-code changes, rfc only

@Fenil-P Fenil-P requested a review from a team as a code owner March 7, 2022 20:33
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rafiss rafiss requested review from a team March 7, 2022 21:09
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this proposal a lot. Would be curious to see more details about the proposed migration.

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Fenil-P)


docs/RFCS/20220307_user_ids.md, line 5 at r1 (raw file):

- Start Date: 2022-03-07
- Authors: Fenil Patel
- RFC PR: (PR # after acceptance of initial draft)

you can update this field now.


docs/RFCS/20220307_user_ids.md, line 6 at r1 (raw file):

- Authors: Fenil Patel
- RFC PR: (PR # after acceptance of initial draft)
- Cockroach Issue: [76079](https://github.com/cockroachdb/cockroach/issues/76079)

nit: we usually include # in front of the number.


docs/RFCS/20220307_user_ids.md, line 10 at r1 (raw file):

# Summary
Support stable IDs to key internal user information instead of username. Requires a long-running migration to give 

Please also include a motivating sentence in the summary.


docs/RFCS/20220307_user_ids.md, line 10 at r1 (raw file):

# Summary
Support stable IDs to key internal user information instead of username. Requires a long-running migration to give 

Please also say in the summary that those IDs will be UUIDs.


docs/RFCS/20220307_user_ids.md, line 11 at r1 (raw file):

# Summary
Support stable IDs to key internal user information instead of username. Requires a long-running migration to give 
existing users IDs

nit: missing period at end of sentence.


docs/RFCS/20220307_user_ids.md, line 16 at r1 (raw file):

Currently, all internal per-user information is keyed by the username. This is not ideal because it makes it difficult 
to rename a user without breaking things. It's also not ideal because usernames can be considered personal data, but we 

you could provide an example of "thing" that breaks.


docs/RFCS/20220307_user_ids.md, line 28 at r1 (raw file):

# Technical design
### Adding ID columns to tables with username columns
The ids will be randomly generated UUID when creating a new user during insert into system.users. The SQL syntax for all 

What is "the SQL syntax for tables"?


docs/RFCS/20220307_user_ids.md, line 81 at r1 (raw file):

hashedPassword | BYTES     |    true     | NULL              |                       | {primary}                       |   false
isRole         | BOOL      |    false    | false             |                       | {primary}                       |   false
id             | UUID      |    false    | gen_random_uuid() |                       | {primary,users_username_id_idx} |   false

NB: column names based off a generic word (in this case, "id") should qualify this word to become more specific.

What's why we don't use just name and say username instead (which arguably would be better called user_name).
Ditto database_id instead of just id, etc.

So, here and throughout: user_id.


docs/RFCS/20220307_user_ids.md, line 103 at r1 (raw file):

member      | STRING    |    false    | NULL           |                       | {primary,role_members_member_idx,role_members_role_idx} |   false
isAdmin     | BOOL      |    false    | NULL           |                       | {primary}                                               |   false
roleId      | UUID      |    false    | NULL           |                       | {primary,role_members_member_idx,role_members_role_idx} |   false

role_id

member_id

let's avoid capitals in new column names. isAdmin, isRole and hashedPasswords are regrettable historical accidents, no need to perpetrate them again.


docs/RFCS/20220307_user_ids.md, line 122 at r1 (raw file):

### Type of User ID
- **UUID** allows for randomly generated IDs during insert using `gen_random_uuid()` as default value for the column.
  #### Pros:

This breaks the formatting in the rendered page. Please use markdown formatting that makes sense.


docs/RFCS/20220307_user_ids.md, line 123 at r1 (raw file):

- **UUID** allows for randomly generated IDs during insert using `gen_random_uuid()` as default value for the column.
  #### Pros:
  - It requires no collision handling 

nit: here and below, make the bullet points full sentences, with a period at the end.


docs/RFCS/20220307_user_ids.md, line 126 at r1 (raw file):

  - Adding a new column with default value makes it easy to migrate older versions
  #### Cons:
  - Users can't make use of ids themselves in other statements as they are hard to remember

I don't understand this point: why would users would ever need to manipulate things by ID in interactive sessions? Why is it important for them to remember anything?


docs/RFCS/20220307_user_ids.md, line 129 at r1 (raw file):

- **INT** is closer to what [postgres does](https://www.postgresql.org/docs/8.0/sql-createuser.html) with uid when creating a user. This is useful when allowing a user to specify the 

nit: wrap text at 80-100 columns.


docs/RFCS/20220307_user_ids.md, line 132 at r1 (raw file):

uid in future updates. For existing users, a hash function could be used to generated their ID. We can also take an approach similar to postgres and have the new ID be the highest ID + 1
  #### Pros:
  - Matches postgres use of UID (see example below)

I think you're underselling the utility of doing user IDs using a (pseudo-)sequence that fits integers.

The ability to generate a pg-compatible pg_user virtual table, with a useful usesysid column, may be more important than we think. Have you discussed this with our AppRel experts?


docs/RFCS/20220307_user_ids.md, line 174 at r1 (raw file):

An ID field will be added to UserPriviliges object. The Privilege Descriptors store user privilege objects in an array 
in sorted order based on usernames currently. We’ll need to refactor the code such that the privileges array is sorted 
by IDs instead.
  1. why is the sorting important?

  2. can we perhaps add a 2nd field alongside the usernames, during the migration period?


docs/RFCS/20220307_user_ids.md, line 195 at r1 (raw file):

### Long-running migration
We need to add the new system table columns to existing instances and migrate all the existing users and generate their 
corresponding IDs.The migration will need to update the UserPriviliges to contain the new user IDs.

nit: missing space after period.

Copy link
Contributor

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Fenil-P and @knz)


docs/RFCS/20220307_user_ids.md, line 16 at r1 (raw file):

Previously, knz (kena) wrote…

you could provide an example of "thing" that breaks.

I think breaking things might not be the right phrasing but rather it makes it difficult to support an operation like renaming a username difficult.


docs/RFCS/20220307_user_ids.md, line 126 at r1 (raw file):

Previously, knz (kena) wrote…

I don't understand this point: why would users would ever need to manipulate things by ID in interactive sessions? Why is it important for them to remember anything?

+1
users shouldn't have to remember / use their internal ID.


docs/RFCS/20220307_user_ids.md, line 135 at r1 (raw file):

  - User can refer to them and possibly use in other statements
  #### Cons:
  - Migration will need to atomically add a NOT NULL column and fill in all the ids

We'd have to atomically add a NOT NULL column in the case of using UUID as well. The difference is we have to come up with another function to generate the ID here instead of gen_random_uuid. The con we have to come up with this function and the function will likely be less efficient.


docs/RFCS/20220307_user_ids.md, line 198 at r1 (raw file):

During the migration, we will need to update the existing Privilege descriptors and resort the UserPriviliges arrays by 
ID in order to support the lookups by user IDs.

We also have to update default privileges.

I'd note it here that we have to update every field where a username is stored to also store an ID and then note that the only places right now are the various system tables, user privileges and default privileges.

Copy link
Contributor

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Fenil-P and @knz)


docs/RFCS/20220307_user_ids.md, line 126 at r1 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

+1
users shouldn't have to remember / use their internal ID.

Actually scratch the use part, they might want to use it but they should access it programmatically from the pg_user table if so.

Copy link
Contributor Author

@Fenil-P Fenil-P left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Fenil-P, @knz, and @RichardJCai)


docs/RFCS/20220307_user_ids.md, line 5 at r1 (raw file):

Previously, knz (kena) wrote…

you can update this field now.

Done.


docs/RFCS/20220307_user_ids.md, line 10 at r1 (raw file):

Previously, knz (kena) wrote…

Please also include a motivating sentence in the summary.

Done.


docs/RFCS/20220307_user_ids.md, line 10 at r1 (raw file):

Previously, knz (kena) wrote…

Please also say in the summary that those IDs will be UUIDs.

Done.


docs/RFCS/20220307_user_ids.md, line 16 at r1 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

I think breaking things might not be the right phrasing but rather it makes it difficult to support an operation like renaming a username difficult.

Updated


docs/RFCS/20220307_user_ids.md, line 28 at r1 (raw file):

Previously, knz (kena) wrote…

What is "the SQL syntax for tables"?

Updated, I was referring to inserts, select and etc...


docs/RFCS/20220307_user_ids.md, line 81 at r1 (raw file):

Previously, knz (kena) wrote…

NB: column names based off a generic word (in this case, "id") should qualify this word to become more specific.

What's why we don't use just name and say username instead (which arguably would be better called user_name).
Ditto database_id instead of just id, etc.

So, here and throughout: user_id.

Done.


docs/RFCS/20220307_user_ids.md, line 103 at r1 (raw file):

Previously, knz (kena) wrote…

role_id

member_id

let's avoid capitals in new column names. isAdmin, isRole and hashedPasswords are regrettable historical accidents, no need to perpetrate them again.

Noted!


docs/RFCS/20220307_user_ids.md, line 122 at r1 (raw file):

Previously, knz (kena) wrote…

This breaks the formatting in the rendered page. Please use markdown formatting that makes sense.

Done.


docs/RFCS/20220307_user_ids.md, line 135 at r1 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

We'd have to atomically add a NOT NULL column in the case of using UUID as well. The difference is we have to come up with another function to generate the ID here instead of gen_random_uuid. The con we have to come up with this function and the function will likely be less efficient.

Updated


docs/RFCS/20220307_user_ids.md, line 174 at r1 (raw file):

Previously, knz (kena) wrote…
  1. why is the sorting important?

  2. can we perhaps add a 2nd field alongside the usernames, during the migration period?

Added two options for lookup implementation, please take a look and let me know what you think!


docs/RFCS/20220307_user_ids.md, line 198 at r1 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

We also have to update default privileges.

I'd note it here that we have to update every field where a username is stored to also store an ID and then note that the only places right now are the various system tables, user privileges and default privileges.

Done.

Fenil-P added a commit to Fenil-P/cockroach that referenced this pull request Mar 23, 2022
Previously we used usernames to access user information and privileges.
We will transition to using IDs by first adding user ids and updating the code
to use IDs where possible.

Release justification: Justified in RFC cockroachdb#77453
Release note: None
Fenil-P added a commit to Fenil-P/cockroach that referenced this pull request Mar 28, 2022
Previously we used usernames to access user information and privileges.
We will transition to using IDs by first adding user ids and updating the code
to use IDs where possible.

Release justification: Justified in RFC cockroachdb#77453
Release note: None
Fenil-P added a commit to Fenil-P/cockroach that referenced this pull request Mar 29, 2022
Previously we used usernames to access user information and privileges.
We will transition to using IDs by first adding user ids and updating the code
to use IDs where possible.

Release justification: Justified in RFC cockroachdb#77453
Release note: None
RichardJCai pushed a commit to RichardJCai/cockroach that referenced this pull request Apr 7, 2022
Previously we used usernames to access user information and privileges.
We will transition to using IDs by first adding user ids and updating the code
to use IDs where possible.

Release justification: Justified in RFC cockroachdb#77453
Release note: None
Copy link
Contributor

@RichardJCai RichardJCai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this RFC


# Technical design
### Adding ID columns to tables with username columns
The ids will be randomly generated UUID when creating a new user during insert into system.users. SQL statements
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDs are oids.

username | STRING | false | NULL | | {primary,users_username_id_idx} | false
hashedPassword | BYTES | true | NULL | | {primary} | false
isRole | BOOL | false | false | | {primary} | false
user_id | UUID | false | gen_random_uuid() | | {primary,users_username_id_idx} | false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oid for this and below



### Updating User Privileges and Lookups
An ID field will be added to UserPrivileges and any objects that contain a username. The Privilege Descriptors store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that in 23.1, we can remove the username field from privileges.

Fenil-P added a commit to Fenil-P/cockroach that referenced this pull request Apr 21, 2022
Previously we used usernames to access user information and privileges.
We will transition to using IDs by first adding user ids and updating the code
to use IDs where possible.

Release justification: Justified in RFC cockroachdb#77453
Release note: None
Release justification:
Release note: None
Copy link
Contributor Author

@Fenil-P Fenil-P left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Fenil-P, @knz, and @RichardJCai)


docs/RFCS/20220307_user_ids.md line 33 at r3 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

IDs are oids.

Done.


docs/RFCS/20220307_user_ids.md line 86 at r3 (raw file):

Previously, RichardJCai (Richard Cai) wrote…

oid for this and below

Done.

@Fenil-P
Copy link
Contributor Author

Fenil-P commented Apr 25, 2022

bors r+

@craig
Copy link
Contributor

craig bot commented Apr 25, 2022

Build succeeded:

RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request May 24, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Jul 14, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Jul 26, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Jul 26, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Jul 27, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Jul 28, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Jul 28, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Aug 1, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Aug 3, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
RichardJCai added a commit to RichardJCai/cockroach that referenced this pull request Aug 4, 2022
In preparation for keying roles by ids.
See RFC: cockroachdb#77453

Release note: None

Co-authored-by: Fenil-P <[email protected]>
craig bot pushed a commit that referenced this pull request Aug 5, 2022
81457: sql: add user id column to system.users table and perform migration r=RichardJCai a=RichardJCai

Previously we used usernames to access user information and privileges.
We will transition to using IDs by first adding user ids and updating the code
to use IDs where possible.

Release justification: Justified in RFC #77453
Release note: None

Co-authored-by: richardjcai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants