-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
importccl: the userfile table name generation algorithm breaks with special usernames, and confused client-side and server-side usernames #55389
Comments
Additionally, the This is conceptually mistaken, because the client-side username may not be the same as the server-side username. The code should be modified to extract the server-side username using the |
tldr: the conversions between "external" strings and internal usernames was unprincipled, and it turns out, incorrect in some cases. This patch cleans this up by introducing a strict conversion API. **Background** CockroachDB currently performs case folding and unicode NFC normalization upon receiving usernames, specifically usernames received from as SQL login principals. Internally, usernames are often—but not always—considered pre-normalized for the purpose of comparisons, privilege checks, role membership checks and the like. Finally, sometimes usernames are reported "back to outside". In error messages, log files etc, but also: - in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc. - to generate OS-level file names for exported files. **New API** This patch introduces a new data type `security.SQLUsername`. It is incomparable and non-convertible with the Go `string`. Engineers must now declare intent about how to do the conversion: - `security.MakeSQLUsernameFromUserInput` converts an "outside" string to a username. - `security.MakeSQLUsernameFromPreNormalizedString` promises that its argument has already been previously normalized. To output usernames, the following APIs are also available. - `username.Normalized()` produces the username itself, without decorations. These corresponds to the raw string (after normalization). - `username.SQLIdentifier()` produces the username in valid SQL syntax, so that it can be injected safely in a SQL statement. - `(*tree.FmtCtx).FormatUsername()` takes a username and properly handles quoting and anonymization, like `FormatName()` does for `tree.Name` already. Likewise, conversion from/to protobuf is now regulated, via the new APIs `username.EncodeProto()` and `usernameproto.Decode()`. **Problems being solved** - the usernames "from outside" were normalized sometimes, *but not consistently*: 1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was not changed. 2. they were not consistently converted in `cockroach cert`. This was corrected. 3. they were not in the `cockroach userfile` commands. This has been adjusted with a reference to issue cockroachdb#55389. 4. they are *not* in GRANT/REVOKE. This patch does not change this behavior, but highlights it by spelling out `MakeSQLUsernameFromPreNormalizedString()` in the implementation. 5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO 6. they were in the argument to `LoginRequest`. This was not changed. 7. they were not in the argument of the other API requests that allow usernames, for example `ListSessions` or `CancelQuery`. This was not changed, but is now documented in the API. - the usernames "inside" were incorrectly directly injected in SQL statements, even though they may contain special characters that create SQL syntax errors. This has been corrected by suitable uses of the new `SQLIdentifier()` method. - There was an outright bug in a call to `CreateGCJobRec` (something about GCing jobs), where a `Description` field was passed in lieu of a username for a `User` field. The implications of this are unclear. **Status after this change** The new API makes it possible to audit exactly where "sensitive" username/string conversion occurs. After this patch, we find the following uses: - `MakeSQLUsernameFromUserInput`: - pgwire user auth - CLI URL parsing - `cockroach userfile` - `cockroach cert` - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing) - CREATE/DROP/ALTER ROLE - when using literal strings as `role_spec` in the SQL grammar - `MakeSQLUsernameFromPreNormalizedString`: - when intepreting the username in API query parameters, for those API documented as using pre-normalized usernames. - role membership checks inside SQL based on data read from `system` tables. - in GRANT/REVOKE (this is surprising, see above) Release note: None
Had some exchange with @dt on this:
|
Ok so a good next step here would be:
I am thinking about an algorithm that looks like this:
This would escape the entire username to hexadecimal if not a lowercase normalized identifier already. Examples:
Regardless of the algorithm used, the end result must ensure that:
The hexification of the username combined with a |
Thanks for this @knz, I haven't been able to get to this yet but I'm clear on the behaviour we expect now. I'll write up a PR and add you to it! |
tldr: the conversions between "external" strings and internal usernames was unprincipled, and it turns out, incorrect in some cases. This patch cleans this up by introducing a strict conversion API. **Background** CockroachDB currently performs case folding and unicode NFC normalization upon receiving usernames, specifically usernames received from as SQL login principals. Internally, usernames are often—but not always—considered pre-normalized for the purpose of comparisons, privilege checks, role membership checks and the like. Finally, sometimes usernames are reported "back to outside". In error messages, log files etc, but also: - in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc. - to generate OS-level file names for exported files. **New API** This patch introduces a new data type `security.SQLUsername`. It is incomparable and non-convertible with the Go `string`. Engineers must now declare intent about how to do the conversion: - `security.MakeSQLUsernameFromUserInput` converts an "outside" string to a username. - `security.MakeSQLUsernameFromPreNormalizedString` promises that its argument has already been previously normalized. To output usernames, the following APIs are also available. - `username.Normalized()` produces the username itself, without decorations. These corresponds to the raw string (after normalization). - `username.SQLIdentifier()` produces the username in valid SQL syntax, so that it can be injected safely in a SQL statement. - `(*tree.FmtCtx).FormatUsername()` takes a username and properly handles quoting and anonymization, like `FormatName()` does for `tree.Name` already. Likewise, conversion from/to protobuf is now regulated, via the new APIs `username.EncodeProto()` and `usernameproto.Decode()`. **Problems being solved** - the usernames "from outside" were normalized sometimes, *but not consistently*: 1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was not changed. 2. they were not consistently converted in `cockroach cert`. This was corrected. 3. they were not in the `cockroach userfile` commands. This has been adjusted with a reference to issue cockroachdb#55389. 4. they are *not* in GRANT/REVOKE. This patch does not change this behavior, but highlights it by spelling out `MakeSQLUsernameFromPreNormalizedString()` in the implementation. 5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO 6. they were in the argument to `LoginRequest`. This was not changed. 7. they were not in the argument of the other API requests that allow usernames, for example `ListSessions` or `CancelQuery`. This was not changed, but is now documented in the API. - the usernames "inside" were incorrectly directly injected in SQL statements, even though they may contain special characters that create SQL syntax errors. This has been corrected by suitable uses of the new `SQLIdentifier()` method. - There was an outright bug in a call to `CreateGCJobRec` (something about GCing jobs), where a `Description` field was passed in lieu of a username for a `User` field. The implications of this are unclear. **Status after this change** The new API makes it possible to audit exactly where "sensitive" username/string conversion occurs. After this patch, we find the following uses: - `MakeSQLUsernameFromUserInput`: - pgwire user auth - CLI URL parsing - `cockroach userfile` - `cockroach cert` - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing) - CREATE/DROP/ALTER ROLE - when using literal strings as `role_spec` in the SQL grammar - `MakeSQLUsernameFromPreNormalizedString`: - when intepreting the username in API query parameters, for those API documented as using pre-normalized usernames. - role membership checks inside SQL based on data read from `system` tables. - in GRANT/REVOKE (this is surprising, see above) Release note: None
tldr: the conversions between "external" strings and internal usernames was unprincipled, and it turns out, incorrect in some cases. This patch cleans this up by introducing a strict conversion API. **Background** CockroachDB currently performs case folding and unicode NFC normalization upon receiving usernames, specifically usernames received from as SQL login principals. Internally, usernames are often—but not always—considered pre-normalized for the purpose of comparisons, privilege checks, role membership checks and the like. Finally, sometimes usernames are reported "back to outside". In error messages, log files etc, but also: - in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc. - to generate OS-level file names for exported files. **New API** This patch introduces a new data type `security.SQLUsername`. It is incomparable and non-convertible with the Go `string`. Engineers must now declare intent about how to do the conversion: - `security.MakeSQLUsernameFromUserInput` converts an "outside" string to a username. - `security.MakeSQLUsernameFromPreNormalizedString` promises that its argument has already been previously normalized. To output usernames, the following APIs are also available. - `username.Normalized()` produces the username itself, without decorations. These corresponds to the raw string (after normalization). - `username.SQLIdentifier()` produces the username in valid SQL syntax, so that it can be injected safely in a SQL statement. - `(*tree.FmtCtx).FormatUsername()` takes a username and properly handles quoting and anonymization, like `FormatName()` does for `tree.Name` already. Likewise, conversion from/to protobuf is now regulated, via the new APIs `username.EncodeProto()` and `usernameproto.Decode()`. **Problems being solved** - the usernames "from outside" were normalized sometimes, *but not consistently*: 1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was not changed. 2. they were not consistently converted in `cockroach cert`. This was corrected. 3. they were not in the `cockroach userfile` commands. This has been adjusted with a reference to issue cockroachdb#55389. 4. they are *not* in GRANT/REVOKE. This patch does not change this behavior, but highlights it by spelling out `MakeSQLUsernameFromPreNormalizedString()` in the implementation. 5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO 6. they were in the argument to `LoginRequest`. This was not changed. 7. they were not in the argument of the other API requests that allow usernames, for example `ListSessions` or `CancelQuery`. This was not changed, but is now documented in the API. - the usernames "inside" were incorrectly directly injected in SQL statements, even though they may contain special characters that create SQL syntax errors. This has been corrected by suitable uses of the new `SQLIdentifier()` method. - There was an outright bug in a call to `CreateGCJobRec` (something about GCing jobs), where a `Description` field was passed in lieu of a username for a `User` field. The implications of this are unclear. **Status after this change** The new API makes it possible to audit exactly where "sensitive" username/string conversion occurs. After this patch, we find the following uses: - `MakeSQLUsernameFromUserInput`: - pgwire user auth - CLI URL parsing - `cockroach userfile` - `cockroach cert` - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing) - CREATE/DROP/ALTER ROLE - when using literal strings as `role_spec` in the SQL grammar - `MakeSQLUsernameFromPreNormalizedString`: - when intepreting the username in API query parameters, for those API documented as using pre-normalized usernames. - role membership checks inside SQL based on data read from `system` tables. - in GRANT/REVOKE (this is surprising, see above) Release note: None
tldr: the conversions between "external" strings and internal usernames was unprincipled, and it turns out, incorrect in some cases. This patch cleans this up by introducing a strict conversion API. **Background** CockroachDB currently performs case folding and unicode NFC normalization upon receiving usernames, specifically usernames received from as SQL login principals. Internally, usernames are often—but not always—considered pre-normalized for the purpose of comparisons, privilege checks, role membership checks and the like. Finally, sometimes usernames are reported "back to outside". In error messages, log files etc, but also: - in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc. - to generate OS-level file names for exported files. **New API** This patch introduces a new data type `security.SQLUsername`. It is incomparable and non-convertible with the Go `string`. Engineers must now declare intent about how to do the conversion: - `security.MakeSQLUsernameFromUserInput` converts an "outside" string to a username. - `security.MakeSQLUsernameFromPreNormalizedString` promises that its argument has already been previously normalized. To output usernames, the following APIs are also available. - `username.Normalized()` produces the username itself, without decorations. These corresponds to the raw string (after normalization). - `username.SQLIdentifier()` produces the username in valid SQL syntax, so that it can be injected safely in a SQL statement. - `(*tree.FmtCtx).FormatUsername()` takes a username and properly handles quoting and anonymization, like `FormatName()` does for `tree.Name` already. Likewise, conversion from/to protobuf is now regulated, via the new APIs `username.EncodeProto()` and `usernameproto.Decode()`. **Problems being solved** - the usernames "from outside" were normalized sometimes, *but not consistently*: 1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was not changed. 2. they were not consistently converted in `cockroach cert`. This was corrected. 3. they were not in the `cockroach userfile` commands. This has been adjusted with a reference to issue cockroachdb#55389. 4. they are *not* in GRANT/REVOKE. This patch does not change this behavior, but highlights it by spelling out `MakeSQLUsernameFromPreNormalizedString()` in the implementation. 5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO 6. they were in the argument to `LoginRequest`. This was not changed. 7. they were not in the argument of the other API requests that allow usernames, for example `ListSessions` or `CancelQuery`. This was not changed, but is now documented in the API. - the usernames "inside" were incorrectly directly injected in SQL statements, even though they may contain special characters that create SQL syntax errors. This has been corrected by suitable uses of the new `SQLIdentifier()` method. - There was an outright bug in a call to `CreateGCJobRec` (something about GCing jobs), where a `Description` field was passed in lieu of a username for a `User` field. The implications of this are unclear. **Status after this change** The new API makes it possible to audit exactly where "sensitive" username/string conversion occurs. After this patch, we find the following uses: - `MakeSQLUsernameFromUserInput`: - pgwire user auth - CLI URL parsing - `cockroach userfile` - `cockroach cert` - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing) - CREATE/DROP/ALTER ROLE - when using literal strings as `role_spec` in the SQL grammar - `MakeSQLUsernameFromPreNormalizedString`: - when intepreting the username in API query parameters, for those API documented as using pre-normalized usernames. - role membership checks inside SQL based on data read from `system` tables. - in GRANT/REVOKE (this is surprising, see above) Release note: None
tldr: the conversions between "external" strings and internal usernames was unprincipled, and it turns out, incorrect in some cases. This patch cleans this up by introducing a strict conversion API. **Background** CockroachDB currently performs case folding and unicode NFC normalization upon receiving usernames, specifically usernames received from as SQL login principals. Internally, usernames are often—but not always—considered pre-normalized for the purpose of comparisons, privilege checks, role membership checks and the like. Finally, sometimes usernames are reported "back to outside". In error messages, log files etc, but also: - in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc. - to generate OS-level file names for exported files. **New API** This patch introduces a new data type `security.SQLUsername`. It is incomparable and non-convertible with the Go `string`. Engineers must now declare intent about how to do the conversion: - `security.MakeSQLUsernameFromUserInput` converts an "outside" string to a username. - `security.MakeSQLUsernameFromPreNormalizedString` promises that its argument has already been previously normalized. - `security.MakeSQLUsernameFromPreNormalizedStringChecked` also checks that its argument has already been previously normalized. To output usernames, the following APIs are also available. - `username.Normalized()` produces the username itself, without decorations. These corresponds to the raw string (after normalization). - `username.SQLIdentifier()` produces the username in valid SQL syntax, so that it can be injected safely in a SQL statement. - `(*tree.FmtCtx).FormatUsername()` takes a username and properly handles quoting and anonymization, like `FormatName()` does for `tree.Name` already. Likewise, conversion from/to protobuf is now regulated, via the new APIs `username.EncodeProto()` and `usernameproto.Decode()`. **Problems being solved** - the usernames "from outside" were normalized sometimes, *but not consistently*: 1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was not changed. 2. they were not consistently converted in `cockroach cert`. This was corrected. 3. they were not in the `cockroach userfile` commands. This has been adjusted with a reference to issue cockroachdb#55389. 4. they are *not* in GRANT/REVOKE. This patch does not change this behavior, but highlights it by spelling out `MakeSQLUsernameFromPreNormalizedString()` in the implementation. 5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO 6. they were in the argument to `LoginRequest`. This was not changed. 7. they were not in the argument of the other API requests that allow usernames, for example `ListSessions` or `CancelQuery`. This was not changed, but is now documented in the API. - the usernames "inside" were incorrectly directly injected in SQL statements, even though they may contain special characters that create SQL syntax errors. This has been corrected by suitable uses of the new `SQLIdentifier()` method. - There was an outright bug in a call to `CreateGCJobRec` (something about GCing jobs), where a `Description` field was passed in lieu of a username for a `User` field. The implications of this are unclear. **Status after this change** The new API makes it possible to audit exactly where "sensitive" username/string conversion occurs. After this patch, we find the following uses: - `MakeSQLUsernameFromUserInput`: - pgwire user auth - CLI URL parsing - `cockroach userfile` - `cockroach cert` - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing) - CREATE/DROP/ALTER ROLE - when using literal strings as `role_spec` in the SQL grammar - `MakeSQLUsernameFromPreNormalizedString`: - role membership checks inside SQL based on data read from `system` tables. - in GRANT/REVOKE (this is surprising, see above) - `MakeSQLUsernameFromPreNormalizedStringChecked`: - when intepreting the username in API query parameters, for those API documented as using pre-normalized usernames. Release note: None
55398: *: make conversions between `string` and usernames more explicit r=aaron-crl,arulajmani a=knz Informs (but does not fix) #55396 Informs #54696 Informs #55389 tldr: the conversions between "external" strings and internal usernames was unprincipled, and it turns out, incorrect in some cases. This patch cleans this up by introducing a strict conversion API. **Background** CockroachDB currently performs case folding and unicode NFC normalization upon receiving usernames, specifically usernames received from as SQL login principals. Internally, usernames are often—but not always—considered pre-normalized for the purpose of comparisons, privilege checks, role membership checks and the like. Finally, sometimes usernames are reported "back to outside". In error messages, log files etc, but also: - in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc. - to generate OS-level file names for exported files. **New API** This patch introduces a new data type `security.SQLUsername`. It is incomparable and non-convertible with the Go `string`. Engineers must now declare intent about how to do the conversion: - `security.MakeSQLUsernameFromUserInput` converts an "outside" string to a username. - `security.MakeSQLUsernameFromPreNormalizedString` promises that its argument has already been previously normalized. - `security.MakeSQLUsernameFromPreNormalizedStringChecked` also checks that its argument has already been previously normalized. To output usernames, the following APIs are also available. - `username.Normalized()` produces the username itself, without decorations. These corresponds to the raw string (after normalization). - `username.SQLIdentifier()` produces the username in valid SQL syntax, so that it can be injected safely in a SQL statement. - `(*tree.FmtCtx).FormatUsername()` takes a username and properly handles quoting and anonymization, like `FormatName()` does for `tree.Name` already. Likewise, conversion from/to protobuf is now regulated, via the new APIs `username.EncodeProto()` and `usernameproto.Decode()`. **Problems being solved** - the usernames "from outside" were normalized sometimes, *but not consistently*: 1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was not changed. 2. they were not consistently converted in `cockroach cert`. This was corrected. 3. they were not in the `cockroach userfile` commands. This has been adjusted with a reference to issue #55389. 4. they are *not* in GRANT/REVOKE. This patch does not change this behavior, but highlights it by spelling out `MakeSQLUsernameFromPreNormalizedString()` in the implementation. 5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO 6. they were in the argument to `LoginRequest`. This was not changed. 7. they were not in the argument of the other API requests that allow usernames, for example `ListSessions` or `CancelQuery`. This was not changed, but is now documented in the API. - the usernames "inside" were incorrectly directly injected in SQL statements, even though they may contain special characters that create SQL syntax errors. This has been corrected by suitable uses of the new `SQLIdentifier()` method. - There was an outright bug in a call to `CreateGCJobRec` (something about GCing jobs), where a `Description` field was passed in lieu of a username for a `User` field. The implications of this are unclear. **Status after this change** The new API makes it possible to audit exactly where "sensitive" username/string conversion occurs. After this patch, we find the following uses: - `MakeSQLUsernameFromUserInput`: - pgwire user auth - CLI URL parsing - `cockroach userfile` - `cockroach cert` - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing) - CREATE/DROP/ALTER ROLE - when using literal strings as `role_spec` in the SQL grammar - `MakeSQLUsernameFromPreNormalizedString`: - role membership checks inside SQL based on data read from `system` tables. - in GRANT/REVOKE (this is surprising, see above) - `MakeSQLUsernameFromPreNormalizedStringChecked`: - when intepreting the username in API query parameters, for those API documented as using pre-normalized usernames. Release note: None 55986: opt: use paired joins for left semi inverted joins r=sumeerbhola a=sumeerbhola Semi joins using the inverted index used to be converted to an inner inverted join followed by an inner lookup join and a distinct for de-duplication. They are now converted to paired-joins consisting of an inner inverted join followed by a left semi lookup join, which is more efficient. Release note: None 55997: sql: fix object lookup for ALTER TABLE ... RENAME TO ... r=ajwerner a=jayshrivastava Previously, ALTER TABLE schema.table RENAME TO schema.existing_table would fail because the statement would look up the destination table in the public schema only. This commit fixes this behavior by using the correct destination schema. Fixes #55985 Release note: None Co-authored-by: Raphael 'kena' Poss <[email protected]> Co-authored-by: sumeerbhola <[email protected]> Co-authored-by: Jayant Shrivastava <[email protected]>
Previously, userfile would break if a user had a username with special characters, which is otherwise supported by CRDB. This is because, unless specified, userfile uses the username to generate the underlying storage table names. This change introduces a new name generation scheme which accounts for all usernames supported by the database. The details are explained in: cockroachdb#55389 (comment) Fixes: cockroachdb#55389 Release note: None
Previously, userfile would break if a user had a username with special characters, which is otherwise supported by CRDB. This is because, unless specified, userfile uses the username to generate the underlying storage table names. This change introduces a new name generation scheme which accounts for all usernames supported by the database. The details are explained in: cockroachdb#55389 (comment) Fixes: cockroachdb#55389 Release note: None
56046: cli: fix username semantics for userfile r=knz a=adityamaru Previously, userfile would break if a user had a username with special characters, which is otherwise supported by CRDB. This is because, unless specified, userfile uses the username to generate the underlying storage table names. This change introduces a new name generation scheme which accounts for all usernames supported by the database. The details are explained in: #55389 (comment) Fixes: #55389 Release note: None Co-authored-by: Aditya Maru <[email protected]>
This code:
(
external_storage.go
)breaks when
user
starts with digits, or when it contains periods. Username consisting of only digits, or containing periods, have been allowed since crdb v19.2.The code should instead quote the username properly.
The text was updated successfully, but these errors were encountered: