server: maybe surface unix socket #31113

mberhault · 2018-10-08T22:13:58Z

Requested on gitter.

The unix socket can be a reasonably way to communicate with a CockroachDB node when running on the same machine. We added it a while back for easier testing (easier than dynamic port allocation) but have never advertised or documented it. The flag (--socket) is currently hidden.

We should consider surfacing the flag in the CLI and documenting it. We first need to add some additional testing for it, including exercising in test clusters.

CC: @bdarnell for thoughts.

The text was updated successfully, but these errors were encountered:

bdarnell · 2018-10-09T15:46:45Z

I'm generally in favor of exposing/encouraging unix sockets as an option. I think the biggest remaining work for this flag has more to do with documentation (especially how to get various clients to connect via the unix socket, which tends to involve a lot of magic) than with testing.

Also, we should think about the big picture of how the flag would be used. Unix sockets are great for single-node clusters, in which case we might want to be able to disable the TCP ports completely when they are used. Does it ever really make sense to use unix sockets in a multi-node cluster (where the TCP ports are necessary for inter-node traffic and therefore you're probably not using dynamic ports)?

knz · 2018-10-09T16:08:09Z

it's possible unix sockets are faster to communicate with. so a local client for dump/restore eg would get more throughput using the unix socket than over TCP.

inieves · 2018-10-16T20:17:05Z

Please don't make usage of unix domain sockets and TCP/IP sockets mutually exclusive. They have different use cases that might co-exist.

To add some more detail:

nginx -> {insert your favorite cockroachdb compatible language} -> cockroachdb
all on a single machine/container/vm/whatHaveYou, efficiently over unix domain sockets

Then, regardless of how many instances of the above are running, could be 1 node or multinode, the cockroachdb should be able to take incoming TCP/IP for admin. And also for multi-node, they should be able to talk with each other over TCP/IP at least... possibly even unix domain sockets (for some super freaky use cases)

bdarnell · 2018-10-16T20:23:30Z

What specific use case do you have in mind for using both unix sockets and TCP?

inieves · 2018-10-16T20:24:07Z

heh... i was working on those, sorry, i realize I was to brief... updated above.

inieves · 2018-10-16T20:27:30Z

Worth mentioning, im actually doing this right now, literally coding it up right now, to make use of it... not he super freaky part, but the unix domain sockets mixed with TCP/IP.

Alot of devs use TCP/IP sockets just because that's all they know. But that is completely not the right way to glue processes together if they are guaranteed to exist on the same operating system. That would be like using UPS to send a note to a colleague down the hall.

Also worth mentioning, if latency is at all important, or high number of concurrent connections, than it is absolutely imperative to not push packets around TCP/IP stack if it can at all be avoided.

inieves · 2018-10-18T20:53:07Z

Heads up...

Unlike mysql (whereby the user can specify the full path of a socket file), postgres only allows the user to specify the directory in which the socket file will end up... And then in the case of postgres, the actual name of the socket file is chosen (through some heuristic) by postgres, not the user. This affects how the user will later connect to the db... the user will either supply the full path of the socket file in the case of mysql... or the path of the directory where the socket file is, in the case of postgres.

cockroach is attempting to be line compatible with postgres... that means that the user should not tell cockroach the full path of a socket file, but rather the path of the directory in which the socket file resides. But this is not the case... The user is currently required to give cockroach the full path of a socket file.

This makes connecting to cockroach via driver (such as PHP PDO) over socket nearly impossible...

e.g. I have told cockroach that I would like a socket to be created at: /var/run/cockroach/roach.sock
When I later try to connect to cockroach via PHP PDO, I get barfed on with the following message... notice the socket path that PDO goes looking for...

Failed Connection: SQLSTATE[08006] [7] could not connect to server: Not a directory Is the server running locally and accepting connections on Unix domain socket "/var/run/cockroach/roach.sock/.s.PGSQL.5432"?

By nearly, I mean that if I tell cockroach to create a socket for me with the path: /var/run/cockroach/.s.PGSQL.5432
And then I tell the PDO driver to look for a socket in /var/run/cockroach, then I might get lucky and it might all work... through sneaky naming.

You probably want to consider using the postgres socket naming semantics, not the mysql socket naming semantics.

inieves · 2018-10-18T21:32:20Z

I was able to work-around the socket naming issue.
I ran into a subsequent issue. Because I am running cockroach in secure mode, it is expecting connections to be SSL connections. But socket connections coming from PHP PDO are not SSL...

The socket connections should probably not require SSL, regardless of secure/insecure mode of cockroach.

inieves · 2018-10-19T12:59:30Z

I can add more detail, based on libpq official documentation:

I found this: "sslmode is ignored for Unix domain socket communication." at https://www.postgresql.org/docs/10/static/libpq-connect.html.

I am following up on Postgres slack channel about this... But actually it is somewhat reasonable... unix domain sockets are usually trusted, so SSL encryption would seem to be unnecessary security with a negative performance overhead...

To be more compatible with Postgres, I would think that socket connections to cockroach must not have SSL, even if operating in secure mode.

Fornax96 · 2019-06-26T14:34:46Z

I would like to use this too. The lack of TCP/IP and TLS overhead sounds like it could improve performance a lot.

jasobrown · 2019-09-18T18:14:26Z

Another use for the unix domain socket is when fronting CockroachDB with an on-instance TLS terminator (think envoy or haproxy) that processes custom x509 certs for specialized authN. Instead of attempting to put the custom cert handling into cockroach, we can have the proxy do that via a custom plugin, then pass all traffic through to cockroach over the domain socket on the same instance.

43701: sqlsmith: add support for interleaved tables r=mjibson a=mjibson This commit adds interleaved table support to sqlsmith. When running with the rand-tables configuration, there's a 50% chance of all tables but the first one to get interleaved into a random other table. Release note: None 43709: pgwire: use datadriven-based testing for HBA configs r=knz a=knz (I'm currently working on #31113 and updating this test makes my life easier.) This patch introduces a datadriven test runner for HBA config tests. It also replaces the previous `TestHBA` by more exhaustive datadriven input files, with comments that better explain the narrative of the test. Release note: None 43710: pgwire: improve some HBA error messages r=knz a=knz First commit from #43709. Before: ``` > set cluster setting server.host_based_authentication.configuration = 'host db all 0.0.0.0/32 cert'; ERROR: database must be specified as all ``` ``` > set cluster setting server.host_based_authentication.configuration = 'host all all myhost cert'; ERROR: host addresses not supported ``` ``` > set cluster setting server.host_based_authentication.configuration = 'host all all 0.0.0.0/32 sdfsf'; ERROR: unknown auth method "sdfsdf" ``` After: ``` > set cluster setting server.host_based_authentication.configuration = 'host db all 0.0.0.0/32 cert'; ERROR: unimplemented: per-database HBA rules are not supported SQLSTATE: 0A000 HINT: You have attempted to use a feature that is not yet implemented. -- Use the special value 'all' (without quotes) to match all databases. ``` ``` > set cluster setting server.host_based_authentication.configuration = 'host all all myhost cert'; ERROR: unimplemented: hostname-based HBA rules are not supported SQLSTATE: 0A000 HINT: You have attempted to use a feature that is not yet implemented. -- List the numeric CIDR notation instead, for example: 127.0.0.1/8. ``` ``` > set cluster setting server.host_based_authentication.configuration = 'host all all 0.0.0.0/32 sdfsdf' ERROR: unimplemented: unknown auth method "sdfsdf" SQLSTATE: 0A000 HINT: You have attempted to use a feature that is not yet implemented. -- Supported methods: cert, cert-password, password ``` Release note (sql change): CockroachDB will now provide more descriptive error messages and a error hint when an unsupported rule is provided via `server.host_based_authentication.configuration`. 43711: pgwire: split the authentication code in its own files r=knz a=knz First two commits from #43709 and #43710. This patch splits the pgwire authentication in its own files and adds missing explanatory comments. No functional changes. 43713: pgwire/hba: fix a bug in the parsing logic r=knz a=knz Release note (bug fix): There was a bug in the parsing logic for server.host_based_authentication.configuration, where both single-character strings, and quoted strings containing spaces and separated by commas were not properly parsed. This would cause e.g. rules for usernames consisting of a single characters, or usernames containing spaces, to apply improperly. Co-authored-by: Jordan Lewis <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]>

43726: pgwire: clarify the special cases as pre-defined HBA rules r=knz a=knz First commit from #43779. This PR is needed ahead of fixing #31113, which will require supporting `local` HBA rules. Prior to this patch, the authentication code was relatively complex to understand: - the code path to use the HBA configuration would only be activated if the cluster setting was set. There was a special case if it wasn't. The special case was in the per-conn auth logic. - the special case for the root login escape was also present in the per-conn auth logic. - every time the cluster setting would be updated, the result of parsing the setting was cached as-is, and the textual strings re-interpreted over and over upon every connection. This complexity can be interpreted as a defect, for the main reason that **the per-conn auth code should be as simple as possible** to facilitate security audits and future maintainance. Additionally, an argument could be made that the per-conn re-interpretation of the result was a performance mishap. This patch aims to simplify the per-conn auth logic by folding all the special casing as pre-defined rules in the HBA configuration: - the default logic (when the cluster setting is empty or invalid) becomes a predefined HBA configuration with just two rules: host all root all cert host all all all cert-password - each time the config is loaded from a cluster setting, the root escape is implemented by force-inserting `host all root all cert` at the start of the configuration. With this in place, the auth logic can be simplified to always and exclusively use the HBA rules. This special casing can also be inspected in the output of `/debug/hba_conf`. Additionally, this patch optimizes the code by pre-normalizing upfront when the setting is updated. Normalizing includes: - unicode-normalizing and case-folding usernames, since the username upon new connection is also normalized and case-folded. - expanding lists of multiple usernames into multiple rules, so that the checking code can be simplified to only check one username per rule. Release note (security): The authentication code for new SQL connections has been simplified to always use the HBA configuration defined per `server.host_based_authentication.configuration`. The format of this file generally follows that of `pg_hba.conf` as defined here: https://www.postgresql.org/docs/current/auth-pg-hba-conf.html. Upon each configuration change, CockroachDB auto-magically inserts the entry `host all root all cert` as a first rule, to ensure the root user can always log in with a valid client certificate. If the configuration is set to empty, or found to be invalid in the cluster setting, the following default configuration is automatically used: host all root all cert host all all all cert-password At any moment the current configuration on each node can be inspected using the `/debug/hba_conf` URL on the HTTP endpoint. The list of valid authentication methods is currently: - `cert`, for certificate-based authentication over a SSL connection exclusively; - `cert-password`, which allows either cert-based or password-based authentication over a SSL connection; - `password` for password-based authentication over a SSL connection; - `gss` for Kerberos-based authentication over a SSL connection, enabled when running a CCL binary and an Enterprise license. In effect CockroachDB treats all the `host` rules as `hostssl`, and behaves as per a default of `hostnossl all all all reject`. It is not currently possible to define authentication rules over non-SSL connections: as of this writing, non-SSL connections are only possible when running with `--insecure`, and on insecure nodes all the authentication logic is entirely disabled. This behavior remains equivalent to previous CockroachDB versions, and this change is only discussed here for clarity. Co-authored-by: Raphael 'kena' Poss <[email protected]>

43843: pgwire/hba: parse connection type as bit field, not string r=knz a=knz Ahead of #31113. This makes the in-memory data more compact and introduces proper conn type matching code. Release note: None Co-authored-by: Raphael 'kena' Poss <[email protected]>

43837: pgwire: refactor+simplify the connection set-up code r=knz a=knz Ahead of handling #31113. The early SQL connection start-up sequence was a bit too deep in nested conditionals. The conditionals historically grew as special cases on top of special cases; however taking a step back it turns out the overall flow of the start-up is quite simple and the code should reflect that. (This makes it both more readable and more easy to troubleshoot and extend.) No functional change. Co-authored-by: Raphael 'kena' Poss <[email protected]>

inieves · 2020-01-09T22:10:40Z

bravo!!

tbg removed the S-3 Medium-low impact: incurs increased costs for some users (incl lower avail, recoverable bad data) label Oct 9, 2018

knz added the A-cli label Sep 24, 2019

knz mentioned this issue Jan 3, 2020

pgwire: use datadriven-based testing for HBA configs #43709

Merged

knz mentioned this issue Jan 5, 2020

pgwire: clarify the special cases as pre-defined HBA rules #43726

Merged

This was referenced Jan 9, 2020

pgwire: refactor+simplify the connection set-up code #43837

Merged

pgwire/hba: parse connection type as bit field, not string #43843

Merged

knz mentioned this issue Jan 9, 2020

pgwire: properly support unix socket clients with authentication #43848

Merged

knz removed the E-easy Easy issue to tackle, requires little or no CockroachDB experience label Jan 9, 2020

knz self-assigned this Jan 9, 2020

craig bot closed this as completed in b1b3674 Jan 9, 2020

rafiss mentioned this issue Apr 23, 2020

[Tracking] Re-triage roachtest failures for closed issues cockroachdb/appdev-issues#20

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

server: maybe surface unix socket #31113

server: maybe surface unix socket #31113

mberhault commented Oct 8, 2018

bdarnell commented Oct 9, 2018

knz commented Oct 9, 2018

inieves commented Oct 16, 2018 •

edited

Loading

bdarnell commented Oct 16, 2018

inieves commented Oct 16, 2018

inieves commented Oct 16, 2018 •

edited

Loading

inieves commented Oct 18, 2018

inieves commented Oct 18, 2018

inieves commented Oct 19, 2018 •

edited

Loading

Fornax96 commented Jun 26, 2019

jasobrown commented Sep 18, 2019

inieves commented Jan 9, 2020

server: maybe surface unix socket #31113

server: maybe surface unix socket #31113

Comments

mberhault commented Oct 8, 2018

bdarnell commented Oct 9, 2018

knz commented Oct 9, 2018

inieves commented Oct 16, 2018 • edited Loading

bdarnell commented Oct 16, 2018

inieves commented Oct 16, 2018

inieves commented Oct 16, 2018 • edited Loading

inieves commented Oct 18, 2018

inieves commented Oct 18, 2018

inieves commented Oct 19, 2018 • edited Loading

Fornax96 commented Jun 26, 2019

jasobrown commented Sep 18, 2019

inieves commented Jan 9, 2020

inieves commented Oct 16, 2018 •

edited

Loading

inieves commented Oct 16, 2018 •

edited

Loading

inieves commented Oct 19, 2018 •

edited

Loading