Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: "default" DCollatedString should be DString objects #57255

Open
otan opened this issue Nov 30, 2020 · 3 comments
Open

sql: "default" DCollatedString should be DString objects #57255

otan opened this issue Nov 30, 2020 · 3 comments
Labels
A-sql-datatypes SQL column types usable in table descriptors. A-sql-pgcompat Semantic compatibility with PostgreSQL C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-anchored-telemetry The issue number is anchored by telemetry references.

Comments

@otan
Copy link
Contributor

otan commented Nov 30, 2020

In Postgres, COLLATE strings are just a typmod on the string type.
In CRDB, COLLATE is an entirely new type, with its own family and handling logic.

Unfortunately, this presents a problem when using "default" collation, which should store no typmod for the string object for collate. We however have pidgeonholed all our logic such that x COLLATE y forms a CollatedString type.

This is problematic for DEFAULT collations when a user specifies it, as it tries to evaluate the language tag. Since we introduced DEFAULT column definitions in #56598, this is only a problem when someone does x COLLATE "default" as a value.

Jira issue: CRDB-2845

@blathers-crl
Copy link

blathers-crl bot commented Nov 30, 2020

Hi @otan, please add a C-ategory label to your issue. Check out the label system docs.

While you're here, please consider adding an A- label to help keep our repository tidy.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@otan otan added A-sql-datatypes SQL column types usable in table descriptors. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Nov 30, 2020
otan added a commit to otan-cockroach/cockroach that referenced this issue Nov 30, 2020
Release note (sql change): In 21.1, we introduced DEFAULT collation
types, which behave correctly as column definitions. However, getting
COLLATE to work as a value (e.g. `SELECT a::text COLLATE "default"`) is
trickier to make work as it involves a complex change to our type
system. This now returns an unimplemented error pointing to cockroachdb#57255.
@rafiss rafiss added the X-anchored-telemetry The issue number is anchored by telemetry references. label Dec 1, 2020
@rafiss
Copy link
Collaborator

rafiss commented Dec 1, 2020

Does this cover #50734 or should we keep that open separately?

@otan
Copy link
Contributor Author

otan commented Dec 1, 2020

kind of separate, this one covers a more general code change whereas sql: support COLLATE "default" in SELECT queries covers a specific use case working.

craig bot pushed a commit that referenced this issue Dec 1, 2020
57256: tree: block COLLATE "DEFAULT" from working during TypeCheck r=rafiss a=otan

Release note (sql change): In 21.1, we introduced DEFAULT collation
types, which behave correctly as column definitions. However, getting
COLLATE to work as a value (e.g. `SELECT a::text COLLATE "default"`) is
trickier to make work as it involves a complex change to our type
system. This now returns an unimplemented error pointing to #57255.

57320: Revert "sql: add support for CTAS AS OF SYSTEM TIME" r=jordanlewis a=jordanlewis

This reverts commit 72aa85a.

Co-authored-by: Oliver Tan <[email protected]>
Co-authored-by: Jordan Lewis <[email protected]>
@rafiss rafiss added the A-sql-pgcompat Semantic compatibility with PostgreSQL label Dec 8, 2020
@rafiss rafiss added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label May 12, 2021
craig bot pushed a commit that referenced this issue Feb 9, 2023
96031: sql: add mixed version test for system.role_members user ids upgrade r=rafiss a=andyyang890

This patch adds a mixed version logictest that ensures that GRANT ROLE
continues to work properly in a cluster with both 22.2 and 23.1 nodes
(i.e. nodes that have run the system.role_members user ids upgrade).

Part of #92342

Release note: None

96127: kvserver: introduce cpu rebalancing r=nvanbenschoten a=kvoli

This patch allows the store rebalancer to use CPU in place of QPS when
balancing load on a cluster. This patch adds `cpu` as an option with the
cluster setting:

`kv.allocator.load_based_rebalancing.objective`

When set to `cpu`, rather than `qps`. The store rebalancer will perform
a mostly identical function, however target balancing the sum of all
replica's cpu time on each store, rather than qps. The default remains
as `qps` here.

Similar to QPS, the rebalance threshold can be set to allow controlling
the range above and below the mean store CPU is considered imbalanced,
either overfull or underfull respectively:

`kv.allocator.cpu_rebalance_threshold`: 0.1

In order to manage with mixed versions during upgrade and some
architectures not supporting the cpu sampling method, a rebalance
objective manager is introduced in `rebalance_objective.go`. The manager
mediates access to the rebalance objective and overwrites it in cases
where the objective set in the cluster setting cannot be supported.

The results when using CPU in comparison to QPS can be found [here](https://docs.google.com/document/d/1QLhD20BTamjj3-dSG9F1gW7XMBy9miGPpJpmu2Dn3yo/edit#) (internal).


<details>
<summary>Results Summary</summary>

![image](https://user-images.githubusercontent.com/39606633/215580650-b12ff509-5cf5-4ffa-880d-8387e2ef0afa.png)

![image](https://user-images.githubusercontent.com/39606633/215580626-3d748ba1-e9a4-4abb-8acd-2c319203932e.png)

![image](https://user-images.githubusercontent.com/39606633/215580585-58e6000d-b6cf-430a-b4b7-d14a77eab3bd.png)

</details>


<details>
<summary>Detailed Allocbench Results</summary>

```
kv/r=0/access=skew
master
    median cost(gb):05.81 cpu(%):14.97 write(%):37.83
    stddev cost(gb):01.87 cpu(%):03.98 write(%):07.01
cpu rebalancing
    median cost(gb):08.76 cpu(%):14.42 write(%):36.61
    stddev cost(gb):02.66 cpu(%):01.85 write(%):04.80
kv/r=0/ops=skew
master
    median cost(gb):06.23 cpu(%):26.05 write(%):57.33
    stddev cost(gb):02.92 cpu(%):05.83 write(%):08.20
cpu rebalancing
    median cost(gb):04.28 cpu(%):11.45 write(%):31.28
    stddev cost(gb):02.25 cpu(%):02.51 write(%):06.68
kv/r=50/ops=skew
master
    median cost(gb):04.36 cpu(%):22.84 write(%):48.09
    stddev cost(gb):01.12 cpu(%):02.71 write(%):05.51
cpu rebalancing
    median cost(gb):04.64 cpu(%):13.49 write(%):43.05
    stddev cost(gb):01.07 cpu(%):01.26 write(%):08.58
kv/r=95/access=skew
master
    median cost(gb):00.00 cpu(%):09.51 write(%):01.24
    stddev cost(gb):00.00 cpu(%):01.74 write(%):00.27
cpu rebalancing
    median cost(gb):00.00 cpu(%):05.66 write(%):01.31
    stddev cost(gb):00.00 cpu(%):01.56 write(%):00.26
kv/r=95/ops=skew
master
    median cost(gb):0.00 cpu(%):47.29 write(%):00.93
    stddev cost(gb):0.09 cpu(%):04.30 write(%):00.17
cpu rebalancing
    median cost(gb):0.00 cpu(%):08.16 write(%):01.30
    stddev cost(gb):0.01 cpu(%):04.59 write(%):00.20
```


</details>

resolves: #95380

Release note (ops change) Add option to balance cpu time (cpu)
instead of queries per second (qps) among stores in a cluster. This is
done by setting `kv.allocator.load_based_rebalancing.objective='cpu'`.
`kv.allocator.cpu_rebalance_threshold` is also added, similar to
`kv.allocator.qps_rebalance_threshold` to control the target range for
store cpu above and below the cluster mean.

96440: ui: add execution insights to statement and transaction fingerprint details r=ericharmeling a=ericharmeling

This commit adds execution insights to the Statement Fingerprint and Transaction Fingerprint Details pages.

Part of #83780.

Loom: https://www.loom.com/share/98d2023b672e43fa8016829aa641a829

Note that the SQL queries against the `*_execution_insights` tables are updated to `SELECT DISTINCT ON (*_fingerprint_id, problems, causes)` (equivalent to `GROUP BY (*_fingerprint_id, problems, causes)`) from the latest results in the tables, rather than `row_number() OVER ( PARTITION BY stmt_fingerprint_id, problem, causes ORDER BY end_time DESC ) AS rank... WHERE rank = 1`. Both patterns return the same result, but one uses aggregation and the other uses a window function. I find the `DISTINCT ON/GROUP BY` pattern easier to understand, I'm not seeing much difference in the planning/execution time between the two over the same set of data, and I'm seeing `DISTINCT ON/GROUP BY` coming up as more performant in almost all the secondary sources I've encountered.

Release note (ui change): Added execution insights to the Statement Fingerprint Details and Transaction Fingerprint Details Pages.

96828: collatedstring: support default, C, and POSIX in expressions r=otan a=rafiss

fixes #50734
fixes #95667
informs #57255

---
### collatedstring: create new package 

Move the small amount of code from tree/collatedstring.go

---

### collatedstring: support C and POSIX in expressions

Release note (sql change): Expressions of the form `COLLATE "default"`,
`COLLATE "C"`, and `COLLATE "POSIX"` are now supported. Since the
default collation cannot be changed currently, these expressions are all
equivalent. The expressions are evaluated by treating the input as a
normal string, and ignoring the collation.

This means that comparisons between strings and collated strings that
use "default", "C", or "POSIX" are now supported.

Creating a column with the "C" or "POSIX" collations is still not
supported.

96870: kvserver: use replicasByKey addition func in snapshot path r=tbg a=pavelkalinnikov

This commit makes one step towards better code sharing between `Replica` initialization paths: split trigger and snapshot application. It makes both to use the same method to check and insert the initialized `Replica` to `replicasByKey` map.

Touches #94912

96874: roachtest: run scheduled backup only on clusters with enterprise license r=stevendanna a=msbutler

Epic: none

Release note: None

96883: go.mod: bump Pebble to 829675f94811 r=RaduBerinde a=RaduBerinde

829675f9 db: fix ObsoleteSize stat
2f086b74 db: refactor compaction splitting to reduce key comparisons

Release note: None
Epic: none

Co-authored-by: Andy Yang <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Eric Harmeling <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Pavel Kalinnikov <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-datatypes SQL column types usable in table descriptors. A-sql-pgcompat Semantic compatibility with PostgreSQL C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-anchored-telemetry The issue number is anchored by telemetry references.
Projects
None yet
Development

No branches or pull requests

2 participants