Dramatically improve postgresql performance #323

Merged: 4 commits into k3s-io:master from faster-pg-queries on Sep 23, 2024

Contributor

@sambhav sambhav commented Sep 20, 2024

Contributors

Co-Authored by @adriangonz who paired with me on this change and also tested/benched all the changes 🎉

This builds on top of the great work done in #151 by @moio (included as a co-author on this PR since it builds on top of the ideas introduced in the said PR)

Benchmarks

For a paginated API Server list query (which uses CountRevision and List with a filter query) we saw a speed-up from 70+ seconds to 2 seconds (35x) for a kine DB with 1M+ resources that match the list query. The speed-up might be larger for larger kine DBs.

Individually we were able to bring the count query down to 1.5s from ~40s and the list query down to 0.5s from ~30s on our test bench.

This is fairly significant as it makes kine usable/scalable at such a high volume of objects.

Without the above changes we would usually see timeouts and context cancelled errors most of the time.

Changes

Datatype Change

In Postgres, text tends to give better performance and use less space. When using varchar,

  • Values get padded to the max size (whereas text values can be sparse).
  • Each update / insert has an extra check to validate the length of the value.
  • The query planner casts varchar entries to text anyway.

The main benefit provided by varchar is making sure that all values are shorter than N characters. However, in the case of kine, that's already validated upstream by kube-apiserver.

See the Postgres docs for more info.

Collate Change

When using the C.UTF-8 locale (which comes as default on many Postgres setups), indices working on text (or varchar) columns can't be used for LIKE operations. The workaround in this case is to create the index with the text_pattern_ops operator class. However, the resulting index can't then be used for < / > queries. It's possible to work around this second issue by creating a second index that uses the default operator class for the text column, but that introduces unnecessary overhead.

Instead, a simpler fix is to change the collation of the name column to use the C locale. This means that the strings saved in this column are compared byte by byte rather than by locale-specific rules, which in turn lets queries use indices for all operations. Since the values in the name column will be a concatenation of DNS-like segments, we shouldn't get any non-ASCII values there.

See the Postgres docs on text indices for more info. This SO answer has a lot of useful context around this issue.

Note: This might also explain the past MySQL vs. PostgreSQL differences that have been observed in kine, as the MySQL schema stores the name as an ASCII column.
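To make the collation argument concrete, here is a minimal Go sketch of why bytewise ordering lets one index serve both prefix matches and range comparisons. It is not code from this PR: the helper names and the `/registry/...` keys are hypothetical, and it assumes the prefix contains no LIKE wildcards. Go compares strings byte by byte, which is the same ordering the C collation uses, so a prefix match is equivalent to a half-open range `[prefix, upper)`.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixUpperBound returns the smallest string that sorts after every string
// starting with prefix under bytewise comparison. It returns "" when no such
// bound exists (empty prefix, or every byte is already 0xFF).
// Hypothetical helper for illustration only.
func prefixUpperBound(prefix string) string {
	b := []byte(prefix)
	for i := len(b) - 1; i >= 0; i-- {
		if b[i] < 0xFF {
			b[i]++
			return string(b[:i+1])
		}
	}
	return ""
}

// matchesByRange reports whether name lies in [prefix, upper). Under bytewise
// ordering this is the same set of names that start with prefix, which is the
// kind of range an ordinary B-tree index can scan.
func matchesByRange(name, prefix string) bool {
	if name < prefix {
		return false
	}
	upper := prefixUpperBound(prefix)
	return upper == "" || name < upper
}

func main() {
	// Illustrative keys; the real key layout is decided by kube-apiserver.
	prefix := "/registry/configmaps/"
	names := []string{
		"/registry/configmaps",
		"/registry/configmaps/default/a",
		"/registry/configmaps/kube-system/b",
		"/registry/configmapsx",
		"/registry/pods/default/c",
	}
	for _, n := range names {
		fmt.Printf("%-40s hasPrefix=%-5v inRange=%v\n",
			n, strings.HasPrefix(n, prefix), matchesByRange(n, prefix))
	}
}
```

With a locale-aware collation the two columns printed above are not guaranteed to agree, which is why the planner cannot use a default index for LIKE in that case.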

name VARCHAR(630) CHARACTER SET ascii,

Query changes

Most of the original List/Count queries are derived from #151 and have been adapted to support the new CountCurrent and CountRevision. There were also some minor bugs in the original PR with respect to the order of filter queries, which have been fixed.

Even without the query changes, the PR achieves a significant speed up with the existing queries as all of them start using the indices for name in the LIKE and > queries.

Co-authored-by: Adrian Gonzalez-Martin <[email protected]>
Co-authored-by: Sambhav Kothari <[email protected]>
Co-authored-by: Silvio Moioli <[email protected]>
Signed-off-by: Sambhav Kothari <[email protected]>
@sambhav sambhav requested a review from a team as a code owner September 20, 2024 22:23
Member

@brandond brandond left a comment


Thanks! One nit on this, and a question: Does this require all servers to be upgraded to a new release of Kine before the schema can be changed, or is this change safe to make without updating all the nodes?

Review comment on pkg/drivers/pgsql/pgsql.go (outdated, resolved)
Signed-off-by: Sambhav Kothari <[email protected]>
@sambhav
Contributor Author

sambhav commented Sep 20, 2024

Does this require all servers to be upgraded to a new release of Kine before the schema can be changed, or is this change safe to make without updating all the nodes?

It is safe to make the change without updating all the nodes. The queries are backward compatible since we didn't create any new columns.

With the new schema, even if we do not change the queries, the existing queries get significantly faster due to the index usage. Note that it may take some time to rebuild the indices after the name schema change, depending on the db size.

@sambhav
Contributor Author

sambhav commented Sep 20, 2024

Looks like cockroachdb does not support per column collations?

@brandond
Copy link
Member

Yeah, was just noticing that too. I'm not sure how to best handle that, as we currently expect cockroachdb to support the same SQL features as postgres.

@sambhav
Contributor Author

sambhav commented Sep 20, 2024

Actually on a closer look it looks like it supports column collation, it just does not support C as a collation?

@brandond
Member

brandond commented Sep 20, 2024

It looks like it's using golang.org/x/text/language and ends up calling something like v, err := language.Parse("C")?

@sambhav
Contributor Author

sambhav commented Sep 20, 2024

I might end up doing something along the lines of

select version();

and changing the column type based on whether that contains cockroach vs postgres. Is that reasonable? For unknown version strings, I will just assume postgres.
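A rough Go sketch of that idea, for illustration only: the function name, the sample version strings, and the plain `TEXT` fallback for CockroachDB are assumptions, not what was merged in pkg/drivers/pgsql/pgsql.go. It takes the string returned by `select version();` and treats anything that does not mention cockroach as Postgres.

```go
package main

import (
	"fmt"
	"strings"
)

// nameColumnType picks the type for the name column from the output of
// `select version();`. Hypothetical helper: the CockroachDB fallback to a
// plain TEXT column is an assumption made for this sketch.
func nameColumnType(version string) string {
	if strings.Contains(strings.ToLower(version), "cockroach") {
		// CockroachDB rejected "C" as a collation in this thread, so skip it.
		return "TEXT"
	}
	// Unknown or empty version strings are assumed to be Postgres.
	return `TEXT COLLATE "C"`
}

func main() {
	// Sample version strings are illustrative, not captured output.
	versions := []string{
		"PostgreSQL 12.16 on x86_64-pc-linux-gnu",
		"CockroachDB CCL v23.1.0 (x86_64-pc-linux-gnu)",
		"",
	}
	for _, v := range versions {
		fmt.Printf("%q -> %s\n", v, nameColumnType(v))
	}
}
```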

@brandond
Member

Yeah, worth a try I guess? I don't know why they decided to parse that as a BCP47 language tag instead of a collation, that seems pretty broken. Not the first weird decision that project has made though.

@sambhav sambhav force-pushed the faster-pg-queries branch 2 times, most recently from ffdad1b to 0228965 on September 20, 2024 23:33
@sambhav
Contributor Author

sambhav commented Sep 20, 2024

Ok, hopefully my little hack works as expected.

@brandond
Member

You might consider opening an issue with cockroachdb, maybe they just want to special-case the C collation? The end result should just be bytewise string comparison.

@sambhav
Contributor Author

sambhav commented Sep 20, 2024

Some comparisons from the PR perf tests:

List Before

[postgres-12.16] [PERF] 0.034 average postgres-12.16 request duration (seconds): list configmaps
[postgres-12.16] [PERF] 0.005 [ 32] █▌
[postgres-12.16] [PERF] 0.025 [ 296] ██████████████▏
[postgres-12.16] [PERF] 0.05 [1045] ██████████████████████████████████████████████████
[postgres-12.16] [PERF] 0.1 [ 212] ██████████▏

List After

[postgres-12.16] [PERF] 0.027 average postgres-12.16 request duration (seconds): list configmaps
[postgres-12.16] [PERF] 0.005 [ 35] █▉
[postgres-12.16] [PERF] 0.025 [560] █████████████████████████████
[postgres-12.16] [PERF] 0.05 [964] ██████████████████████████████████████████████████
[postgres-12.16] [PERF] 0.1 [ 16] ▉

As mentioned in the PR, the gap gets wider and wider as we add more items, but we can still see that almost all of the requests (roughly p99) now complete within 0.05s. Previously even the p90 was in the 0.1s bucket.
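The percentile reading can be checked with a little arithmetic on the posted histograms. The Go sketch below is only a back-of-the-envelope check, and it assumes that each label (0.005, 0.025, 0.05, 0.1) is the upper bound of its bucket in seconds; the type and function names are made up for the example.

```go
package main

import "fmt"

// bucket is one histogram row: an assumed upper bound in seconds and the
// number of requests that landed in it.
type bucket struct {
	upper float64
	count int
}

// fractionWithin returns the share of requests whose bucket upper bound is
// at or below limit.
func fractionWithin(hist []bucket, limit float64) float64 {
	total, within := 0, 0
	for _, b := range hist {
		total += b.count
		if b.upper <= limit {
			within += b.count
		}
	}
	if total == 0 {
		return 0
	}
	return float64(within) / float64(total)
}

func main() {
	// Counts copied from the "List Before" and "List After" output above.
	before := []bucket{{0.005, 32}, {0.025, 296}, {0.05, 1045}, {0.1, 212}}
	after := []bucket{{0.005, 35}, {0.025, 560}, {0.05, 964}, {0.1, 16}}

	fmt.Printf("before: %.1f%% within 0.05s\n", 100*fractionWithin(before, 0.05))
	fmt.Printf("after:  %.1f%% within 0.05s\n", 100*fractionWithin(after, 0.05))
}
```

Under that bucket assumption this works out to about 86.6% of requests within 0.05s before the change (1373 of 1585) and about 99.0% after it (1559 of 1575).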

@sambhav
Contributor Author

sambhav commented Sep 21, 2024

Hmm, not sure why this is still happening, let me check

building kine: ERROR: at or near ",": syntax error: invalid locale C: language: tag is not well-formed (SQLSTATE 42601)

Signed-off-by: Sambhav Kothari <[email protected]>
@sambhav
Contributor Author

sambhav commented Sep 21, 2024

Ok, looks like we are all good 🎉 I will deal with the cockroachdb issue creation later.

Contributor

@moio moio left a comment


Thanks for picking up where I left this and taking it over the finish line 🏁

@brandond brandond merged commit 47d7636 into k3s-io:master Sep 23, 2024
3 checks passed
@sambhav sambhav deleted the faster-pg-queries branch September 23, 2024 18:25
@sambhav
Contributor Author

sambhav commented Sep 23, 2024

Thanks @brandond for the merge. Can you also cut a release please? Should this be 0.13.0 given the schema migrations needed?

@brandond
Member

I was waiting to merge another PR, but yes we can tag a 0.13.
