perf: sqlparser faster formatting #7710

Merged
merged 9 commits into from
Mar 22, 2021
Conversation

vmg
Collaborator

@vmg vmg commented Mar 18, 2021

Description

Happy Friday! (national holiday tomorrow so it's Friday for me right now).

This week I'm bringing another important optimization to the sqlparser code. We're tackling the performance of TrackedBuffer, the data structure that lets us format SQL ASTs into their textual representation.

The existing implementation for formatting AST nodes implements a Format(buf *TrackedBuffer) method on every AST node. Inside these methods, the node serializes itself into the TrackedBuffer by using a very helpful TrackedBuffer.astPrintf method. This lets us developers implement the formatting of all nodes in a very convenient way, because we can use printf-like syntax to generate the SQL output, but it has terrible performance implications.

  1. All calls to astPrintf allocate; a printf interface like func (buf *TrackedBuffer) astPrintf(currentNode SQLNode, format string, values ...interface{}) must necessarily use variable arguments. varargs are not cheap in Go, because they must be passed as interface{}, and we already know (or we learnt last week) that moving objects into interface{} allocates in most cases.
  2. All calls to astPrintf must parse the format string, which is not free. This is wasteful because, for a given node, its format string is always the same and doesn't change between calls.
  3. All calls to astPrintf lose type information: when we call astPrintf from one of our SQLNode structs, we obviously know the type of our node, and we know if this kind of node would need special semantic handling (e.g. whether it requires being wrapped with parens) when serializing it. When we pass it through an interface{}, this information is lost, and we need to typecast the interface to figure out this semantic information -- we're doing that for every single astPrintf call.
  4. Most importantly: all calls to astPrintf cannot be inlined. AST serialization is a highly recursive operation, and there's a very significant amount of performance to be gained by inlining the recursive calls when serializing the fields of any given node. The Go compiler cannot inline through interface{} callsites.

So, how do we fix all this? These are all problems that could be trivially solved by verbosely and manually removing all calls to astPrintf and just writing the code to write directly into the TrackedBuffer, without any format strings. This results in very fast code, but the Format functions for our SQL nodes become essentially a maintenance nightmare; the printf syntax is very convenient to make this code manageable.

Because of this, I've come up with an alternative solution: a code rewriter that picks up all the formatting code for SQL nodes, finds all astPrintf calls, parses their static printf format strings, and statically replaces them with their decomposed forms. The resulting code is written into a separate method for every SQL node:

Before rewrite:

func (node *Update) Format(buf *TrackedBuffer) {
	buf.astPrintf(node, "update %v%s%v set %v%v%v%v",
		node.Comments, node.Ignore.ToString(), node.TableExprs,
		node.Exprs, node.Where, node.OrderBy, node.Limit)
}

After rewrite:

func (node *Update) formatFast(buf *TrackedBuffer) {
	buf.WriteString("update ")
	node.Comments.formatFast(buf)
	buf.WriteString(node.Ignore.ToString())
	node.TableExprs.formatFast(buf)
	buf.WriteString(" set ")
	node.Exprs.formatFast(buf)
	node.Where.formatFast(buf)
	node.OrderBy.formatFast(buf)
	node.Limit.formatFast(buf)
}
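The decomposition step can be illustrated with a toy, string-level version. This is only a sketch of the idea: the rewriter described here works on the Go syntax tree with type information, whereas the hypothetical splitFormat and emit helpers below just split a format string into literals and single-character verbs and print the statements a generator could produce for them.

```go
package main

import (
	"fmt"
	"strings"
)

// piece is either a literal chunk of SQL or a single printf verb such as 'v'.
type piece struct {
	literal string
	verb    byte // zero when the piece is a literal
}

// splitFormat breaks a printf-style format string into literals and verbs.
// It only understands single-character verbs (an assumption of this sketch).
func splitFormat(format string) []piece {
	var out []piece
	var lit strings.Builder
	for i := 0; i < len(format); i++ {
		if format[i] == '%' && i+1 < len(format) {
			if lit.Len() > 0 {
				out = append(out, piece{literal: lit.String()})
				lit.Reset()
			}
			i++ // consume the verb character
			out = append(out, piece{verb: format[i]})
			continue
		}
		lit.WriteByte(format[i])
	}
	if lit.Len() > 0 {
		out = append(out, piece{literal: lit.String()})
	}
	return out
}

// emit renders, as source text, one statement per piece. args holds the
// source text of each operand, in the order they appear in the call.
func emit(format string, args []string) []string {
	var stmts []string
	next := 0
	for _, p := range splitFormat(format) {
		switch {
		case p.verb == 0:
			stmts = append(stmts, fmt.Sprintf("buf.WriteString(%q)", p.literal))
		case p.verb == 's':
			stmts = append(stmts, fmt.Sprintf("buf.WriteString(%s)", args[next]))
			next++
		default:
			stmts = append(stmts, fmt.Sprintf("%s.formatFast(buf)", args[next]))
			next++
		}
	}
	return stmts
}

func main() {
	args := []string{
		"node.Comments", "node.Ignore.ToString()", "node.TableExprs",
		"node.Exprs", "node.Where", "node.OrderBy", "node.Limit",
	}
	for _, s := range emit("update %v%s%v set %v%v%v%v", args) {
		fmt.Println(s)
	}
}
```

Run on the Update format string, this prints the same nine statements as the formatFast body above, in the same order.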

This is not a naive regexp replacement (I tried implementing that at first; it didn't work out). It's a fully syntax- and type-aware rewrite that handles statically, at compile time, many of the calculations that astPrintf was doing at runtime, particularly when it comes to handling expression precedence and grouping. The rewriter knows whether any of the fields in a node can contain expressions, and whether the expressions need special handling based on syntactic precedence:

func (node *AndExpr) Format(buf *TrackedBuffer) {
	buf.astPrintf(node, "%l and %r", node.Left, node.Right)
}

func (node *AndExpr) formatFast(buf *TrackedBuffer) {
	buf.printExpr(node, node.Left, true) // handled as left-expr
	buf.WriteString(" and ")
	buf.printExpr(node, node.Right, false) // handled as right-expr
}

The resulting formatFast code is used transparently by default when users create a TrackedBuffer without a custom formatter; the existing Format code is used when users use a custom formatter callback (the fast formatter doesn't support custom callbacks because they prevent inlining), so the system is fully backwards compatible.
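The dispatch between the two paths could look roughly like the sketch below. All names (sqlNode, trackedBuffer, nodeFormatter, print) are hypothetical lowercase stand-ins, and the "fast:" and "slow:" prefixes exist only so the chosen path is visible; in the real system both paths are meant to produce identical SQL.

```go
package main

import (
	"fmt"
	"strings"
)

// sqlNode is a stand-in for an AST node that carries both formatting paths.
type sqlNode interface {
	Format(buf *trackedBuffer)     // printf-based path
	formatFast(buf *trackedBuffer) // generated path
}

// nodeFormatter is a user-supplied callback that overrides node formatting.
type nodeFormatter func(buf *trackedBuffer, node sqlNode)

// trackedBuffer is a toy buffer; a nil formatter means "no custom callback".
type trackedBuffer struct {
	strings.Builder
	formatter nodeFormatter
}

// print picks the fast path unless a custom formatter was installed.
func (buf *trackedBuffer) print(node sqlNode) {
	if buf.formatter == nil {
		node.formatFast(buf)
		return
	}
	buf.formatter(buf, node)
}

// literal is a hypothetical node; the prefixes only mark which path ran.
type literal string

func (l literal) Format(buf *trackedBuffer)     { buf.WriteString("slow:" + string(l)) }
func (l literal) formatFast(buf *trackedBuffer) { buf.WriteString("fast:" + string(l)) }

func main() {
	plain := &trackedBuffer{}
	plain.print(literal("x"))
	fmt.Println(plain.String()) // prints "fast:x"

	custom := &trackedBuffer{formatter: func(buf *trackedBuffer, node sqlNode) {
		buf.WriteString("custom(")
		node.Format(buf) // the callback falls back to the printf-based path
		buf.WriteString(")")
	}}
	custom.print(literal("x"))
	fmt.Println(custom.String()) // prints "custom(slow:x)"
}
```

Existing callers that pass a callback keep their current behavior, and everyone else gets the generated code without changing anything.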

The results are very exciting. Inlining and removing allocations is a very significant optimization in Go:

name                                old time/op    new time/op    delta
StringTraces/django_queries.txt-16    1.09ms ± 2%    0.43ms ± 2%  -60.72%  (p=0.008 n=5+5)
StringTraces/lobsters.sql.gz-16       45.4ms ± 1%    16.2ms ± 2%  -64.26%  (p=0.008 n=5+5)

name                                old alloc/op   new alloc/op   delta
StringTraces/django_queries.txt-16     220kB ± 0%     124kB ± 0%  -43.83%  (p=0.008 n=5+5)
StringTraces/lobsters.sql.gz-16       11.1MB ± 0%     6.3MB ± 0%  -43.04%  (p=0.008 n=5+5)

name                                old allocs/op  new allocs/op  delta
StringTraces/django_queries.txt-16     6.82k ± 0%     2.85k ± 0%  -58.29%  (p=0.008 n=5+5)
StringTraces/lobsters.sql.gz-16         310k ± 0%      105k ± 0%  -66.18%  (p=0.008 n=5+5)

More than twice as fast for our dataset of realistic queries. A normal vtgate query formats incoming queries into strings at least once (often several times), so I expect this to have a measurable impact on total query latency.

Related Issue(s)

Checklist

  • [] Should this PR be backported?
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Impacted Areas in Vitess

Components that this PR will affect:

  • Query Serving
  • VReplication
  • Cluster Management
  • Build/CI
  • VTAdmin

@vmg vmg mentioned this pull request Mar 18, 2021
Collaborator

@systay systay left a comment

Wow. This is really, really cool.

The only thing I'm missing is a verify mode that we can use as a commit hook, but we can add that as an issue and someone else can work on that, if that saves you time & focus.

@systay systay merged commit 38d5661 into vitessio:master Mar 22, 2021
@askdba askdba added this to the v10.0 milestone Mar 23, 2021