pgx-collect

pgx-collect is a nearly drop-in replacement for the pgx collection functions that offer lower overhead.

pgx offers a very convenient set of functions for parsing values from rows, while handling all of the boilerplate. However, these functions sometimes repeat significant amounts of work when processing each row, and cause unnecessary allocations that scale with O(rows) or even O(rows * columns), which puts more load on the garbage collector. By rewriting these functions to minimize repeated work and allocations, this library provides the same convenient functionality while improving performance.

Usage

For any Collect and RowTo function defined in pgx, pgx-collect has an equivalent with the same name that you can swap in for the same behavior.

For example:

import (
    "github.com/jackc/pgx/v5"
)
...
values, err := pgx.CollectRows(rows, pgx.RowToStructByName[Record])

can be replaced with:

import (
    "github.com/jackc/pgx/v5"
    pgxc "github.com/zolstein/pgx-collect"
)
...
values, err := pgxc.CollectRows(rows, pgxc.RowToStructByName[Record])

If, when migrating from code using the pgx Collect functions, you need to reuse a custom pgx.RowToFunc, you can use the Adapt function as follows. However, be aware that this may have sub-par performance and allocate more than using native pgx-collect constructs.

import (
    "github.com/jackc/pgx/v5"
    pgxc "github.com/zolstein/pgx-collect"
)
...
var customFunc pgx.RowToFunc[Record]
values, err := pgxc.CollectRows(rows, pgxc.Adapt(customFunc))

Who would benefit from this library?

On some level, everyone using the functions CollectRows or AppendRows could benefit. The biggest wins will be seen by people parsing queries with many rows / columns using the reflective struct-mapping RowTo functions. I.e. RowToStructBy....

By comparison, people querying small numbers of rows and parsing rows as primatives or maps, are unlikely to see much benefit (for now). pgx-collect support for these operations is provided mainly for convenience, rather than because it can provide a large speedup.

How much faster is it?

When using AppendRows to query 1000 rows and 4 columns in a benchmark, I've demonstrated a 3-4x reduction in total runtime, a 10x reduction in memory allocated, and a 100x reduction in objects allocated. These numbers will vary greatly by workload. Real-world benefits are unlikely to be this large, but they may still improve query throughput, especially in pathological cases.

I encourage anyone using these pgx's collect functions to try out pgx-collect, measure the difference, and share the results, particularly if there are further opportunities to optimize.

Why not merge into pgx? Why create a separate library?

I would love for this code to be merged into pgx. However, even though the pgx-collect API is designed as a drop-in replacement for all idiomatic use-cases, the performance improvements required changing the types of some values in the public API - thus, if these changes were integrated into pgx, some working code could stop compiling.

If at some time in the future pgx can make backward-incompatible changes, I hope it integrates these changes. I also hope they can integrate as many of the optimizations as possible, though as far as I can tell the API exposed by pgx makes this impossible. In lieu of this, I want this code to be available to anyone who wants to use it or learn from it.

Name	Name	Last commit message	Last commit date
Latest commit zolstein Cache type information between queries Mar 26, 2024 c46e0ea · Mar 26, 2024 History 8 Commits
.github/workflows	.github/workflows	Cache type information between queries	Mar 26, 2024
internal	internal	Cache type information between queries	Mar 26, 2024
.gitignore	.gitignore	Initial commit	Mar 20, 2024
LICENSE	LICENSE	Add pgx license	Mar 21, 2024
README.md	README.md	Update README.md	Mar 22, 2024
bench_test.go	bench_test.go	Cache type information between queries	Mar 26, 2024
go.mod	go.mod	Cache type information between queries	Mar 26, 2024
go.sum	go.sum	Cache type information between queries	Mar 26, 2024
pgx_collect.go	pgx_collect.go	Cache type information between queries	Mar 26, 2024
pgx_collect_test.go	pgx_collect_test.go	Cache type information between queries	Mar 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

pgx-collect

Usage

Who would benefit from this library?

How much faster is it?

Why not merge into pgx? Why create a separate library?

About

Releases

Packages

Languages

License

zolstein/pgx-collect

Folders and files

Latest commit

History

Repository files navigation

pgx-collect

Usage

Who would benefit from this library?

How much faster is it?

Why not merge into pgx? Why create a separate library?

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages