Schema generation performance and documentation #287

ccleve · 2021-11-10T17:46:22Z

Prior to version 0.2, schema generation for functions, types, etc. was quite fast, and you could examine the generated .sql files to see what pgx was doing under the hood.

The new schema generation code is very slow. Now running a basic test takes 30+ seconds.

Plus, it's not obvious where to look to see the generated .sql. (It looks like .pgx/13.4/pgx-install/share/postgresql/extension/my_extension.sql). Having it in the project itself would be a lot more convenient.

I wonder if we could kill two birds with one stone? Cache the .sql files in the project itself and avoid the generation phase for files that have not changed.

Update: I see that "cargo pgx schema" creates the file in /sql. Maybe that should happen by default.

eeeebbbbrrrr · 2021-11-12T16:34:13Z

If you know that your code changes don't require a new schema, you can do, for example cargo pgx test --no-schema pg12. That's going to be the biggest boost to performance you can do. That flag is available for test, run, and install.

Caching .sql files to avoid generation isn't really a thing with how the new schema generator works. Because the schema generator now uses fully-resolved rust types, the generation is compiled into the .so itself (and then statically linked to the binary sql-generator). It's no longer an external process that simply analyzes the source code.

Update: I see that "cargo pgx schema" creates the file in /sql. Maybe that should happen by default.

Or you can run cargo pgx schema -o /path/to/where/you/want/it.sql. Turns out there's automated build environments out there that don't let you write to your checkout directory, so we had to make some adjustments in this regard.

ccleve · 2021-11-12T19:06:57Z

--no-schema does not speed things up. Note the "28.91s" below.

It hangs on the line Building [=======================> ] 212/213: sql-generator(bin)

ccleve@ccleves-Mini pgxkestrel % cargo pgx test --no-schema pg14
"cargo" "test" "--features" " pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished test [unoptimized + debuginfo] target(s) in 3.60s
     Running unittests (target/debug/deps/pgxkestrel-23218445cb297060)

running 2 tests
building extension with features `pg14 pg_test`
"cargo" "build" "--features" "pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished dev [unoptimized + debuginfo] target(s) in 28.91s

eeeebbbbrrrr · 2021-11-12T19:15:46Z

hmm.

cargo pgx test is a two-step build process. One for the general cargo test framework side of things and one for the extension itself. I wonder if when I added --no-schema if I failed to push that through to the second build process.

Let me investigate.

eeeebbbbrrrr · 2021-11-12T19:25:17Z

I think maybe what's happening here is that the generic cargo build command we spawn is just blindly compiling the sql-generator .exe too. I see the same thing with cargo pgx run --no-schema as you're seeing with cargo pgx test.

I think maybe with --no-schema we also need to do cargo build --lib. Gonna try that now...

eeeebbbbrrrr · 2021-11-12T19:30:43Z

Yeah, that was it! I did properly push down the --no-schema flag for cargo pgx test, but I didn't realize that cargo would try to build/test the sql-generator .exe too.

$ touch src/lib.rs; cargo pgx test --no-schema pg12
"cargo" "test" "--features" " pg12 pg_test" "--no-default-features" "--lib"
   Compiling foo v0.0.0 (/Users/e_ridge/foo/foo)
    Finished test [unoptimized + debuginfo] target(s) in 2.65s
     Running unittests (target/debug/deps/foo-1904fb65e39f27b5)

running 1 test
building extension with features `pg12 pg_test`
"cargo" "build" "--lib" "--features" "pg12 pg_test" "--no-default-features"
   Compiling foo v0.0.0 (/Users/e_ridge/foo/foo)
    Finished dev [unoptimized + debuginfo] target(s) in 2.15s

installing extension
     Copying control file to `/Users/e_ridge/.pgx/12.8/pgx-install/share/postgresql/extension/foo.control`
     Copying shared library to `/Users/e_ridge/.pgx/12.8/pgx-install/lib/postgresql/foo.so`
    Skipping schema generation
    Finished installing foo
test tests::pg_test_hello_foo ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 3.20s

Stopping Postgres

I'll publish a 0.2.4 here in a few minutes and you can tell me if it's better.

eeeebbbbrrrr · 2021-11-12T19:38:01Z

I think we should just always use --lib, regardless of if --no-schema was specified? That seems to work and I think is more sane anyways. cargo-pgx shouldn't be trying to build other stuff that might be in the crate that it doesn't understand.

Looks like you can get yourself into trouble with this sequence tho:

$ cargo pgx run pg12
$ cargo pgx test --no-schema pg12

The latter will re-use the schema generated by the former, which means it won't have the test-specific schema. But I feel like that's on the user at that point.

When building the extension shared library for run/test/install modes, build it with `--lib` so that we're not needlessly trying to compile the sql-generator binary too.

eeeebbbbrrrr · 2021-11-12T19:53:19Z

Soon as CI passes I'll publish a new release. In an hour or so.

ccleve · 2021-11-12T20:02:35Z

I'll look at it right away.

eeeebbbbrrrr · 2021-11-12T20:17:12Z

v0.2.4 is live on crates.io. Remember to cargo install cargo-pgx first.

ccleve · 2021-11-12T21:02:56Z

Did cargo install cargo-pgx, then cargo upgrade (via the cargo-edit crate) which updated pgx, pgx-macros, pgx-tests = "0.2.4" Ran cargo pgx test pg14 once, which compiled a bunch of stuff, and then ran it a second time:

cargo pgx test pg14
"cargo" "test" "--lib" "--features" " pg14 pg_test" "--no-default-features"
    Finished test [unoptimized + debuginfo] target(s) in 0.28s
     Running unittests (target/debug/deps/pgxkestrel-10c6f5e593332159)

running 2 tests
building extension with features `pg14 pg_test`
"cargo" "build" "--lib" "--features" "pg14 pg_test" "--no-default-features"
    Finished dev [unoptimized + debuginfo] target(s) in 0.11s

Total success.

Then, I made a tiny source code change and ran it again:

cargo pgx test pg14
"cargo" "test" "--lib" "--features" " pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished test [unoptimized + debuginfo] target(s) in 6.31s
     Running unittests (target/debug/deps/pgxkestrel-10c6f5e593332159)

running 2 tests
building extension with features `pg14 pg_test`
"cargo" "build" "--lib" "--features" "pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished dev [unoptimized + debuginfo] target(s) in 4.85s

That took 11 seconds, which is way better than 30 seconds+.

Then, I added another [pg_extern] function, which forced a schema change:

cargo pgx test pg14
"cargo" "test" "--lib" "--features" " pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished test [unoptimized + debuginfo] target(s) in 5.73s
     Running unittests (target/debug/deps/pgxkestrel-10c6f5e593332159)

running 2 tests
building extension with features `pg14 pg_test`
"cargo" "build" "--lib" "--features" "pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished dev [unoptimized + debuginfo] target(s) in 3.33s

installing extension
     Copying control file to `/Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel.control`
     Copying shared library to `/Users/ccleve/.pgx/14.0/pgx-install/lib/postgresql/pgxkestrel.so`
    Building SQL generator with features `pg14 pg_test`
"cargo" "build" "--bin" "sql-generator" "--features" "pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished dev [unoptimized + debuginfo] target(s) in 22.41s
 Discovering SQL entities
  Discovered 13 SQL entities: 2 schemas (1 unique), 10 functions, 1 types, 0 enums, 0 sqls, 0 ords, 0 hashes
running SQL generator with features `pg14 pg_test`
"cargo" "run" "--bin" "sql-generator" "--features" "pg14 pg_test" "--no-default-features" "--" "--sql" "/Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel--1.0.sql"
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running `target/debug/sql-generator --sql /Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel--1.0.sql`
     Copying extension schema file to `/Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel--1.0.sql`
     Copying extension schema upgrade file to `/Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel--0.0.0.sql`
    Finished installing pgxkestrel
test pipeline::char_normalizer::tests::pg_a_small_test ... ok
test tests::pg_test_hello_pgxkestrel ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 28.16s

Stopping Postgres

Now we're back up to 30 seconds.

All in all, 0.2.x is now usable, but still way slower than 0.1.x if you make a schema change.

Thanks, I'm back in business.

eeeebbbbrrrr · 2021-11-12T21:36:27Z

All in all, 0.2.x is now usable, but still way slower than 0.1.x if you make a schema change.

Yeah. But it's at least way more correct and flexible than 0.1.x.

I suspect we'll find ways to make it faster or at least smarter about when a schema generation is necessary, but none of that is clear at the moment. Now that Rust's incremental compiler is actually live and working correctly, it makes what we're doing with the schema generator pretty tough. There might be some tricks we can play with codegen-units = 1 to help us know what we've done between cargo pgx runs, but it's gonna require a lot of thought and engineering.

ccleve · 2021-12-30T04:28:22Z

Maybe I should reopen this, so we have an open request. 30 seconds for every edit/compile/test cycle makes it hard to get work done.

Santa did not give me a $5,000 Threadripper this year. I'm stuck with an ordinary machine.

ccleve · 2022-01-21T19:30:27Z

Replaced by #391

eeeebbbbrrrr added a commit that referenced this issue Nov 12, 2021

fix issue #287

b8eed50

When building the extension shared library for run/test/install modes, build it with `--lib` so that we're not needlessly trying to compile the sql-generator binary too.

ccleve closed this as completed Nov 12, 2021

ccleve reopened this Dec 30, 2021

ccleve mentioned this issue Jan 21, 2022

Speed up sql-generator #391

Closed

ccleve closed this as completed Jan 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Schema generation performance and documentation #287

Schema generation performance and documentation #287

ccleve commented Nov 10, 2021 •

edited

Loading

eeeebbbbrrrr commented Nov 12, 2021

ccleve commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021 •

edited

Loading

eeeebbbbrrrr commented Nov 12, 2021

ccleve commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

ccleve commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

ccleve commented Dec 30, 2021

ccleve commented Jan 21, 2022

Schema generation performance and documentation #287

Schema generation performance and documentation #287

Comments

ccleve commented Nov 10, 2021 • edited Loading

eeeebbbbrrrr commented Nov 12, 2021

ccleve commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021 • edited Loading

eeeebbbbrrrr commented Nov 12, 2021

ccleve commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

ccleve commented Nov 12, 2021

eeeebbbbrrrr commented Nov 12, 2021

ccleve commented Dec 30, 2021

ccleve commented Jan 21, 2022

ccleve commented Nov 10, 2021 •

edited

Loading

eeeebbbbrrrr commented Nov 12, 2021 •

edited

Loading