Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema generation performance and documentation #287

Closed
ccleve opened this issue Nov 10, 2021 · 13 comments
Closed

Schema generation performance and documentation #287

ccleve opened this issue Nov 10, 2021 · 13 comments

Comments

@ccleve
Copy link
Contributor

ccleve commented Nov 10, 2021

Prior to version 0.2, schema generation for functions, types, etc. was quite fast, and you could examine the generated .sql files to see what pgx was doing under the hood.

The new schema generation code is very slow. Now running a basic test takes 30+ seconds.

Plus, it's not obvious where to look to see the generated .sql. (It looks like .pgx/13.4/pgx-install/share/postgresql/extension/my_extension.sql). Having it in the project itself would be a lot more convenient.

I wonder if we could kill two birds with one stone? Cache the .sql files in the project itself and avoid the generation phase for files that have not changed.

Update: I see that "cargo pgx schema" creates the file in /sql. Maybe that should happen by default.

@eeeebbbbrrrr
Copy link
Contributor

If you know that your code changes don't require a new schema, you can do, for example cargo pgx test --no-schema pg12. That's going to be the biggest boost to performance you can do. That flag is available for test, run, and install.

Caching .sql files to avoid generation isn't really a thing with how the new schema generator works. Because the schema generator now uses fully-resolved rust types, the generation is compiled into the .so itself (and then statically linked to the binary sql-generator). It's no longer an external process that simply analyzes the source code.

Update: I see that "cargo pgx schema" creates the file in /sql. Maybe that should happen by default.

Or you can run cargo pgx schema -o /path/to/where/you/want/it.sql. Turns out there's automated build environments out there that don't let you write to your checkout directory, so we had to make some adjustments in this regard.

@ccleve
Copy link
Contributor Author

ccleve commented Nov 12, 2021

--no-schema does not speed things up. Note the "28.91s" below.

It hangs on the line Building [=======================> ] 212/213: sql-generator(bin)

ccleve@ccleves-Mini pgxkestrel % cargo pgx test --no-schema pg14
"cargo" "test" "--features" " pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished test [unoptimized + debuginfo] target(s) in 3.60s
     Running unittests (target/debug/deps/pgxkestrel-23218445cb297060)

running 2 tests
building extension with features `pg14 pg_test`
"cargo" "build" "--features" "pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished dev [unoptimized + debuginfo] target(s) in 28.91s

@eeeebbbbrrrr
Copy link
Contributor

hmm.

cargo pgx test is a two-step build process. One for the general cargo test framework side of things and one for the extension itself. I wonder if when I added --no-schema if I failed to push that through to the second build process.

Let me investigate.

@eeeebbbbrrrr
Copy link
Contributor

I think maybe what's happening here is that the generic cargo build command we spawn is just blindly compiling the sql-generator .exe too. I see the same thing with cargo pgx run --no-schema as you're seeing with cargo pgx test.

I think maybe with --no-schema we also need to do cargo build --lib. Gonna try that now...

@eeeebbbbrrrr
Copy link
Contributor

Yeah, that was it! I did properly push down the --no-schema flag for cargo pgx test, but I didn't realize that cargo would try to build/test the sql-generator .exe too.

$ touch src/lib.rs; cargo pgx test --no-schema pg12
"cargo" "test" "--features" " pg12 pg_test" "--no-default-features" "--lib"
   Compiling foo v0.0.0 (/Users/e_ridge/foo/foo)
    Finished test [unoptimized + debuginfo] target(s) in 2.65s
     Running unittests (target/debug/deps/foo-1904fb65e39f27b5)

running 1 test
building extension with features `pg12 pg_test`
"cargo" "build" "--lib" "--features" "pg12 pg_test" "--no-default-features"
   Compiling foo v0.0.0 (/Users/e_ridge/foo/foo)
    Finished dev [unoptimized + debuginfo] target(s) in 2.15s

installing extension
     Copying control file to `/Users/e_ridge/.pgx/12.8/pgx-install/share/postgresql/extension/foo.control`
     Copying shared library to `/Users/e_ridge/.pgx/12.8/pgx-install/lib/postgresql/foo.so`
    Skipping schema generation
    Finished installing foo
test tests::pg_test_hello_foo ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 3.20s

Stopping Postgres

I'll publish a 0.2.4 here in a few minutes and you can tell me if it's better.

@eeeebbbbrrrr
Copy link
Contributor

eeeebbbbrrrr commented Nov 12, 2021

I think we should just always use --lib, regardless of if --no-schema was specified? That seems to work and I think is more sane anyways. cargo-pgx shouldn't be trying to build other stuff that might be in the crate that it doesn't understand.

Looks like you can get yourself into trouble with this sequence tho:

$ cargo pgx run pg12
$ cargo pgx test --no-schema pg12

The latter will re-use the schema generated by the former, which means it won't have the test-specific schema. But I feel like that's on the user at that point.

eeeebbbbrrrr added a commit that referenced this issue Nov 12, 2021
When building the extension shared library for run/test/install modes, build it with `--lib` so that we're not needlessly trying to compile the sql-generator binary too.
@eeeebbbbrrrr
Copy link
Contributor

Soon as CI passes I'll publish a new release. In an hour or so.

@ccleve
Copy link
Contributor Author

ccleve commented Nov 12, 2021

I'll look at it right away.

@eeeebbbbrrrr
Copy link
Contributor

v0.2.4 is live on crates.io. Remember to cargo install cargo-pgx first.

@ccleve
Copy link
Contributor Author

ccleve commented Nov 12, 2021

Did cargo install cargo-pgx, then cargo upgrade (via the cargo-edit crate) which updated pgx, pgx-macros, pgx-tests = "0.2.4" Ran cargo pgx test pg14 once, which compiled a bunch of stuff, and then ran it a second time:

cargo pgx test pg14
"cargo" "test" "--lib" "--features" " pg14 pg_test" "--no-default-features"
    Finished test [unoptimized + debuginfo] target(s) in 0.28s
     Running unittests (target/debug/deps/pgxkestrel-10c6f5e593332159)

running 2 tests
building extension with features `pg14 pg_test`
"cargo" "build" "--lib" "--features" "pg14 pg_test" "--no-default-features"
    Finished dev [unoptimized + debuginfo] target(s) in 0.11s

Total success.

Then, I made a tiny source code change and ran it again:

cargo pgx test pg14
"cargo" "test" "--lib" "--features" " pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished test [unoptimized + debuginfo] target(s) in 6.31s
     Running unittests (target/debug/deps/pgxkestrel-10c6f5e593332159)

running 2 tests
building extension with features `pg14 pg_test`
"cargo" "build" "--lib" "--features" "pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished dev [unoptimized + debuginfo] target(s) in 4.85s

That took 11 seconds, which is way better than 30 seconds+.

Then, I added another [pg_extern] function, which forced a schema change:

cargo pgx test pg14
"cargo" "test" "--lib" "--features" " pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished test [unoptimized + debuginfo] target(s) in 5.73s
     Running unittests (target/debug/deps/pgxkestrel-10c6f5e593332159)

running 2 tests
building extension with features `pg14 pg_test`
"cargo" "build" "--lib" "--features" "pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished dev [unoptimized + debuginfo] target(s) in 3.33s

installing extension
     Copying control file to `/Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel.control`
     Copying shared library to `/Users/ccleve/.pgx/14.0/pgx-install/lib/postgresql/pgxkestrel.so`
    Building SQL generator with features `pg14 pg_test`
"cargo" "build" "--bin" "sql-generator" "--features" "pg14 pg_test" "--no-default-features"
   Compiling pgxkestrel v0.0.0 (/Users/ccleve/Documents/dev/pgrust/pgxkestrel)
    Finished dev [unoptimized + debuginfo] target(s) in 22.41s
 Discovering SQL entities
  Discovered 13 SQL entities: 2 schemas (1 unique), 10 functions, 1 types, 0 enums, 0 sqls, 0 ords, 0 hashes
running SQL generator with features `pg14 pg_test`
"cargo" "run" "--bin" "sql-generator" "--features" "pg14 pg_test" "--no-default-features" "--" "--sql" "/Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel--1.0.sql"
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running `target/debug/sql-generator --sql /Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel--1.0.sql`
     Copying extension schema file to `/Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel--1.0.sql`
     Copying extension schema upgrade file to `/Users/ccleve/.pgx/14.0/pgx-install/share/postgresql/extension/pgxkestrel--0.0.0.sql`
    Finished installing pgxkestrel
test pipeline::char_normalizer::tests::pg_a_small_test ... ok
test tests::pg_test_hello_pgxkestrel ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 28.16s

Stopping Postgres

Now we're back up to 30 seconds.

All in all, 0.2.x is now usable, but still way slower than 0.1.x if you make a schema change.

Thanks, I'm back in business.

@eeeebbbbrrrr
Copy link
Contributor

All in all, 0.2.x is now usable, but still way slower than 0.1.x if you make a schema change.

Yeah. But it's at least way more correct and flexible than 0.1.x.

I suspect we'll find ways to make it faster or at least smarter about when a schema generation is necessary, but none of that is clear at the moment. Now that Rust's incremental compiler is actually live and working correctly, it makes what we're doing with the schema generator pretty tough. There might be some tricks we can play with codegen-units = 1 to help us know what we've done between cargo pgx runs, but it's gonna require a lot of thought and engineering.

@ccleve ccleve closed this as completed Nov 12, 2021
@ccleve
Copy link
Contributor Author

ccleve commented Dec 30, 2021

Maybe I should reopen this, so we have an open request. 30 seconds for every edit/compile/test cycle makes it hard to get work done.

Santa did not give me a $5,000 Threadripper this year. I'm stuck with an ordinary machine.

@ccleve
Copy link
Contributor Author

ccleve commented Jan 21, 2022

Replaced by #391

@ccleve ccleve closed this as completed Jan 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants