Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove dlopen hack #1468

Merged
merged 2 commits into from
Feb 7, 2024
Merged

Conversation

usamoi
Copy link
Contributor

@usamoi usamoi commented Jan 8, 2024

Resolves #1369

Motivation of this PR:

  • no need to compile cargo-pgrx after upgrading compiler
  • make it possible to support binstall

This PR:

  • recompilation occurs even if we use a nightly toolchain.
  • schema is generated by cargo run --bin pgrx_embed_${ext_name} , instead of using libloading crate (unused symbols are removed by linker, so it compiles)
  • removes compiler version check
  • removes codes about stub
  • removes unnecessary metadata traits

This is a breaking change: users need to write pgrx::pgrx_embed!(); to src/pgrx_embed.rs and add that [[bin]] to the Cargo.toml, otherwise schema generation will report an error.

@usamoi usamoi marked this pull request as draft January 8, 2024 16:07
@eeeebbbbrrrr
Copy link
Contributor

This is... this is exciting!

@usamoi usamoi force-pushed the nohack branch 3 times, most recently from 216303e to 9c2b706 Compare January 8, 2024 17:46
@workingjubilee
Copy link
Member

find all entities using ctor, instead of using object crate (macro_magic does not support items that are not accessible)

Curious, why do we want to support non-accessible items?

@workingjubilee
Copy link
Member

workingjubilee commented Jan 8, 2024

Using linker sections again would be annoying, but could be plausible.

However:

Rust's philosophy is that nothing happens before or after main and this library explicitly subverts that. The code that runs in the ctor and dtor functions should be careful to limit itself to libc functions and code that does not rely on Rust's stdlib services.

This sounds like it will eventually prove to be an unreasonable expectation, if it is not one already?

@workingjubilee
Copy link
Member

Does this support cross-compilation?

@workingjubilee
Copy link
Member

workingjubilee commented Jan 9, 2024

Curious, why do we want to support non-accessible items?

To be clearer: when pgrx performs code expansion, even if the programmer doesn't put pub on their function, their function is linked into a publicly-accessible wrapper. The fact that we accept non-pub functions is mostly a relic of implementation details, they are still publicly-accessible in a fairly meaningful sense.

@workingjubilee
Copy link
Member

Does this support cross-compilation?

I suppose schema gen has to happen on the build host anyways, so we're hoping programmers don't do anything silly like combining pg_extern with cfg(target_...)...

@usamoi
Copy link
Contributor Author

usamoi commented Jan 9, 2024

Does this support cross-compilation?

I suppose schema gen has to happen on the build host anyways, so we're hoping programmers don't do anything silly like combining pg_extern with cfg(target_...)...

They can take advantages of qemu. We had the idea in #981 (comment)

@usamoi
Copy link
Contributor Author

usamoi commented Jan 9, 2024

We can use cargo rustc --lib -- --Zunpretty=expanded and syn to get all #[no_mangle] extern "Rust" fn __pgrx_internals, do some codegen for binary schema in cargo-pgrx and run binary schema. Is it an acceptable solution? Non-accessible and private items can be accessed (my code marks many modules containing pg_extern private and I do not want to break my code) and ctor is removed.

@workingjubilee
Copy link
Member

workingjubilee commented Jan 9, 2024

We can use cargo rustc --lib -- --Zunpretty=expanded and syn to get all #[no_mangle] extern "Rust" fn __pgrx_internals, do some codegen for binary schema in cargo-pgrx and run binary schema. Is it an acceptable solution? Non-accessible and private items can be accessed (my code marks many modules containing pg_extern private and I do not want to break my code) and ctor is removed.

Yes, I think that sounds acceptable. My main concern was that anything based on linker sections is historically prone to dying when it slams into the rocks of platform-specific nonsense. So we simply don't have sufficient contributors to reasonably work through that and fix while also maintaining the other parts of pgrx, and they always cause issues in ways that are hard to even identify as a problem with the platform-specific thing.

I understand the concern about breakage. 0.12 is likely to have quite enough to fix to update to as-is, because it will land fixes for lifetime issues, so I agree we don't want to have even more breakage in this cycle.

@usamoi usamoi force-pushed the nohack branch 5 times, most recently from 11255f3 to f214499 Compare January 10, 2024 04:24
@usamoi usamoi marked this pull request as ready for review January 10, 2024 04:51
@usamoi
Copy link
Contributor Author

usamoi commented Jan 10, 2024

CI passed and I will continue to finish unfinished work.

@workingjubilee
Copy link
Member

workingjubilee commented Jan 10, 2024

This looks very interesting!

I discussed this briefly with the other maintainers. Another note against the ctor crate: @thomcc mentions that rustc has problems with... accidentally discarding ctors. Only #[no_mangle] pub extern "C" fn can really be trusted to land in the binary. Hypothetically these ctor issues are bugs and rustc should have fixed them by now, but we feel uncertain that there aren't more bugs underneath, and they seem like good to avoid.

A concern of build performance regression was brought up, but at the same time, we don't really feel a regression that isn't "literally doubling build time" is worth blocking this over: pgrx has a bunch of low-hanging fruit on build time, and having a much less cursed build setup is worth sacrificing at least some performance.

@usamoi
Copy link
Contributor Author

usamoi commented Jan 11, 2024

blocked by rust-lang/cargo#13280 use cfg as a workaround

@workingjubilee
Copy link
Member

workingjubilee commented Jan 11, 2024

Hmm. That currently requires a build.rs in order to work, for rerun-if-env-changed. Is there a chance we could use something like cfg options?

@usamoi usamoi force-pushed the nohack branch 2 times, most recently from ff58e46 to cb893fc Compare January 11, 2024 09:48
@usamoi
Copy link
Contributor Author

usamoi commented Jan 11, 2024

cargo expand causes recompliation even if we use a nightly toolchain. Also using cargo expand is just moving from a trick to another trick since this is unstable. So I just add object crate back.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unfortunately seems to make using the command line tools to interact with things in this repo a bit... confusing:

$ cargo run --bin strings/pgrx_embed
error: no bin target named `strings/pgrx_embed`.
Available bin targets:
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    cargo-pgrx
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx-version-updater
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed
    pgrx_embed

This would be understandable if the other concerns were not as big a concern.

cargo-pgrx/src/command/schema.rs Outdated Show resolved Hide resolved
cargo-pgrx/src/command/schema.rs Show resolved Hide resolved
cargo-pgrx/src/templates/pgrx_embed_rs Outdated Show resolved Hide resolved
cargo-pgrx/src/command/schema.rs Outdated Show resolved Hide resolved
@workingjubilee
Copy link
Member

workingjubilee commented Jan 17, 2024

To address the "suddenly all pgrx extensions look the same to tooling" thing, and to allow developers to have more than one pgrx extension in a workspace... more common than you might think... I (currently) think we're best off using the crate name plus a suffix so as to indicate our schema-generator's binary name. Perhaps just giving up on cargo run --bin being easy to type in a workspace is better? I'm not entirely sold on either path.

Either way, we also can help ourselves by adding notes to the Cargo.toml using a [package.metadata.pgrx] table if we like. The [package.metadata] table is completely unexamined and explicitly "unclaimed" by the cargo tooling, so we can write whatever we want to it. And we can coexist peacefully with other tools that do this as long as we use a sensible namespace, like using a namespaced subtable.

@workingjubilee
Copy link
Member

I opened rust-lang/cargo#13312 and got some interesting feedback: using the same bin name for all of these can definitely cause problems.

@usamoi usamoi force-pushed the nohack branch 2 times, most recently from d11dbbc to d4fbdad Compare January 27, 2024 08:54
@thomcc
Copy link
Contributor

thomcc commented Jan 27, 2024

Just to check, we no longer use ctor right? (It won't work because rustc doesn't compile things as whole-archive, so it would lead to silently dropped exports in large crates).

@usamoi
Copy link
Contributor Author

usamoi commented Jan 27, 2024

Just to check, we no longer use ctor right? (It won't work because rustc doesn't compile things as whole-archive, so it would lead to silently dropped exports in large crates).

Yes, we no longer use ctor.

@workingjubilee
Copy link
Member

workingjubilee commented Jan 27, 2024

It looks like my todo tests were a bad idea in actuality, since they have many tiny spurious failures, so the current CI status doesn't reflect on this PR. I will PR an ignore for them (on Monday, probably), if nobody beats me to it. Handled, it was actually just one test that was prone to spuriousness.

pgrx/src/lib.rs Outdated Show resolved Hide resolved
cargo-pgrx/src/command/schema.rs Outdated Show resolved Hide resolved
pgrx-sql-entity-graph/src/used_type.rs Show resolved Hide resolved
Signed-off-by: usamoi <[email protected]>
@usamoi
Copy link
Contributor Author

usamoi commented Feb 7, 2024

pgrx_embed is renamed to pgrx_embed_{name} now.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: 5566cea yeah that can get a bit long...

I might take a look later at adjusting the exact way that is handled before 0.12.0 actual... We may want to use the package metadata table to record a name, for future evolution purposes, and use that to make it more concise. It also would let people to rearrange their repo how they want, assigning whatever binary and file names they want, without confusing the pgrx tooling.

That's not a blocking concern for this, however. This looks great and a new extension passes cargo pgrx test all after doing the path hack and rerunning cargo pgrx init, which means that all we need to make this reusable is to publish it. This opens up so many possibilities! Thank you.

@workingjubilee workingjubilee merged commit 03e97ee into pgcentralfoundation:develop Feb 7, 2024
8 checks passed
@eeeebbbbrrrr
Copy link
Contributor

@usamoi: Thanks for all your work here. You too, of course, @workingjubilee.

It's really special that someone from the community helped solve this problem. Much appreciation and respect!

workingjubilee added a commit that referenced this pull request Feb 7, 2024
I noticed #1468 would allow us to also remove a dependency or two from
cargo-pgrx, so I started work by removing once_cell from it and a few
other spots in our tree, although we use it so pervasively it's not
realistic to remove it entirely. It was actually secretly in our public
API, behind a `#[doc(hidden)]`, for some reason.

I then noticed a bunch of other easy removals from cargo-pgrx:
- serde_derive should not be used directly
- flate2, prettyplease, syn, and unescape are not directly used anymore
- nix crate maintenance has gotten better, but it is a big dep for 1 bool
workingjubilee added a commit that referenced this pull request Mar 1, 2024
Welcome to pgrx 0.12.0-alpha.1!

Say the magic words with me!

```shell
cargo install cargo-pgrx --locked --version 0.12.0-alpha.1
```

# Breaking Changes

## No more dlopen!

Perhaps the most exciting change this round is @usamoi's contribution in
#1468 which means that
we no longer perform a `dlopen` in order to generate the schema. The
cost, such as it is, is that your pgrx extensions now require a
`src/bin/pgrx_embed.rs`, which will be used to generate the schema. This
has much less cross-platform issues and will enable supporting things
like `cargo binstall` down the line.

It may be a bit touchy on first-time setup for transitioning older
repos. If necessary, you may have to directly add a
`src/bin/pgrx_embed.rs` and add the following code (which should be the
only code in the file, though you can add comments if you like?):

```rust
::pgrx::pgrx_embed!();
```

Your Cargo.toml will also want to update its crate-type key for the
library:
```toml
[lib]
crate-type = ["cdylib", "lib"]
```

## Library Code

- pgrx-pg-sys will now use `ManuallyDropUnion` thanks to @NotGyro in
#1547
- VARHDRSZ `const`s are no longer `fn`, thanks to @workingjubilee in
#1584
- We no longer have `Interval::is_finite` since
#1594
- We translate more `*_tree_walker` functions to the same signature
their `*_impl` version in Postgres 16 has:
#1596
- Thanks to @eeeebbbbrrrr in
#1591 we no longer have
the `pg_sql_graph_magic!()` macro, which should help with more things in
the future!

# What's New

We have quite a lot of useful additions to our API:

- `SpiClient::prepare_mut` was added thanks to @XeniaLu in
#1275
- @usamoi also contributed bindings subscripting code in
#1562
- For `#[pg_test]`, you have been able to use `#[should_panic(expected =
"string")]` to anticipate a panic that contains that string in that
test. For various reasons, `#[pg_test(error = "string")]` is much the
same. Now, you can also use `#[pg_test(expected = "string")]`, in the
hopes that is easier to stumble across, as of
#1570

## `Result<composite_type!("..."), E>` support

- In #1560 @NotGyro
contributed support for using `Result<composite_type!("Name"), E>`, as a
case that had not been handled before.

## Significantly expanded docs
Thanks to @rjuju, @NotGyro, and @workingjubilee, we now have
significantly expanded docs for cargo-pgrx and pgrx in general. Some of
these are in the API docs on https://docs.rs or the READMEs, but there's
also a guide, now! It's not currently published, but is available as an
[mdbook](https://github.com/rust-lang/mdBook) in the repo.

Some diagnostic information that is also arguably documentation, like
comments and the suggestion to `cargo install`, have also been improved,
thanks to @workingjubilee in
- #1579
- #1573

## `#[pg_cast]`

An experimental macro for a `CREATE CAST` was contributed by @xwkuang5
in #1445!

## Legal Stuff

Thanks to @the-kenny in
#1490 and
@workingjubilee in
#1504, it was brought to
our attention that some dependencies had unusual legal requirements. So
we fixed this with CI! We now check our code included into pgrx-using
binaries is MIT/Apache 2.0 licensed, as is common across crates.io,
using `cargo deny`!. The build tools will have more flexible legal
requirements (partly due to the use of Mozilla Public License code in
rustls).

# Internal Changes
Many internal cleanups were done thanks to
- @workingjubilee in too many PRs to count!
- @thomcc found a needless condition in
#1501
- @nyurik in too many PRs to count!

In particular:
- we now actually `pfree` our `Array`s we detoasted as-of
#1571
- creating a `RawArray` is now low-overhead due to
#1587

## Soundness Fixes
We had a number of soundness issues uncovered or have added more tests
to catch them.
- Bounds-checking debug assertions for array access by @NotGyro in
#1514
- Fix unsound `&` and `&mut` in `fcinfo.rs` by @workingjubilee in
#1595

## Less Deps
Part of the cleanup by @workingjubilee was reducing the number of deps
we compile:
* cargo-pgrx: reduce trivial dep usages in
#1499
* Update 2 syn in #1557

Hopefully it will reduce compile time and disk usage!

## New Contributors
* @the-kenny made their first contribution in
#1490
* @xwkuang5 made their first contribution in
#1445
* @rjuju made their first contribution in
#1516
* @nyurik made their first contribution in
#1533
* @NotGyro made their first contribution in
#1514
* @XeniaLu made their first contribution in
#1275

**Full Changelog**:
v0.12.0-alpha.0...v0.12.0-alpha.1
@workingjubilee workingjubilee mentioned this pull request Sep 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fixing sql-entity-graph and cargo-pgrx's dlopen hack
4 participants