Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Turn JNINativeInterface + JNIInvokeInterface_ into unions #28

Merged
merged 3 commits into from
Aug 31, 2023

Conversation

rib
Copy link
Contributor

@rib rib commented Jul 23, 2023

This implements the idea discussed here: #25

In particular one concrete motivation for this comes from having seen that the jni crate currently incorrectly accesses JNI 1.2+ functions without checking that the JNI version is >= 1.2 (e.g. jni-rs/jni-rs#463) and I think that kind of mistake would be much less likely if you had explicitly write out the version of the API that you are dereferencing.

This implements a #[jni_to_union] procmacro that lets us declare what version of the JNI spec each function was added (or mark them as "reserved") and the macro will replace the struct with a union that only exposes functions for the specific version being referenced.

So instead of a struct like:

struct JNIInvokeInterface_ {
    pub reserved0: *mut c_void,
    ..
    pub GetVersion: unsafe extern "system" fn(env: *mut JNIEnv) -> jint,
    ..
    pub NewLocalRef: unsafe extern "system" fn(env: *mut JNIEnv, ref_: jobject) -> jobject,
}

we have a union like:

union JNIInvokeInterface_ {
    v1_1: JNIInvokeInterface__1_1,
    v1_2: JNIInvokeInterface__1_2,
    reserved: JNIInvokeInterface__reserved,
}

And would access GetVersion like: env.v1_1.GetVersion and access NewLocalRef like: env.v1_2.NewLocalRef.

Each version struct includes all functions for that version and lower, so it's also possible to access GetVersion like env.v1_2.GetVersion.

This way it's more explicit when you're accessing functions that aren't part of JNI 1.1 which require you to have checked the version of JNI the JVM supports.

@rib rib force-pushed the rib/pr/jni-to-union branch from dd499f2 to ede36e1 Compare July 23, 2023 16:48
@rib
Copy link
Contributor Author

rib commented Jul 23, 2023

hi @nikomatsakis - considering your current work on Duchess I'd be interested to hear your feedback on this

@rib
Copy link
Contributor Author

rib commented Jul 23, 2023

@argv-minus-one and @jrose-signal it would also be good to get your feedback here too

@jrose-signal
Copy link

jrose-signal commented Jul 24, 2023

So from a compiler perspective this isn't 100% correct either: a union has the size of its largest member, but on a system that only has JRE 1.2 the JNI struct will be much smaller than the full set of callbacks. What you really want here is a union of pointers, not a pointer to a union, but that's certainly harder to work with.

In terms of improving correctness, well, the compiler can't enforce that you've actually used the right version here, so it's mostly about making things easier for humans. Which isn't bad, but my personal preference would be to do something better:

  • You could wrap all fields in accessors that check the version first, possibly with a bypass mechanism. (Likely unreasonably expensive.)
  • You could allocate your own struct that's always as big as the latest-supported JNI version, initialize it from the VM-provided interface struct, and use Option after all.
  • You could do something like the following (untested, may have syntax errors):
struct JNINativeInterface<Table>(*const Table);

impl<Table> Deref for JNINativeInterface<Table> {
  type target = Table;
  // …
}

// Unsafe because implementing this trait means pointers will be cast to this type and unilaterally dereferenced under certain conditions.
unsafe trait JNIGetVersion {
  const VERSION: u32;
  fn get_version(&self);
}

impl<Table> JNINativeInterface<Table>
where Table: JNIGetVersion {
  #[inline]
  fn downcast<NewTable: JNIGetVersion>(&self) -> Option<JNINativeInterface<NewTable>> {
    if Table::VERSION >= NewTable::VERSION || self.get_version() >= NewTable::VERSION {
      Some(JNINativeInterface(self.0.cast()))
    } else {
      None
    }
  }
}

fn do_something(interface: &JNINativeInterface<JNINativeInterface_1_1>) {
  if let Some(newer_interface) = interface.downcast::<JNINativeInterface_1_2>() {
    // do something with newer_interface
  } else {
    // fallback
  }
}

Again, the downside here is potentially calling GetVersion a lot.

@rib
Copy link
Contributor Author

rib commented Jul 24, 2023

Note: since, we're talking about the -sys API here our priority is more about providing a raw/zero-cost binding of the C ABI vtable and even though things are inherently unsafe at this level I think there are a couple of footguns currently that make the sys binding more dangerous to use than it needs to be.

We could potentially generate a wrapper API with runtime version checking but for this crate we'd still want to also expose the un-abstracted lower-level vtable for use by higher-level layers that might come up with their own way of dealing with version checks more efficiently. (It's quite likely that a higher level abstraction wouldn't need to repeat versions checks for every call so we can't impose that at this level)

a union has the size of its largest member, but on a system that only has JRE 1.2 the JNI struct will be much smaller than the full set of callbacks

yep, we still want a static Rust type that is ABI compatible with JNIInvokeInterface_ so the total size is going to encompass the full set of functions for the latest JNI version which may not be accessible at runtime.

This is baked into the design of JNI and it really pretty horrific but I think the scope of this -sys crate is that we mostly just want to provide a direct binding of the C API to Rust - with some wiggle room to try and find the least dangerous options.

So I think for now we should be assuming higher-level layers like the jni crate or Duchess will be responsible for putting a safe abstraction over this. For the jni crate in particular I think we should be setting at least 1.2 as a baseline that we should validate up-front which would also mean we wouldn't need to do any on-the-fly version checking when calling 1.1-1.2 functions.

My more-constrained aim with this change is to iterate the raw JNIInvokeInterface_ type to at least address a couple of footguns I've seen while working on the jni crate:

The first issue is that each function in the struct is currently wrapped in an Option which is misleading because it implies a need to check for None when calling functions and also gives a false impression that it would be meaningful to check for None for functions associated with newer JNI versions (See #25 for more discussion)

Secondly I think we're missing an opportunity to divide the functions up by version within a union in a way that remains ABI compatible but at least requires developers to have to type out their intent to access a specific JNI version.

Even though the jni crate should be responsible for providing a safe abstraction over this low-level interface we've seen that in practice it has not always checked the JNI version where it needs to and I think that would have been more clearly apparent if developers would have needed to type the version out as they dereferenced a particular function pointer.

@jrose-signal
Copy link

This is baked into the design of JNI and it really pretty horrific

Yes, it's a fairly common C idiom for an extensible "vtable" of sorts.

I think having a pointer-to-union rather than union-of-pointers or pointer-to-minimal-struct will make it more difficult to do "safe" wrappers like what I showed above, because Rust doesn't provide good tools for going from *const TableUnion to *const Table_1_1. But then again in my wrapper it goes directly from *const Table_1_1 to *const Table_1_2, so maybe it's not any better.

@rib
Copy link
Contributor Author

rib commented Jul 24, 2023

This is baked into the design of JNI and it really pretty horrific

Yes, it's a fairly common C idiom for an extensible "vtable" of sorts.

yep

I think having a pointer-to-union rather than union-of-pointers or pointer-to-minimal-struct will make it more difficult to do "safe" wrappers like what I showed above, because Rust doesn't provide good tools for going from *const TableUnion to *const Table_1_1. But then again in my wrapper it goes directly from *const Table_1_1 to *const Table_1_2, so maybe it's not any better.

Since this macro gives us these additional structs like JNIInvokeInterface__1_2 it does also open the possibility that higher-level wrappers could bypass the union type and use some other mechanism for picking the appropriate/minimal version vtable.

Code can also practically bypass the union and jump straight to the largest version as a way of getting back to the previous monolithic vtable if they maybe find that more convenient in some case (e.g. for defining certain macros perhaps).

I think the union type would still be useful to have as a general ABI-compatible binding for JNIInvokeInterface_ that can namespace functions by version but hopefully it's good that the added version-specific structs would also enable some of these other options too.

@rib
Copy link
Contributor Author

rib commented Jul 28, 2023

Although github linked it above I just wanted to also highlight here that I have done a port of the jni crate to this API, along with some, much needed, simplification of the internal jni crate macros used to make unsafe JNI calls: jni-rs/jni-rs#478

The namespacing of the functions by version helped highlight several places in the jni crate where it wasn't correctly ensuring it knew the JNI version before making > 1.1 calls.

@argv-minus-one
Copy link

argv-minus-one commented Jul 28, 2023

@rib: Seems like a good idea to me. I don't have anything to add.

Edit: Well, I used to have nothing to add. 😅

@argv-minus-one
Copy link

I noticed that this doesn't build in Rust 1.69, evidently because raw function pointers didn't implement Debug until recently, resulting in lots of errors like this:

error[E0277]: `unsafe extern "system" fn(*mut *const JNINativeInterface_) -> i32` doesn't implement `Debug`
   --> src/lib.rs:127:5
    |
115 | #[jni_to_union]
    | --------------- in this procedural macro expansion
...
127 |     pub GetVersion: unsafe extern "system" fn(env: *mut JNIEnv) -> jint,
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `unsafe extern "system" fn(*mut *const JNINativeInterface_) -> i32` cannot be formatted using `{:?}` because it doesn't implement `Debug`
    |
    = help: the trait `Debug` is not implemented for `unsafe extern "system" fn(*mut *const JNINativeInterface_) -> i32`
    = note: this error originates in the derive macro `Debug` which comes from the expansion of the attribute macro `jni_to_union` (in Nightly builds, run with -Z macro-backtrace for more info)

So, this PR will bump the jni-sys MSRV. I have no objection to that (I just forgot to rustup update), but others might. 🤷‍♂️

@jrose-signal
Copy link

Or the Debug implementation could be manually written. After all, it's not like it'll be a good debug experience to see a pile of function addresses.

@rib
Copy link
Contributor Author

rib commented Jul 28, 2023

Ah, okey.

I suppose it'd be nice to be reasonably conservative with the MSRV for a -sys crate ideally.

I haven't really looked at the ergonomics for the automatically derived Debug trait here, but yeah maybe we can generate something with the procmacro that's slightly better tailored.

If you were intentionally using the Debug trait with these function tables though I guess you might want those function addresses though which are gonna be pretty verbose.

Could maybe elide addresses unless pretty printing with #?, or vice versa

Or could just have a stub Debug implementation so we don't block other composite types that embed these types from deriving the Debug trait - and figure that it's anyway not particularly practical to print out a full JNI function table.

We should be ok to expand that later without affecting the semver, but would be a lot simpler to start with.

@rib rib force-pushed the rib/pr/jni-to-union branch 4 times, most recently from e1b2090 to 88cc4ae Compare August 31, 2023 19:50
@rib rib force-pushed the rib/pr/no-function-pointer-options branch from 5fb8742 to 8139e9f Compare August 31, 2023 19:50
Base automatically changed from rib/pr/no-function-pointer-options to master August 31, 2023 19:57
rib added 2 commits August 31, 2023 21:32
Although the project doesn't currently have an MSRV policy, this picks a
conservative (10 month old) minimum rust-version that will now be tested
via CI.
@rib rib force-pushed the rib/pr/jni-to-union branch from 88cc4ae to 70a0081 Compare August 31, 2023 21:36
@rib
Copy link
Contributor Author

rib commented Aug 31, 2023

For reference here, I realized in the end that the Debug trait wasn't really needed at all for these vtable structs containing function pointers.

Even when later looking at enabling warnings for missing_debug_implementations the lack of a Debug implementation for these types doesn't really matter, since they are accessed via pointers and the automatic Debug implementations for wrapper types aren't going to try and dereference these pointers.

Since this PR did unintentionally, temporarily require Rust 1.70 to build, it now includes a patch that sets the rust-version to 1.65.0 (10 months old) and tests this via CI.

I also renamed the proc macro crate to jni-sys-macros in preparation for publishing to crates.io

This implements a `#[jni_to_union]` procmacro that lets us declare what
version of the JNI spec each function was added and then declare a union
that only exposes functions for the specific version being referenced.

So instead of a struct like:

```rust
struct JNIInvokeInterface_ {
    pub reserved0: *mut c_void,
    ..
    pub GetVersion: unsafe extern "system" fn(env: *mut JNIEnv) -> jint,
    ..
    pub NewLocalRef: unsafe extern "system" fn(env: *mut JNIEnv, ref_: jobject) -> jobject,
}
```

we have a union like:
```
union JNIInvokeInterface_ {
    v1_1: JNIInvokeInterface__1_1,
    v1_2: JNIInvokeInterface__1_2,
    reserved: JNIInvokeInterface__reserved,
}
```

And would access `GetVersion` like: `env.v1_1.GetVersion` and access
`NewLocalRef` like: `env.v1_2.NewLocalRef`.

Each version struct includes all functions for that version and lower,
so it's also possible to access GetVersion like `env.v1_2.GetVersion`.

This way it's more explicit when you're accessing functions that
aren't part of JNI 1.1 which require you to have checked the version
of JNI the JVM supports.
@rib rib force-pushed the rib/pr/jni-to-union branch from 70a0081 to 98565d6 Compare August 31, 2023 22:15
@rib rib merged commit 065c79a into master Aug 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants