Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust StringInfo wrapper impl to match actual definition #1457

Conversation

workingjubilee
Copy link
Member

@workingjubilee workingjubilee commented Dec 30, 2023

A commit that alters the definition of StringInfo was introduced about 2 months ago to Postgres: postgres/postgres@f0efa5a

It contains the following comment:

  * As a special case, a StringInfoData can be initialized with a read-only
  * string buffer.  In this case "data" does not necessarily point at a
  * palloc'd chunk, and management of the buffer storage is the caller's
  * responsibility.  maxlen is set to zero to indicate that this is the case.
  * Read-only StringInfoDatas cannot be appended to or reset.
  * Also, it is caller's option whether a read-only string buffer has a
  * terminating '\0' or not.  This depends on the intended usage.

This will make the following sequence of Rust code completely unsound by mere invocation as of Postgres 17, even assuming the lifetime constraint is correct:

let stringinfo: *mut pg_sys::StringInfoData = something_that_returns_ptr_to_stringinfodata();
let stringinfo: StringInfo<AllocatedByPostgres> = unsafe { StringInfoData::from_pg(stringinfo) };
let cstr_from_stringinfo: &'static CStr = stringinfo.into();

However, that commit has its own remarks:

There were various places in our codebase which conjured up a StringInfo
by manually assigning the StringInfo fields and setting the data field
to point to some existing buffer.  There wasn't much consistency here as
to what fields like maxlen got set to and in one location we didn't
correctly ensure that the buffer was correctly NUL terminated at len
bytes, as per what was documented as required in stringinfo.h

..."In one location we didn't correctly ensure that the buffer was correctly NUL terminated"?

On actually reviewing the commit, in multiple places, the maximum len was previously set to -1 or other incorrect values to indicate similar sorts of "borrow-like" usages. In other words, the max_len > len invariant was not preserved either, so it was not impossible for the pointee buffer to actually end at exactly len N, instead of also having another byte available.

While the only constructor for the StringInfoData wrapper is its unsafe fn from_pg, we were most certainly relying on this being true, and making comments as if we expected Postgres to not randomly pass us bogus data. In actuality, we find now that Postgres passes us bogus data all the time. As this guarantee has never actually been enforced, we should remove the implementation of impl From<StringInfo<_>> for &'static CStr immediately. This also means our Drop implementation has to go.

A more direct API is offered instead via StringInfo::leak_cstr.

This implementation only brings tragedy and sorrow by
making it easy to convert the data into a reference
but losing track of the original StringInfo handle
For some time now, Postgres has had "read-only" StringInfo,
though if you read commits which canonize this idea, you might
imagine that this is a new development. The entire abstraction here
should be burned down and rewritten, but this minimal hack
will last until I land other fixes first.
@workingjubilee workingjubilee changed the title C program run run progra segfault Adjust StringInfo wrapper impl to match actual definition Dec 30, 2023
@workingjubilee
Copy link
Member Author

Oh, eff. Forgot about the sql annotation... sigh.

@workingjubilee workingjubilee marked this pull request as draft December 30, 2023 04:50
@workingjubilee
Copy link
Member Author

...okay the "just one more level"-ness of this has faded away so I'm stopping here. The only way this is currently failing tests, as far as I can tell, is if we have always been unsound about this code. Which seems unfortunately plausible.

@eeeebbbbrrrr
Copy link
Contributor

I probably want to say something here but I’m not sure what. I’ll circle back in the new year.

@workingjubilee
Copy link
Member Author

...okay the "just one more level"-ness of this has faded away so I'm stopping here. The only way this is currently failing tests, as far as I can tell, is if we have always been unsound about this code. Which seems unfortunately plausible.

To be clear about my remark here: I mean violating Rust-level requirements. It may be that I'm misunderstanding something about the current code and need to review everything again, but the current panics shouldn't really happen if our code wasn't breaking the _unchecked CStr contract.

I think we should allow ourselves to rely on some invariants that Postgres specifies but favor checking those invariants... much more often, especially when it's some sort of basically-one-time check on instantiation of a type.

@workingjubilee
Copy link
Member Author

Okay, so the tests actually want to work now that I restart them? Okay, GitHub Actions. I see you.

Comment on lines +839 to +844
#[::pgrx::pgrx_macros::pg_extern (immutable,parallel_safe)]
pub fn #funcname_out #generics(input: #name #generics) -> &'static ::core::ffi::CStr {
let mut buffer = ::pgrx::stringinfo::StringInfo::new();
::pgrx::inoutfuncs::JsonInOutFuncs::output(&input, &mut buffer);
buffer.into()
// SAFETY: We just constructed this StringInfo ourselves
unsafe { buffer.leak_cstr() }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bad, for various reasons, but right now it has to be this way. The proper fix is blocked on #1383 but this issue existing blocks solving that, because the previous, worse version of this code requires the staticizing hack and ignoring lifetimes in order to be allowed to ship. In other words:

I know it's unsound, and what's currently extant is also unsound, so there's no actual change, except that now the unsoundness becomes fixable.

@workingjubilee workingjubilee marked this pull request as ready for review January 2, 2024 22:52
@workingjubilee
Copy link
Member Author

The unsoundness I got confused about (not related to StringInfo, but related to the from_bytes_with_nul_unchecked contract) is actually not a problem, the real problem was "CI lol". Or a cache somewhere.

We absolutely are not using _unchecked anymore though.

@workingjubilee workingjubilee merged commit 483606b into pgcentralfoundation:develop Jan 2, 2024
8 checks passed
@workingjubilee workingjubilee deleted the c-program-run-run-progra--SEGFAULT branch January 2, 2024 22:59
workingjubilee added a commit that referenced this pull request Jan 4, 2024
These all are actually lifetimes bound by their invocation site, which
means that they return any lifetime which is named, instead of a
lifetime that is bound by the input types in a logical way. This means
that these lifetimes might as well be `'static`, for all it would (not)
matter.

An important detail is that while this pattern is normally acceptable
_and useful_ in Rust code, in pgrx it should be rejected for now because
the lifetimes on the actually-called-by-Postgres function are defined,
essentially, by a function _pointer_. These lifetimes cannot be
coherently described to a function pointer. Thus, they form a sort of
impedance mismatch: when you call this function from Rust code, you get
a constraint that doesn't actually exist in the code that will run in
Postgres. The difference is confusing, allowing you to write code
relying on the compiler's lifetime checks and even fail to call it in
certain ways, seemingly satisfying the soundness requirements. Then
suddenly the assumption those compiler checks are founded on evaporates
in the actual runtime execution, leading to undefined behavior.

I hope to fix a number of these problems relatively soon, but first
thing's first: The removal of illusions. If this code is unsound now,
then in truth it already was.

This effectively serves as a followup to #1457
and it is part of what enables solving #1383
by not requiring the pervasive use of that hack.
workingjubilee added a commit that referenced this pull request Jan 5, 2024
The `pgrx::composite_type!` macro currently assigns the `'static`
lifetime if you do not give one. This is a decision of dubious
soundness. Instead, demand a lifetime, or assign the anonymous lifetime.
Both cases match the behavior of attempting to elide lifetimes in your
usual Rust code, though with slightly worse error messages in some
cases. This may require people to write `'static` in more places (and
that is what the typical error suggests), but it allows noticing and
fixing problems that currently are not visible, easy to understand, or
easy to check.

This is the latest in this series:
- #1383
- #1457
- #1461
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants