Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Low-level interop for PG arrays #636

Merged
merged 37 commits into from
Aug 26, 2022

Conversation

mhov
Copy link
Contributor

@mhov mhov commented Aug 17, 2022

These PG function shims will allow for future work to improve the pgx::Array<T> type, as well as manual access to ArrayType data in the meantime.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the helper functions have to live in pgx-pg-sys? Because they should be purely additive to the sys crate, I would somewhat prefer they were exposed through a mod in the main pgx crate, so as to make it easier to thread them through the other code in pgx without violating coherence.

I also would like it if all new unsafe code that lands in PGX to document its public /// # Safety requirements as precisely as possible, and for every invocation of unsafe to come with // SAFETY: annotations, and I would prefer it to come with #[deny(unsafe_op_in_unsafe_fn)] on the containing module so as to require both of those even inside an unsafe fn.

It is the last one, the extreme safety requirements, that is completely impractical to implement throughout pgx-pg-sys. Note that the comments I desire can be very terse, they do not have to be elaborate, they just have to make clear all the little details like "this does a field read".

pgx-pg-sys/cshim/pgx-cshim.c Outdated Show resolved Hide resolved
pgx-pg-sys/src/lib.rs Outdated Show resolved Hide resolved
pgx-pg-sys/src/lib.rs Outdated Show resolved Hide resolved
@workingjubilee
Copy link
Member

workingjubilee commented Aug 17, 2022

The burden of assuring memory safety must end the moment code leaves an unsafe context. This means that a safe function may not impose any requirements whatsoever on its callers beyond that which it imposes on the purely programmatic level. It is permissible to induce logic errors for illogical inputs, but trying to read values out of the null page (or, really, any field offset from the null pointer) is definitely not one of them.

e.g. It is fine if impl Ord for Type produces incorrect results for some comparisons, because Ord is safe code. Thus when your fn sort relies on impl Ord to be correct in order to return a correct sort, it is fine if the resulting sort is wrong. It is not fine if you then rely on the impl of Ord being correct for memory safety purposes. This is an actual, in-the-wild concern in real code that is not Rust, as glibc's qsort in the past has induced segfaults on incorrect comparison functions. Rust does not tolerate that kind of shenanigans, so glibc's qsort would be considered unsafe.

The direct implication is that safe code is allowed to "shit the bed", from the perspective of unsafe code, and the unsafe code must either pass the burden on to a caller (and thus itself be an unsafe fn), or clean it up. Thus, for a function which provides a safe wrapper for unsafe code, the buck stops there. It must guarantee all safely-obtained inputs in Rust (and it is "safe" code to write std::ptr::null_mut()) to the unsafe block inside do not violate memory safety.

...Directly, mind. Unfortunately running DROP DATABASE IF EXISTS prod is in the category of things that are considered "logic errors", not "memory safety violations", despite having a marvelous potential to erase data you may have wanted to keep around.

mhov and others added 8 commits August 19, 2022 16:39
This offers safe accessors to the underlying ArrayType's fields,
without imposing any more burden on the data in question.
This also ports over ARR_DIMS(ArrayType*),
and ARR_NELEMS, which wraps ArrayGetNItems(ndim, *dims)
@workingjubilee workingjubilee self-requested a review August 24, 2022 06:34
pgx/src/array.rs Outdated Show resolved Hide resolved
Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a second pass, as already mentioned.

pgx/src/array.rs Show resolved Hide resolved
pgx/src/array.rs Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx-tests/src/tests/array_tests.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs improvement in the documentation before it lands, even though there is going to be a followup PR soon.

pgx/src/array.rs Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated
Comment on lines 110 to 112
// for expected behavior, see:
// postgres/src/include/utils/array.h
// #define ARR_DIMS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this path be visible in the public documentation? I suppose they can click through to see the source, but the public docs will support links.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, actually, directly linking is a good idea. And that way we could also link to the doxygen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doxygen is too unstable, will point to master and lines might change. Instead used gitweb links.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a pgx-maintainers perspective, how much of a maintenance nightmare is it linking to the doxygen pages? How often would that change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially every time someone lands a commit into postgresql/master.

pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
pgx/src/array.rs Outdated Show resolved Hide resolved
Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that was... quite a few commits, and only 11% were rustfmt commits! Nicely done!

pgx/src/array.rs Outdated Show resolved Hide resolved
@workingjubilee workingjubilee changed the title c-shims for PG array functions Low-level interop for PG arrays Aug 26, 2022
pgx/src/array.rs Outdated
Comment on lines 146 to 165
/**
The ability to rewrite the dimensions slice.

You almost certainly do not actually want to call this,
unless you intentionally stored the actually intended ndim and wrote 0 instead.
Returns a triple tuple of
* a mutable reference to the underlying ArrayType's ndim field
* a pointer to the first c_int of the dimensions slice
* a mutable reference to RawArray's len field

Write to them in order.
*/
pub unsafe fn dims_mut(&mut self) -> (&mut libc::c_int, NonNull<libc::c_int>, &mut usize) {
// SAFETY: Validity asserted on construction, origin ptr is non-null.
let dims_ptr = unsafe { NonNull::new_unchecked(pgx_ARR_DIMS(self.ptr.as_ptr())) };
let len = &mut self.len;

// SAFETY: Validity asserted on construction. Just reborrowing a subfield.
(unsafe { &mut self.ptr.as_mut().ndim }, dims_ptr, len)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You ain't gonna need it, tho'.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okaaaay it was just fun to write.

@workingjubilee workingjubilee merged commit 1d80c95 into pgcentralfoundation:develop Aug 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants