Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved "HeapTuple" support #532

Merged
merged 22 commits into from
Apr 20, 2022
Merged

Improved "HeapTuple" support #532

merged 22 commits into from
Apr 20, 2022

Conversation

eeeebbbbrrrr
Copy link
Contributor

@eeeebbbbrrrr eeeebbbbrrrr commented Apr 6, 2022

This PR provides a new PgHeapTuple type that is capable of being constructed in various ways, along with reading and writing the individual attribute values in the underlying HeapTuple.

Fixes #502.

Breaking changes

  • FromDatum::from_datum's type signature has changed. It no longer needs an oid parameter.

Other changes

  • A HeapTuple type has been introduced which provides nice wrappers around heap_getattr and friends.
  • Added FromDatum::try_from_datum() which returns a Result<T, TryFromDatumError> instead of an Option allowing users to get context of why a converstion failed.

This constant was set to false in all every `FromDatum` implementation, and simply isn't necessary.

Further changes to `FromDatum` are going to change how this idea worked in the first place.
This argument was never actually used, only passed around.  And in many cases it was set to `pg_sys::InvalidOid` anyways.
```rust
        } else if datum == 0 {
            panic!("array was flagged not null but datum is zero");
        }
```

As the function is already unsafe, there's no need for us to have a check for a case that is likely never going to happen.  Postgres may send us a null datum, but as long as we check `is_null` first, we're okay.
This function ensures the desired Rust type is compatible with the backing Postgres type of the Datum being converted.  If they're not compatible, an `Err` is returned rather than either panicking or leading to undefined behavior.

It's not actually used anywhere yet.
Currently it only knows how to construct itself in a few ways, along with retrieving named/indexed datum values (as runtime type-checked Rust values).

There's a few breaking API changes in other places, especially in `htup.rs` and `PgTupleDesc`
…mposite_type examples (the latter of which really isn't quite finished yet) accordingly.
@eeeebbbbrrrr eeeebbbbrrrr changed the base branch from master to from-datum-overhaul April 6, 2022 17:13
@eeeebbbbrrrr
Copy link
Contributor Author

Once this gets merged, I can move on to doing work on issue #499 (which I'm mentally calling "composite type" support) and then also retool our Spi support to use the new PgHeapTuple. I don't think we have an issue for that...

@Hoverbear Hoverbear changed the title Issue 502: Improved "HeapTuple" support Improved "HeapTuple" support Apr 6, 2022
@Hoverbear Hoverbear linked an issue Apr 6, 2022 that may be closed by this pull request
@eeeebbbbrrrr eeeebbbbrrrr changed the base branch from from-datum-overhaul to develop April 6, 2022 17:24
@eeeebbbbrrrr eeeebbbbrrrr mentioned this pull request Apr 6, 2022
pgx/src/heap_tuple/mod.rs Outdated Show resolved Hide resolved
pgx/src/heap_tuple/mod.rs Outdated Show resolved Hide resolved
&mut self,
attno: NonZeroUsize,
value: T,
) -> FromDatumResult<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It strikes me that this we could have some of this function extracted into a replace_at_index(..) -> Result<Self, _> which returns the new heaptuple, and call it here as well as exposing that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow?

Copy link
Contributor

@Hoverbear Hoverbear Apr 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this function always must create a copy, it might desirable to support a pattern like:

let cloned = my_tuple.replace_at_index(1, "thing");

Instead of

let mut cloned = my_tuple.clone();
cloned.set_at_index(1, "thing");

This would allow the user to avoid an extra clone happening. (and this function could call that one, so it's not added complexity on our side)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PgHeapTuple doesn’t implement Clone so I still don’t follow…

}
}

impl<'a, AllocatedBy: WhoAllocated<pg_sys::HeapTupleData>> PgHeapTuple<'a, AllocatedBy> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting for @workingjubilee this is a similar pattern to how PgBox works.

Copy link
Contributor Author

@eeeebbbbrrrr eeeebbbbrrrr Apr 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's really poorly named. Instead of WhoAllocated it should be WhoNeedsToFreeThisThing.

The reason is, despite if it's AllocatedByRust or AllocatedByPostgres, it's a postgres palloc()'d pointer.

We can address this at a later time. Not relevant to this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WhoDrops

Copy link
Contributor Author

@eeeebbbbrrrr eeeebbbbrrrr Apr 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya know, @workingjubilee, why don't you put up a PR for that (just off of current develop branch) and otherwise give PgBox an audit that only someone with your skills can do.

It's looking like our next release is going to have some pretty big API breakage anyways, so now would be a good time to rejigger naming things...

pgx/src/heap_tuple.rs Outdated Show resolved Hide resolved
@Hoverbear
Copy link
Contributor

I'd like to make a detailed changelog entry for this and add it to the issue description for me to copy from on release. :)

@eeeebbbbrrrr
Copy link
Contributor Author

eeeebbbbrrrr commented Apr 6, 2022

oh, I wanted to /cc @einhverfr, as it relates to: #528.

This doesn't touch any of the Spi code yet, but this will become the foundation for retooling our Spi API.

@workingjubilee
Copy link
Member

Reviewing this to my satisfaction (and getting a start on auditing PgBox's "drop logic") required doing a deep-dive into how PGX and indeed Postgres handles transmitting data types, and auditing that code took my yesterday. However, at my current level of understanding, this looks good to me.

@eeeebbbbrrrr
Copy link
Contributor Author

Reviewing this to my satisfaction (and getting a start on auditing PgBox's "drop logic") required doing a deep-dive into how PGX and indeed Postgres handles transmitting data types, and auditing that code took my yesterday. However, at my current level of understanding, this looks good to me.

Cool. Thanks a bunch. Go ahead and smash that "approve" button.

I'll probably want to hold off on merging this until you're done poking around with PgBox.

@workingjubilee
Copy link
Member

Hmm. I would probably feel best if this went ahead first, precisely because my audit started touching on potentially conflicting areas and it would give me freedom to rampage over this (or more likely it would reduce the diff, really).

@eeeebbbbrrrr
Copy link
Contributor Author

Hmm. I would probably feel best if this went ahead first, precisely because my audit started touching on potentially conflicting areas and it would give me freedom to rampage over this (or more likely it would reduce the diff, really).

Okay. Let’s ask @Hoverbear to merge it in the morning.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, sorry. I re-rechecked this before hitting "Approve" and had almost missed a line.

@@ -18,7 +18,6 @@ pub struct Array<'a, T: FromDatum> {
array_type: *mut pg_sys::ArrayType,
elements: *mut pg_sys::Datum,
nulls: *mut bool,
typoid: pg_sys::Oid,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Carrying less state around is always an improvement.

if is_null {
None
} else if datum == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being able to skip these kinds of checks is what particularly helps a lot in reducing the diff I am working on.

pgx/src/datum/array.rs Show resolved Hide resolved
pgx/src/datum/into.rs Show resolved Hide resolved
@Hoverbear
Copy link
Contributor

After @workingjubilee 's feedback I'd love to see this merge, and can't wait to see what they follow up with!

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question turned out to mostly be a point of confusion: the function doesn't necessarily use the Postgres model for is_pass_by_value but IntoDatum's model, which asks this question re: Option<Datum>.

@eeeebbbbrrrr eeeebbbbrrrr changed the base branch from develop to develop-v0.5.0 April 20, 2022 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC: Improved HeapTuple support
3 participants