Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make #[derive(PostgresType)] impl its own FromDatum #1381

Conversation

workingjubilee
Copy link
Member

@workingjubilee workingjubilee commented Nov 8, 2023

Mainly, this removes a source of persistent and confounding type errors because of the generic blanket impl of FromDatum for all T that fulfill so-and-so bounds. These may mislead one, if one is writing code generic over FromDatum, to imagine that one needs a Serialize and Deserialize impl or bound for a given case, even when those are not required. By moving these requirements onto the type that derives, this moves any confusion to the specific cases it actually applies to.

This has a regrettable effect that now PostgresType requires a Serialize and Deserialize impl in order to work, unless one uses the hacky #[bikeshed_postgres_type_manually_impl_from_into_datum] attribute, which I intend to rename or otherwise fix up before pgrx reaches its 0.12.0 release.

@workingjubilee
Copy link
Member Author

The only downstream user, as far as I can tell, that this might negatively affect, is @pksunkara's pgx-ulid. If you genuinely don't want to impl/derive Serialize and Deserialize for your extension's type, lmk and I'll figure out an alternative.

Copy link

@pksunkara pksunkara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let me know if I am misunderstanding something.

It looks like we are making this change to improve the UX of the errors and there's nothing that will change the functionality of IntoDatum and FromDatum.

I have a suggestion to allow skipping specifying Serialize and Deserialize for users who implement their own IntoDatum and FromDatum. Have a proc macro called PostgresDatum or something that implements these and enforces the Serialize and Deserialize bounds. I think this would solve #965 too.

@workingjubilee
Copy link
Member Author

workingjubilee commented Nov 8, 2023

@pksunkara Everything is going to be completely reworked, including the FromDatum and IntoDatum traits, and I don't see why the Serialize and Deserialize bounds are involved at all in your suggestion. ( Apologies if you saw the earlier draft, I misread something. )

@workingjubilee
Copy link
Member Author

It would also not solve #965, because it does not actually do what is needed in order to actually make Postgres use those functions for binary send/receive, which is to export a function that can be called by Postgres.

@workingjubilee
Copy link
Member Author

workingjubilee commented Nov 8, 2023

There is a bunch of important background information to take in before we go with that idea, @pksunkara:

The definition of the FromDatum trait itself is effectively unsound. All implementations are not merely of unsafe fn, but it should be an unsafe trait which elaborates its safety requirements. This is because it is a similar promise as unsafe impl Send or unsafe impl Sync: the rest of the code around it, especially unsafe code, needs to be able to rely on its implementation being correct. Further, it needs to enforce lifetime relationships in ways that it currently does not do.

Thus, before we make a full release of 0.12.0, there will need to be a new trait or set of traits defining type translations to/from Datums, and they will not be able to use the usual pg_sys::Datum interface either. So you will have to write new implementations for those new traits for your extension in order to migrate to 0.12.0 in a useful fashion. Though, given your extension is fairly simple, it will likely be a very quick copy/paste job after doing a bit reading.

Currently I am making a bunch of small changes like this one to "clear the field" and make migration easier. Importantly, I am trying to do so while not disturbing the majority of code which depends on the current interfaces, so that most users will be able to simply write pgrx = "=0.12.0" in their toml and do the cargo install dance and hopefully get this already. But for those who do need to do more work, they will need to have useful errors.

With all that in mind, I honestly would rather not add an entirely new derive to pgrx's existing API as a long-term change to satisfy a single use-case.

I am more unsure what you get out of the derive(PostgresType) at all, honestly, given that you don't use its default FromDatum and IntoDatum blanket. The inoutfuncs, I guess? I don't understand why those elements are so tightly coupled in pgrx.

@pksunkara
Copy link

I think I have a custom implementation of InOutFuncs too.

@workingjubilee
Copy link
Member Author

if that's true, then what actually happens if you remove the PostgresType derive???

@pksunkara
Copy link

Never tried it since I didn't know what it contained. FWIW, I have #[inoutfuncs] for whatever reason, https://github.com/pksunkara/pgx_ulid/blob/master/src/lib.rs#L26

@workingjubilee
Copy link
Member Author

@pksunkara In any case, while I expressed some reticence, I am happy to work out a hack so you can write something other than derive(PostgresType) and get... whatever you are currently getting, as I have been informed there is a second user of the same pattern you are using (@willmurnane) and that means there's probably dozens of you out of reach of the search engines.

This would go on the large pile of TODOs blocking 0.12 though, as we have several at this point, and I would prefer it if we could open an issue and figure out what actually needs to happen.

And yes, the derive(PostgresType) is somewhat opaque. It does both more and less than you think it does.

Copy link
Contributor

@thomcc thomcc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, at least for now. Long term I think we'll want flexibility here.

@thomcc
Copy link
Contributor

thomcc commented Nov 8, 2023

The only downstream user, as far as I can tell, that this might negatively affect, is @pksunkara's pgx-ulid. If you genuinely don't want to impl/derive Serialize and Deserialize for your extension's type, lmk and I'll figure out an alternative.

I thnk there are a lot of cases where users will want to control this. I remember a while looking into why a pgrx implementation of roaring bitmaps was slower than a C one (specifically https://github.com/ChenHuajun/pg_roaringbitmap, I believe) and one of the big reasons was our encoding/decoding overhead.

@workingjubilee
Copy link
Member Author

@thomcc Right. I want to leave the possibility of control in but I think we need a new mechanism for enabling it, although it might be because we want to put PostgresType as-is out to pasture and bring it back in a more modular form.

@workingjubilee
Copy link
Member Author

workingjubilee commented Nov 8, 2023

There is now a #[bikeshed_postgres_type_manually_impl_from_into_datum] attribute that allows exempting a type from the automatic implementations of FromDatum and IntoDatum that this adds.

...Please yell at me if that actually makes it into a final release without getting at least a rename.

@workingjubilee workingjubilee merged commit 2df66d8 into pgcentralfoundation:develop Nov 8, 2023
8 checks passed
@workingjubilee workingjubilee deleted the make-postgrestype-impl-its-own-fromdatum branch November 8, 2023 22:17
@workingjubilee
Copy link
Member Author

The deprecation in question is the hidden function, cbor_decode_into_context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants