-
-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build upon #902: Spi Error Handling #969
Conversation
0cd2e6b
to
09f8b21
Compare
pgx-tests/src/tests/spi_tests.rs
Outdated
}); | ||
} | ||
|
||
#[pg_test] | ||
#[ignore = "come back to this test"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this test fails. I guess I don't understand what it's testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's testing that we longer have a bug in that we always use SPI_processed
(#938)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, it was definitely missing a comment.
I was thinking something along these lines:
#[pg_test]
fn test_open_multiple_tuptables() {
// Regression test to ensure a new `SpiTupTable` instance does not override the
// effective length of an already open one due to misuse of Spi statics
#[pg_test]
#[ignore = "come back to this test"]
fn test_open_multiple_tuptables_rev() {
// Regression test to ensure a new `SpiTupTable` instance does not override the
// effective length of an already open one.
// Same as `test_open_multiple_tuptables`, but with the second tuptable being empty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I've pushed a few minor changes to these two tests in bc27ff3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ha! and 9074043.
I’ve marked this as draft because I’m probably not done yet. My major complaint now is with Spi::connect() and Spi::execute(). I think we only need Spi::connect() and may make that decision in this PR. With that, I don’t like that Spi::connect()’s E type is generic. I think it needs to just be spi::Error. |
I am not sure why this is better. If the Datum being returned from Postgres is NULL, and the requested type on Rust's side is not
I don't suppose this comes from the return of
While I am not of a strong opinion that we should keep both, I do think there's a minor convenience in
I rather strongly disagree with this. As a matter of fact, I started with this (you can see that in 87e17ec), and I've later removed it (f8da957) as it's making a rather big assumption that any error is an SPI error. I don't think that's the right signature. Just because that clause is within You might say that, for practical purposes, there will be |
As far as SQL is concerned, it's not an error. NULL is a perfectly fine Datum value for any type and is a fundamental part of the SQL type system. pgx's "generalized" SPI interface shouldn't decide that a valid Datum value for a given type is an error. In fact, Postgres' SPI api, as flawed as it is, doesn't consider NULL to be an ERROR either -- the various functions like If the user, in whatever their use of SPI, decides that NULL isn't valid for some situation, they should encode that decision in their code. It's a business logic or data domain problem, not a fundamental error from within pgx. We wouldn't, for example, decide that "42" isn't a valid integer either -- that kind of decision is up to the user. Sadly, getting a Datum value out of SPI is a pure runtime operation. We can't lean on the compiler (plus help from Postgres itself to ensure invariants) like we can with, say, the (As an aside, this leads me down the thought process that
I asked how we could eliminate the double unwrapping to make the PR easier to follow. I didn't have any particular solution or implementation detail in mind. SPI viewing a NULL value as an error is an incorrect worldview for pgx, so it makes sense SPI still work with I fully realize the convenience of
It came out of me still wanting to "eliminate the double unwrapping". Being able to return a Result from the places a user is likely to use SPI (namely |
Yeah, I haven't really thought through this particular problem. I plan on sitting down to write some code with SPI and see what feels right. It might be that we do want it to be a I just know that in updating all the tests that use Spi::connect(), returning |
Just to expand on this, @workingjubilee and I have discussed ideas about representing a Datum, instead of enum NullableDatum<T> {
Null,
Concrete(T)
} It kinda still looks like an Right now pgx represents datums as What I don't want, and this isn't to say that #902 did this because it didn't, is for NULL to be considered an error like the "postgres" dbi driver crate does (https://docs.rs/postgres/latest/postgres/row/struct.Row.html#method.get). Hell, it flat out panics if you ask for a concrete value that happens to be null! I know why they did their API that way -- because it's much more convenient for the user, but it's just incorrect as it relates to SQL. An SQL integer, for example, can be one of |
Option itself, or an Option-like type for better readability, doesn't really matter. I'm fine with either. I think the user should be able to decide if null is an erroneous value to them. That's why they should use type signatures to indicate what's considered correct and what is not. What am I missing in this picture? |
Maybe this: https://en.wikipedia.org/wiki/Null_(SQL) ? Among many other good explanations, including the paragraph on "Closed-world assumption", is:
Above I may have been a little too loose in saying "NULL Datum value", but the point remains the same: NULL is a fine and acceptable thing in SQL. It's not an error condition as it relates to SQL.
I absolutely agree, which is why returning From a technical perspective, if a missing/unknown value is an error situation for the user, they need to encode that themselves. Unfortunately we can't involve the compiler to help us see sql check constraints behind arbitrary SQL queries, so we shouldn't instead use the compiler to mis-represent a fundamental aspect of SQL. |
I am aware of what is Null (however, the definition at the top is dubious --
Indeed it is.
That's right. However, there are cases where it is possible to indicate that null is not accepted, and they do lead to errors.
That's just misleading to assume they may always get a null value to force them to unwrap in some way. Here's a contrived examples (for brevity):
should we be forced to unwrap this because it may be null? I don't think it may become null. Therefore, the way things are done in #902 is, strictly speaking, better as it doesn't make assumptions about the nullability of the value. If one expects that the value may be
or
This way, at no point we need to match or unwrap when there's clarity with regard to the type's nullability.
You are making an assumption one way or another. You're either forced to unwrap always or specify the correct type in the first place. In reality, this means that a lot of signatures will contain |
Ugh, Boxes are a kludge.
If anything, this above shows holes in the If we simply had a |
Yes. Should pgx force the user to deal with possible
One should expect that any datum may be null. As such, pgx must encode that in the type signature for them. The meaning of null is context dependent. pgx should not allow the user to write type signatures that makes a fundamental part of SQL values into an error. It is not the same as actual error conditions pgx might encounter, such as "cannot convert 'hello world' to an i32" or "some internal pointer is invalid". They're entirely separate concerns.
We appear to have a disagreement on what "correct type" means. The correct type is The Postgres Java JDBC driver's Take this: let i:i32 = Spi::get_one("SELECT non_null_i32()")?; // Oh no! this actually returned NULL With what you're proposing, the user would get a runtime let i:i32 = match Spi::get_one("SELECT non_null_i32()") {
Ok(i) => i,
Err(spi::Error::NullValue) => i32::default(),
Err(e) => panic!("{}", e)
}; Whereas what I'm proposing gets us to: let i:i32 = Spi::get_one("SELECT non_null_i32()")?.unwrap_or_default(); With the latter form, the user can wave away the actual errors they likely can't do anything about and then turn the NULL case into something that's relevant to their application. And honestly, I don't see a problem with this: // SAFETY: ain't no way Postgres will ever return NULL here
let i:i32 = unsafe { Spi::get_one("SELECT 1")?.unwrap_unchecked() }; The null-ness of a Datum can only be determined at runtime. We shouldn't imply that the compiler can check it. In SQL, all types can have a NULL variant. If there were ways we could do some fancy compile-time sql type checking, constraint checking, or whatever, to ensure the nullability of an SQL value, man, I'd be all over that! |
That is not what I am proposing at all. I am proposing they spell out what they actually expect: let i: i32 = Spi::get_one("SELECT strictly_non_null_i32()")?; // ? for error
let i: Option<i32> = Spi::get_one("SELECT i_am_not_sure_if_it_may_be_a_null_i32()")?;
// or, for readability
let i: Nullable<i32> = Spi::get_one("SELECT i_am_not_sure_if_it_may_be_a_null_i32()")?; Now if they are SURE that it is a non-null value, and null does appear, and they did use a strict |
I think this is where our disconnect is. On the database side, NULL can be anywhere and is not part of the type system. In Rust, it effectively is, and we can and should use it – at least, this is how I think about it. |
I hope you are not serious. That's not a good way to get around an assumption that "anything can be null." If you are not sure: use Seems pretty clear to me. In the end, you're getting your Option-like interface where you need it, and none if you don't need it (where you're sure). And even if you were wrongly sure, an error is the worst thing that happens to you. And we have to handle the |
FWIW, this is easily avoidable: client.open_cursor("SELECT * FROM tests.cursor_table", None).map(|mut cursor| {
assert_eq!(sum_all(cursor.fetch(3)), 1 + 2 + 3);
cursor.detach_into_name()
}) Sure, the Then again, this won't be necessary at all if we let go of |
pgx represents an instantiated Datum as an This is built into As I've said before, pgx should not allow the programmer to specify expected types that might not be a valid representation of the Datum for any cases where Postgres itself would do a successful conversion. The only correct representation of an "i32" that comes from a Datum, is
That's a pretty terrible thing to happen because that means we've designed an API that gives the programmer no compiler help along the way to make sure they're handling possibly unexpected values when they should. And "when the should" is when they're writing their code. let i: i32 = Spi::get_one("SELECT strictly_non_null_i32()")?; // ? for error This code should not compile. It is not correct because an "i32 Datum" cannot be represented as an It may be undesired by the programmer. It may be unexpected by the programmer. It may be impossible to the programmer to imagine. None of that should change pgx' knowledge that any Datum can be NULL. All of this is the indicator that pgx must build its API signatures so that the programmer has to deal with reality in order for their code to even compile.
If the programmer thinks they're smarter than pgx, they're free to write that. I'd never encourage it, of course. I'm not a monster. But it's a fine way of the programmer making their assumptions clear without pgx silently injecting an Error they didn't handle when that assumption turns out to be wrong. |
It seems we're at an impasse here. You have merge/commit rights, and I don't. So, I can't win the argument! :) We see it differently. Error for a null where null is absolutely not expected but the programmer is an absolutely valid and correct API choice, period (if we're speaking in absolutes like "The only correct representation of an "i32" that comes from a Datum, is Option. That's it.") |
The good news is that there's nothing about always working with The bad news (or the other good news) is that pgx isn't going to help them do that by creating a runtime situation that they may not have realized they need to account for. I'm going to take a look at the |
That's not good news. That's just insertion of nullability whether you want/need it or not. I'm going to stop arguing about this. You have your own vision for pgx. I can't do much about that if you have made your mind up. |
This pulls out some commits from #969 into its own PR to add support for `#[pg_extern]`-style functions returning a `Result` of some kind. We don't care about the type of the Error, so long as it implements `Display`. If the returned value is the Err variant, the `IntoDatum` impl for `Result<T, E>` will raise a Postgres ERROR using `ereport!(ERROR, PgSqlErrorCode::ERRCODE_DATA_EXCEPTION, &format!("{}", e))`. It's tested to also work with the [`eyre::Result`](https://docs.rs/eyre/latest/eyre/type.Result.html) type along with returning `Result<Option<T>, E>`. The case of returning a `Result<(), E>` uncovered some issues with how pgx converts `()` into a Datum, so those have been fixed along the way. It also allows `#[pg_test]` functions to return Results which makes writing unit tests a little nicer.
This is not great as it doesn't allow for better handling of errors. Solution: redesign SPI signatures to accommodate for errors Importantly, this now enforces compatibility checks on Datum. This alone has led me to discovering a few bugs in pgx that are now fixed (primarily incorrect compatibility checks and wrong expectations in tests).
Solution: remove the dependency
Solution: get rid of by removing the need to double-check nullability
Feels like an unnecessary ceremony and makes code comprehension really difficult. Solution: don't try to handle NULL for every return type Instead, simply handle Option<T> when parameterized with it With this change, all you have to deal with is a Result.
There is nothing that actually implies some kind of SPI error may occur in the closure given to `connect`. This makes handling custom errors really clunky (using `Other` variant, a more convoluted `connect` type signature) Solution: lift the restriction that the error has to convert to spi::Error
Solution: make it adhere to Result returns
…hich caused a bunch of files (mostly tests and examples) to be updated. First off, the various Spi functions like `Spi::get_one()` go back to returning `-> Result<Option<T>, Error>`. This might seem counter to my comment at #902 (review), but there's an important thing Spi **must** support: Any Datum **can be NULL**, which means we must return `Option<T>`. So, the thing that makes this better is we also now have `impl<T, E> IntoDatum for Result<T, E> where T: IntoDatum, E: std::error::Error`. This means any `#[pg_extern]` function can now return a `Result`, and if that Result represents an error, it is raised as a Postgres ERROR. This is a big win for `#[pg_extern]`-style functions in general, and makes working with an Spi interface that returns `Result<Option<T>, Error>` pretty fluent as now you can just directly return the result from `Spi::get_XXX()` or you can use the `?` operator to nicely unwrap the error and then deal with the returned, possibly NULL, Datum. Doing this, and updating the Spi API caused all sorts of code to be touched in the tests and examples. As a result, there's (hacky) support for `#[pg_test]` functions being able to return `-> Result<(), pgx::spi::Error>`, which makes using Spi in the tests nearly the same, but now with a `?` operator and a `return Ok(())` at the end of such functions. The examples that use SPI needed to be touched as well. This is a pretty big breaking API change, afterall.
…s a `Result`. Requires special handling for `Result<Option<T>, E>`.
…::connect()` to have a fairly free signature of `pub fn connect<R, F: FnOnce(SpiClient<'_>) -> R>(f: F) -> R`. At the end of the day, I can't see any reason why it should force the user to return a Result. And if they do, it can be whatever `Result<T, E>` they want, and they can have all the proper error conversion/propagation code to deal with situations where maybe they need to convert between a few different types of `E` (say, a `pgx::spi::Error` and a `std::io::Error`). Also cleans up a few usages of `Spi::connect()` in examples/tests to better make use of this new signagure.
`Option<Option<T>>` is a nonsensical datum representation when instantiated as a concrete Rust type. A datum is always known as `Option<T>`. This requires that `impl<T: FromDatum> FromDatum for Vec<Option<T>>` impl in `pgx/src/datum/array.rs` come back from the dead. An Array can contain NULL elements. Also updates a couple of tests that were wanting `Spi::get_one::<Option<T>>()`, which now isn't a thing.
This exposed that `pgx::spi::Error` needed to derive `PartialEq`. Also add comments from @EdMcBane
9074043
to
bae3924
Compare
I think this is now ready for merge. I'm gonna let it sit overnight and mull things over. It's as big as it is because of all the updates and cleanups to the |
One last change I think I'm going to make in this PR, since it'll update a bunch of tests too, is get rid of |
Now that `Spi::connect()` can return anything (including nothing), from the user's side it (can be) functionally equivalent to `Spi::execute()`. So just remove `Spi::execute()` and update all the code accordingly.
Building upon @yrashk's work in #902 which changes the various SPI "getter" functions to return a
Result<T, E>
so that (at least) type conversion errors can be detected at runtime rather than crashing postgres, this PR makes a few additional changes.First off, the various Spi functions like
Spi::get_one()
go back to returning-> Result<Option<T>, Error>
. This might seem counter to my comment at #902 (review), but there's an important thing Spi must support: Any Datum can be NULL, which means we must returnOption<T>
.Secondly,
Spi::connect()
's return type no longer has any bounds. If the user needs to return something, even aResult
, they can now return whatever they like.Which leads us to...
Spi::execute()
has been removed. It is equivalent to aSpi::connect()
that returns nothing.Unfortunately, the general SPI API changes have caused all sorts of code to be touched in the tests and examples.
Regarding the tests, I've spent quite a bit of time cleaning them up. They're the biggest part of this PR, but I see little value in holding them for another PR when most of them needed to be touched anyways since they need to understand that
Spi::get_XXX()
returns aResult<Option<T>>
now./cc @yrashk