Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mapping PG interval type #629

Merged
merged 30 commits into from
Feb 20, 2023

Conversation

mhov
Copy link
Contributor

@mhov mhov commented Aug 15, 2022

Interval (any collision/confusion potential with pg_sys::Interval?) is an opaque type that can convert to/from time::Duration. time::Duration has a wider range of representation than PG's. Because the native datum is a combination of i32 month/day "leaps" with a precision i64 microsecond (usually less than 1 day's worth) offset, it's non-trivial to find a true MIN/MAX value. So we're artificially constraining the bound check of Duration-into-Interval to PG's stated interval range of -/+ 178_000_000 years.

@mhov mhov marked this pull request as ready for review August 15, 2022 23:32
@workingjubilee
Copy link
Member

Because the native datum is a combination of i32 month/day "leaps" with a precision i64 microsecond (usually less than 1 day's worth) offset, it's non-trivial to find a true MIN/MAX value.

Can you include this kind of explanation of this type in this PR in docs? References into the Postgres docs is fine, just something that makes it easy to cross-reference without doing more legwork, ideally. I legitimately find their choice of layout confusing, so a better understanding of the intent would be nice.

@mhov
Copy link
Contributor Author

mhov commented Aug 30, 2022

i've added some background to the storage layout and some important qualifying info to the methods/accessors

@workingjubilee
Copy link
Member

...I feel sufficiently bothered by Postgres's decisions I almost want to say days should be an #[repr(i32)] enum Days? Is that too ridiculous?

@mhov
Copy link
Contributor Author

mhov commented Sep 1, 2022

ok did some C investigations here
basically starting with this

    Interval *span = PG_GETARG_INTERVAL_P(0);
    elog(INFO, "Interval { month: %d, day: %d, time: %ld }", span->month, span->day, span->time);

and got these results

select ris_c.describe_interval(interval'1 month 29 days 50 seconds');
INFO:  Interval { month: 1, day: 29, time: 50000000 }

select ris_c.describe_interval(interval'1 month 30 days 50 seconds');
INFO:  Interval { month: 1, day: 30, time: 50000000 }

select ris_c.describe_interval(interval'1 month 30 days 25 hours');
INFO:  Interval { month: 1, day: 30, time: 90000000000 }

select ris_c.describe_interval(interval'80 days 50 seconds');
INFO:  Interval { month: 0, day: 80, time: 50000000 }

So this is as expected according to the docs based on the way you input units.
interval'60 days' != interval'2 months' because adding 2 months to Feb 1st, is very different from adding 60 days to Feb 1st. so they can't safely rollover units up the scale for time(hours) into days (due to leap second days). they can't rollover days into months because the true length of the month depends on the date you ADD the interval to. SO they take it exactly as you specify it. I feel we should do the same with our abstraction. It's only in the time::Duration into Interval conversion that I'd taken the pains to normalize the units. However when looking at interval.c it would convert timestamp-like structures (where only offset-usecs are given (similar to time::Duration) it only normalizes years->months (that's 12 no matter what) then puts everything else in the i64 time field, so overflow looks to be fine there.

static int
tm2interval(struct tm *tm, fsec_t fsec, interval * span)
{
	if ((double) tm->tm_year * MONTHS_PER_YEAR + tm->tm_mon > INT_MAX ||
		(double) tm->tm_year * MONTHS_PER_YEAR + tm->tm_mon < INT_MIN)
		return -1;
	span->month = tm->tm_year * MONTHS_PER_YEAR + tm->tm_mon;
	span->time = (((((((tm->tm_mday * INT64CONST(24)) +
					   tm->tm_hour) * INT64CONST(60)) +
					 tm->tm_min) * INT64CONST(60)) +
				   tm->tm_sec) * USECS_PER_SEC) + fsec;

	return 0;
}	

I'm going to adjust my time::Duration -> Invterval to do the same. I'm going to remove any limits on days/usec that were in there and we'll just take the approach that PG is. People shouldn't really be reading these PG fields directly, I only expose them via the methods so that someone could make their own chrono <-> Interval mapping.

@mhov
Copy link
Contributor Author

mhov commented Sep 1, 2022

k i've made the adjustments i described before.

I no longer used this const USECS_PER_DAY in time.rs, but I liked the idea of it, i allow(dead_code)'d it. But i feel like it's probably be against your policy, and YAGNI. Let me know if it should go

#[allow(dead_code)]
pub(crate) const USECS_PER_DAY: i64 = pg_sys::SECS_PER_DAY as i64 * USECS_PER_SEC;

@workingjubilee
Copy link
Member

Thanks for investigating that! Yes, I agree, if Postgres actually produces "out of bounds" (highly relatively speaking) intervals, I agree that we shouldn't try to validate inputs (...now that I understand what Postgres is doing a bit better).

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR will need to be updated to use the new type mapping strategy after #615 is merged, so 0.5.1 is the more practical time for this to be released.

@mhov
Copy link
Contributor Author

mhov commented Sep 21, 2022

updated for #615

@workingjubilee workingjubilee dismissed their stale review September 21, 2022 03:47

Request now stale but approval not yet granted.

@workingjubilee
Copy link
Member

Apologies, it seems #677 was also fated to stomp on this. 😬

pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
@mhov
Copy link
Contributor Author

mhov commented Nov 3, 2022

@workingjubilee I made an attempt at the PgBox change, in it's PgBox::<pg_sys::Interval>::allocate() form in the one place where a pg_sys::Interval would be created by us. There weren't many previous datum examples to go from, so I hope it's what was intended.

@workingjubilee
Copy link
Member

Going to attempt to finish review on this and actually get it merged soon.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned these in a Discord discussion but to formally register the remarks, for transparency's sake (and easier reference):

pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
@mhov
Copy link
Contributor Author

mhov commented Dec 29, 2022

when the automated testing fails for transient non-code reasons (timeouts, etc) should we do anything? Not sure if the failures hold anything up.

@mhov
Copy link
Contributor Author

mhov commented Jan 3, 2023

"worked on my machine" ^_^ my local develop didn't have the most recent SPI changes, all fixed now

@mhov
Copy link
Contributor Author

mhov commented Jan 29, 2023

@workingjubilee am I missing something? It says you still have a change requested but i don't see anything in the PR thread
image

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this was waiting on me to re-review the pointer manipulation for correctness and doublecheck the API. Things look okay here. I have just one final nit and this should be good to go in.

pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
pgx/src/datum/interval.rs Outdated Show resolved Hide resolved
@mhov
Copy link
Contributor Author

mhov commented Feb 18, 2023

@workingjubilee, i've renamed those, thanks

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hokay! Then I think we can call this done! Thank you!

@workingjubilee workingjubilee merged commit 8372d2c into pgcentralfoundation:develop Feb 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants