Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding timeAdd with timezone kernel #1700
Adding timeAdd with timezone kernel #1700
Changes from 13 commits
81c1fd3
9205162
c698d96
94a4891
c9e121c
6b09249
b3723e3
d710ce2
eb48460
9e7e6df
43d4ff6
0bddee8
ef08f54
fbadd79
758f2fa
eadce5c
6fd2fb7
f483608
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a null check on the duration handle too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is problematic because c++ is going to say that
duration_handle
is a pointer tocudf::duration_scalar<cudf::duration_us>
, even if it is not. The only thing that is guaranteed is.Beyond that you will need to use dynamic_cast where C++ will verify that it is the type you expect. Note that dynamic_cast returns nullptr if it is not the right type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I moved the
dynamic_cast
and type checking to timezones.cu L392-L386 because scalar is forward declaration here. I'm not quite sure I did things right there, please help check again.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we do this some place else? Shouldn't transitioning between time zones like this be a static inline function? And if not can you add comments as to why this is different from
GpuTimeZoneDB.fromUtcTimestampToTimestamp
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference between this and
convert_timestamp_tz_functor
is that we want to keep theto_local_offset
. I updated the comment.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think there is a enough code to make it common. If we take the time to make the proper abstractions on the CUDA side then adding DST support should have code changes restricted primarily to one common place. But I will leave that up to you and the rest of the people working on timezone support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we tested with more microseconds than in a day, along with a days value? I know that this is an odd corner case, but if we have to apply days separately from microseconds wouldn't this result in the same kind of errors if we assume that we can always
convert microseconds_per_day
into days? It looks like for DayTimeIntervalType microsecond values outside of a day are not technically allowed, so we are probably okay there, but CalendarIntervalType does not indicate any limitations like that, and reading the code there are none.We might be able to work around this by restricting the usage of CalendarInterval, but I'd prefer not to need to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following test can pass with this pr:
It's an interesting case I hadn't thought of, I'll check the code to see if there is some validation in spark/pyspark to make this test pass.
If it is possible to pass an invalid CalendarInterval, we can have a simple workaround in plugin side to pass only the days to this kernel and then add the microseconds directly to the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this case failed on 311. Spark seems to automatically use DayTimeIntervalType for interval literals in higher versions. Will match this behavior in plugin PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matched this in plugin datetimeExpressionsUtils.scala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: duration parameter has 2 types:
numeric_scalar
andcolumn_view
.Extract a template for duration parameter to combine the two
time_add
functions to one function.