feat: add PARSE_TIME and FORMAT_TIME functions #7722
I've been wondering about this cache for a long time and haven't asked: why do we need it in the time/date/timestamp functions? If a query calls a time UDF with a specific format, the query uses only one format pattern for all rows, doesn't it? And if a query calls the UDF more than once (once per column) with different formats, doesn't each column have its own instance of `FormatTime`, which ends up with a single format pattern for all rows? I haven't verified this reasoning, but is that the right assumption?
I think it's because this function gets called every time there's a new record, so having a cache prevents it from having to recreate the formatter each time.
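To make the trade-off concrete, here is a minimal sketch of a per-pattern formatter cache. It is not the code from this PR: the class name `FormatterCacheSketch` and its methods are hypothetical, and it uses an unbounded `ConcurrentHashMap` rather than whatever bounded cache the UDF actually uses.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the class under review.
final class FormatterCacheSketch {
  // One formatter per distinct pattern, built lazily on first use.
  // Assumption: unbounded map, unlike a size-limited cache.
  private static final Map<String, DateTimeFormatter> FORMATTERS =
      new ConcurrentHashMap<>();

  static DateTimeFormatter formatterFor(final String pattern) {
    // DateTimeFormatter is immutable and thread-safe, so sharing it is fine.
    return FORMATTERS.computeIfAbsent(pattern, DateTimeFormatter::ofPattern);
  }

  static String formatTime(final LocalTime time, final String pattern) {
    return formatterFor(pattern).format(time);
  }

  public static void main(final String[] args) {
    // Called once per record; the pattern is compiled only the first time.
    System.out.println(formatTime(LocalTime.of(5, 45), "HH:mm"));
    System.out.println(formatterFor("HH:mm") == formatterFor("HH:mm"));
  }
}
```

If each UDF instance only ever sees one pattern, the map holds a single entry, which is the point raised above about the cache size.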
If it is instantiated once per record, then it probably makes sense. But that magic number of 1000 seems too big. We should dig into this more after 0.20 and see if we can get rid of the cache or make it hold exactly the number of formatters the row needs.
If `formatPattern` has characters such as days, months, etc., would they be added to the resulting string?
No, an exception gets thrown - I'll add a test for that.
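For reference, plain `java.time` behaves this way on its own: formatting a `LocalTime` with a pattern that contains date fields throws `UnsupportedTemporalTypeException`. The sketch below shows only that JDK behavior. The class name is hypothetical, and the exception the UDF finally surfaces may be a different (wrapping) type.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.UnsupportedTemporalTypeException;

// Hypothetical illustration of JDK behavior; not the UDF from this PR.
final class DateFieldInTimeFormatSketch {
  // Returns true if formatting a LocalTime with this pattern is rejected.
  static boolean formatThrows(final String pattern) {
    try {
      LocalTime.of(5, 45).format(DateTimeFormatter.ofPattern(pattern));
      return false;
    } catch (UnsupportedTemporalTypeException e) {
      // LocalTime has no year, month, or day fields to print.
      return true;
    }
  }

  public static void main(final String[] args) {
    System.out.println(formatThrows("HH:mm"));        // time-only pattern
    System.out.println(formatThrows("yyyy HH:mm"));   // pattern with a year
    System.out.println(formatThrows("MMM dd HH:mm")); // pattern with month and day
  }
}
```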
For easier reading, perhaps declaring a constant for this would be better? Is this a nanoseconds-per-second value?
Used TimeUnit conversion functions instead.
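As a rough illustration of what that change looks like, `java.util.concurrent.TimeUnit` replaces a hand-written multiplier with a named conversion. The thread does not show which units the PR converts between, so the milliseconds/nanoseconds pairing and the helper names below are assumptions.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helpers; the units actually used in the PR are not shown here.
final class TimeUnitSketch {
  // Replaces something like `millis * 1000000` with a named conversion.
  static long millisToNanos(final long millis) {
    return TimeUnit.MILLISECONDS.toNanos(millis);
  }

  // Converting to a coarser unit truncates toward zero.
  static long nanosToMillis(final long nanos) {
    return TimeUnit.NANOSECONDS.toMillis(nanos);
  }

  public static void main(final String[] args) {
    System.out.println(millisToNanos(1L));
    System.out.println(nanosToMillis(1_500_000L));
    System.out.println(TimeUnit.SECONDS.toNanos(1L));
  }
}
```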
Same two questions as in `FormatTime`.
Currently, if we parse something like `parse_time('2021 05:45', 'yyyy HH:mm')`, it parses everything but returns only the time component (so in this case, it returns 05:45). It's odd that `LocalTime.parse` doesn't throw anything.
Added a check to reject formats with non-time elements
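One way such a check could look, as a sketch only: parse into a generic `TemporalAccessor`, then reject the input if any date-based `ChronoField` was resolved. The class and method names and the `IllegalArgumentException` are assumptions for illustration; the check in the PR may be implemented differently.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;
import java.util.Arrays;

// Hypothetical illustration; not necessarily the check added in this PR.
final class TimeOnlyParseSketch {
  static LocalTime parseTimeOnly(final String text, final String pattern) {
    final TemporalAccessor parsed = DateTimeFormatter.ofPattern(pattern).parse(text);
    // A time-only pattern should not produce any date-based field.
    final boolean hasDateField = Arrays.stream(ChronoField.values())
        .filter(ChronoField::isDateBased)
        .anyMatch(parsed::isSupported);
    if (hasDateField) {
      throw new IllegalArgumentException("Time format contains date field: " + pattern);
    }
    return LocalTime.from(parsed);
  }

  public static void main(final String[] args) {
    // Plain LocalTime.parse silently drops the year, as described above.
    System.out.println(
        LocalTime.parse("2021 05:45", DateTimeFormatter.ofPattern("yyyy HH:mm")));
    System.out.println(parseTimeOnly("05:45", "HH:mm"));
    try {
      parseTimeOnly("2021 05:45", "yyyy HH:mm");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```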