fix(sqllab/charts): casting from timestamp[us] to timestamp[ns] would result in out of bounds timestamp #18873
…f bounds timestamp from sqllab and charts
Thanks for working on this! It would be nice if we could add a unit test to ensure these far in the past/future timestamps work with this fix (optimally those unit tests should fail on master branch).
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue
Any chance this could be merged?
@villebro Sorry for the delay, I've added a couple of unit tests.
Thanks @yeachan153! Let me review/test now
Codecov Report
@@            Coverage Diff             @@
##           master   #18873      +/-   ##
==========================================
+ Coverage   66.36%   66.53%   +0.16%
==========================================
  Files        1621     1714      +93
  Lines       63057    65051    +1994
  Branches     6382     6724     +342
==========================================
+ Hits        41850    43280    +1430
- Misses      19547    20060     +513
- Partials     1660     1711      +51
This looks great! Very minor non-blocking comment (I'll leave the PR open for another day if you feel like fixing, otherwise merging as-is). Also, I love how the tests were placed at precisely the error threshold, very precise 😆
tests/unit_tests/dataframe_test.py
@pytest.mark.parametrize(
    "inpt, expected",
    [
        pytest.param(
            [
                (datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), 1),
                (datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), 2),
            ],
            [
                {"a": datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), "b": 1},
                {"a": datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), "b": 2},
            ],
            id="timestamp conversion fail"
        ),
        pytest.param(
            [
                (datetime.strptime("1677-09-22 00:12:44", "%Y-%m-%d %H:%M:%S"), 1),
                (datetime.strptime("2262-04-11 23:47:16", "%Y-%m-%d %H:%M:%S"), 2)
            ],
            [
                {"a": Timestamp("1677-09-22 00:12:44"), "b": 1},
                {"a": Timestamp("2262-04-11 23:47:16"), "b": 2}
            ],
            id="timestamp conversion success"
        )
    ]
)
def test_max_pandas_timestamp(inpt, expected) -> None:
    from superset.db_engine_specs import BaseEngineSpec
    from superset.result_set import SupersetResultSet

    cursor_descr: DbapiDescription = [
        ("a", "datetime", None, None, None, None, False),
        ("b", "int", None, None, None, None, False),
    ]
    results = SupersetResultSet(inpt, cursor_descr, BaseEngineSpec)
nit: this feels more pythonic:
@pytest.mark.parametrize(
    "input_, expected",
    [
        pytest.param(
            [
                (datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), 1),
                (datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), 2),
            ],
            [
                {"a": datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), "b": 1},
                {"a": datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), "b": 2},
            ],
            id="timestamp conversion fail"
        ),
        pytest.param(
            [
                (datetime.strptime("1677-09-22 00:12:44", "%Y-%m-%d %H:%M:%S"), 1),
                (datetime.strptime("2262-04-11 23:47:16", "%Y-%m-%d %H:%M:%S"), 2)
            ],
            [
                {"a": Timestamp("1677-09-22 00:12:44"), "b": 1},
                {"a": Timestamp("2262-04-11 23:47:16"), "b": 2}
            ],
            id="timestamp conversion success"
        )
    ]
)
def test_max_pandas_timestamp(input_, expected) -> None:
    from superset.db_engine_specs import BaseEngineSpec
    from superset.result_set import SupersetResultSet

    cursor_descr: DbapiDescription = [
        ("a", "datetime", None, None, None, None, False),
        ("b", "int", None, None, None, None, False),
    ]
    results = SupersetResultSet(input_, cursor_descr, BaseEngineSpec)
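
The two parametrized cases appear to encode a fallback: when the datetime column cannot be cast to nanosecond precision, the rows keep plain `datetime` objects, and when it can, they come back as pandas `Timestamp` values. A rough, standard-library-only sketch of that expectation follows. It is not Superset's code: `convert_datetime_column` is a hypothetical helper, integers stand in for `Timestamp`, and falling back for the whole column at once is an assumption.

```python
from datetime import datetime
from typing import List, Union

INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)  # pandas reserves this value for NaT
EPOCH = datetime(1970, 1, 1)


def to_epoch_ns(ts: datetime) -> int:
    """Return a naive datetime as integer nanoseconds since the Unix epoch."""
    delta = ts - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def convert_datetime_column(
    values: List[datetime],
) -> Union[List[int], List[datetime]]:
    """Hypothetical helper: cast to nanoseconds only if every value fits."""
    converted = [to_epoch_ns(value) for value in values]
    if all(INT64_MIN < ns <= INT64_MAX for ns in converted):
        return converted  # stand-in for a column of pandas Timestamps
    return list(values)  # fallback: leave the datetime objects untouched


in_range = [datetime(2262, 4, 11, 23, 47, 16)]
out_of_range = [datetime(2262, 4, 11, 23, 47, 17)]
print(convert_datetime_column(in_range))      # nanosecond integers
print(convert_datetime_column(out_of_range))  # unchanged datetime objects
```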
Thanks for reviewing! Should be changed now
@yeachan153 there are also some linting issues - if you're ok with it, I can fix the remaining issues and push directly to your PR?
Thanks, I think it should be fixed now? But feel free to push anything!
Thanks @yeachan153! Let's see if CI passes - if not I'll push the remaining fixes so we can get it merged
Ah sorry, I didn't realise you can check which files black is complaining about in the workflow. There were two more files to lint. It should actually work this time!
Restarted CI - fingers crossed 🙂🤞
… result in out of bounds timestamp (apache#18873)
* fix casting from timestamp[us] to timestamp[ns] would result in out of bounds timestamp from sqllab and charts
* Add unittests
* Lint changes and parameter variable rename
* Fix linting
SUMMARY
Addresses #18871, #18596, #16487, #13661. This allows users to query date columns outside the maximum ranges of pandas timestamps: 1677-09-22 00:12:43.145225 and 2262-04-11 23:47:16.854775807. The same fix as #14006, with a small additional fix for charts to hide the timestamps it cannot convert.
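
The limits quoted above follow from storing a timestamp as a signed 64-bit count of nanoseconds since the Unix epoch, which is what the `timestamp[ns]` cast targets. The sketch below reproduces that arithmetic with the standard library only; `to_epoch_ns` and `fits_in_ns` are illustrative names, not part of pandas, pyarrow, or Superset.

```python
from datetime import datetime

# Assumption: values are signed 64-bit nanosecond offsets from the Unix epoch,
# as in pandas' default datetime64[ns] dtype.
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)  # pandas reserves this value for NaT
EPOCH = datetime(1970, 1, 1)


def to_epoch_ns(ts: datetime) -> int:
    """Return a naive datetime as integer nanoseconds since the Unix epoch."""
    delta = ts - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def fits_in_ns(ts: datetime) -> bool:
    """True if the value survives a cast from microseconds to nanoseconds."""
    return INT64_MIN < to_epoch_ns(ts) <= INT64_MAX


print(fits_in_ns(datetime(2262, 4, 11, 23, 47, 16)))  # True
print(fits_in_ns(datetime(2262, 4, 11, 23, 47, 17)))  # False
```

By this arithmetic the lower limit falls on 1677-09-21 at 00:12:43.145, one day earlier than the date quoted above, which agrees with what recent pandas versions report for `Timestamp.min`.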
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Sql Lab before:
Sql Lab after can query past the date ranges mentioned above:
Charts before:
Charts after now show data, excluding the dates that reside outside this period:
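
The chart-side behaviour described in the summary (hiding timestamps that cannot be converted) could look roughly like the filter below. This is a standard-library sketch, not Superset's implementation; `drop_unconvertible`, the row layout, and the `"ts"` key are all hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List

INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)  # pandas reserves this value for NaT
EPOCH = datetime(1970, 1, 1)


def fits_in_ns(ts: datetime) -> bool:
    """True if the datetime fits in signed 64-bit nanoseconds since the epoch."""
    delta = ts - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    ns = whole_seconds * 1_000_000_000 + delta.microseconds * 1_000
    return INT64_MIN < ns <= INT64_MAX


def drop_unconvertible(
    rows: List[Dict[str, Any]], time_key: str = "ts"
) -> List[Dict[str, Any]]:
    """Keep only the rows whose time column can be cast to nanoseconds."""
    return [row for row in rows if fits_in_ns(row[time_key])]


rows = [
    {"ts": datetime(2022, 1, 1), "value": 10},
    {"ts": datetime(3000, 1, 1), "value": 20},  # beyond the nanosecond range
]
print(drop_unconvertible(rows))  # only the 2022 row remains
```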
TESTING INSTRUCTIONS
In Sql Lab, run: select TIMESTAMP any_timestamp_outside_ranges_mentioned_above
For charts, with such a column as the TIME COLUMN, it should still return a chart successfully and omit the date ranges that are unsupported.
ADDITIONAL INFORMATION