fix: Fixing pinot query generation for date format conversion from python datetime format to java simple date format #13163

Merged: villebro merged 3 commits into apache:master from fixing_simple_date_format on Feb 20, 2021

Contributor

@xiangfu0 xiangfu0 commented Feb 17, 2021

SUMMARY

Fixes Pinot query generation so that date formats are converted from Python's datetime format to Java's SimpleDateFormat.
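
To make the conversion concrete, here is a minimal sketch of translating a Python datetime format string into a Java SimpleDateFormat pattern. The mapping table and the function name `python_to_java_date_format` are hypothetical and cover only a handful of directives; this is not the code merged in this PR.

```python
import re

# Hypothetical subset of directives, chosen for illustration only.
# Python strftime directive -> Java SimpleDateFormat pattern letters.
PYTHON_TO_JAVA_DIRECTIVES = {
    "%Y": "yyyy",  # four-digit year
    "%m": "MM",    # two-digit month
    "%d": "dd",    # two-digit day of month
    "%H": "HH",    # hour, 00-23
    "%M": "mm",    # minute
    "%S": "ss",    # second
}


def python_to_java_date_format(pdf):
    """Translate a Python date format into a Java SimpleDateFormat pattern."""

    def replace(match):
        directive = match.group(0)
        if directive not in PYTHON_TO_JAVA_DIRECTIVES:
            # Fail loudly instead of emitting a pattern Pinot would misread.
            raise NotImplementedError(f"Unsupported directive {directive!r}")
        return PYTHON_TO_JAVA_DIRECTIVES[directive]

    # A single pass avoids re-translating text that was already substituted.
    return re.sub(r"%[A-Za-z]", replace, pdf)


print(python_to_java_date_format("%Y-%m-%d"))
```

The sketch does not quote literal letters in the input, which SimpleDateFormat would otherwise interpret as pattern letters, so it only suits formats made of directives and punctuation.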

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

After:
Generated query and chart:

SELECT DATETIMECONVERT(FlightDate, '1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', '1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', '1:DAYS'),
       COUNT(*)
FROM "airlineStats"."airlineStats"
GROUP BY DATETIMECONVERT(FlightDate, '1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', '1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', '1:DAYS')
LIMIT 10000;
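
The time expression in the query above follows a regular shape, which the following sketch reproduces. The helper name `datetimeconvert_expr` and its parameters are assumptions for illustration; the real expression is built inside `PinotEngineSpec.get_timestamp_expr`, whose implementation is not shown in this conversation.

```python
def datetimeconvert_expr(column, java_format, granularity):
    """Build a Pinot DATETIMECONVERT call for a SIMPLE_DATE_FORMAT column.

    Hypothetical helper: the input and output formats are identical, as in
    the generated query above, and only the granularity buckets the values.
    """
    time_format = f"1:SECONDS:SIMPLE_DATE_FORMAT:{java_format}"
    return (
        f"DATETIMECONVERT({column}, '{time_format}', "
        f"'{time_format}', '{granularity}')"
    )


print(datetimeconvert_expr("FlightDate", "yyyy-MM-dd", "1:DAYS"))
```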

[Screenshot: chart rendered from the generated query]

TEST PLAN

Adding unit test for query generation.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

@codecov-io

codecov-io commented Feb 17, 2021

Codecov Report

Merging #13163 (63a1f0f) into master (2ce7982) will increase coverage by 14.39%.
The diff coverage is 33.04%.

@@             Coverage Diff             @@
##           master   #13163       +/-   ##
===========================================
+ Coverage   53.06%   67.45%   +14.39%     
===========================================
  Files         489      492        +3     
  Lines       17314    29041    +11727     
  Branches     4482        0     -4482     
===========================================
+ Hits         9187    19590    +10403     
- Misses       8127     9451     +1324     
Flag Coverage Δ
cypress ?
python 67.45% <33.04%> (?)

Impacted Files Coverage Δ
superset/examples/birth_names.py 73.19% <ø> (ø)
...43f2fdb_add_granularity_to_charts_where_missing.py 0.00% <0.00%> (ø)
...s/260bf0649a77_migrate_x_dateunit_in_time_range.py 0.00% <0.00%> (ø)
...ons/versions/41ce8799acc3_rename_pie_label_type.py 0.00% <0.00%> (ø)
superset/connectors/sqla/views.py 62.43% <4.34%> (ø)
superset/views/datasource.py 88.70% <16.66%> (ø)
superset/charts/commands/exceptions.py 92.85% <77.77%> (ø)
superset/utils/core.py 88.26% <81.57%> (ø)
superset/db_engine_specs/elasticsearch.py 90.24% <87.50%> (ø)
superset/config.py 90.68% <100.00%> (ø)
... and 937 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bc4c837...63a1f0f.

@xiangfu0 xiangfu0 force-pushed the fixing_simple_date_format branch 9 times, most recently from 0dfc4ab to d230927 Compare February 17, 2021 09:03
@pull-request-size pull-request-size bot added size/M and removed size/S labels Feb 17, 2021
@xiangfu0 xiangfu0 force-pushed the fixing_simple_date_format branch 3 times, most recently from dcb7a85 to 386b9e9 Compare February 17, 2021 09:13
Member

@villebro villebro left a comment

Thanks for fixing again! Quick bycatch comment

superset/db_engine_specs/pinot.py
@villebro
Member

@fx19880617 btw you may want to cherry pick this into your installation, as I believe this fix also affects Pinot: #13138

@xiangfu0 xiangfu0 force-pushed the fixing_simple_date_format branch 2 times, most recently from 9f713ad to 19766f5 Compare February 17, 2021 09:36
@xiangfu0
Contributor Author

> @fx19880617 btw you may want to cherry pick this into your installation, as I believe this fix also affects Pinot: #13138

Sure, is it on the latest docker image?

@xiangfu0 xiangfu0 force-pushed the fixing_simple_date_format branch 6 times, most recently from a65185a to b5c011b Compare February 17, 2021 10:59
@xiangfu0 xiangfu0 force-pushed the fixing_simple_date_format branch from b5c011b to 43732c3 Compare February 17, 2021 11:16
@junlincc junlincc added the data:connect:pinot label Feb 17, 2021
@xiangfu0 xiangfu0 requested a review from villebro February 17, 2021 20:06
Member

@villebro villebro left a comment

For completeness, could we also add an assertion for missing pdf and invalid time grain spec?

@xiangfu0
Contributor Author

> For completeness, could we also add an assertion for missing pdf and invalid time grain spec?

It's already there: line 72 has the pdf check and line 99 has the time_grain check.

@villebro
Member

villebro commented Feb 18, 2021

> it's already there on line 72 for pdf check and line 99 for time_grain check.

Yes, I meant a simple unit test to test for those lines (sorry for being unclear). The reason I'm asking is Pinot is a fairly different connector compared to others, so I'd like to make sure we have good coverage for it to guard against regressions.
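
A unit test of the kind requested here might look like the sketch below. `get_timestamp_expr` is a hypothetical stand-in, not `PinotEngineSpec.get_timestamp_expr`; only its two error messages are taken from this thread, and the `SUPPORTED_GRAINS` table is invented for illustration.

```python
import unittest

# Hypothetical grain table, for illustration only.
SUPPORTED_GRAINS = {"P1D": "1:DAYS", "PT1H": "1:HOURS"}


def get_timestamp_expr(col, pdf, time_grain):
    """Stand-in that mirrors the two guard clauses discussed in this thread."""
    if not pdf:
        raise NotImplementedError(f"Empty date format for '{col}'")
    if time_grain not in SUPPORTED_GRAINS:
        raise NotImplementedError(f"No pinot grain spec for '{time_grain}'")
    return f"{col} bucketed by {SUPPORTED_GRAINS[time_grain]}"


class TestInvalidArguments(unittest.TestCase):
    def test_invalid_arguments_raise(self):
        cases = [
            ("missing pdf", None, "P1D"),
            ("invalid time grain", "%Y-%m-%d", "invalid_grain"),
        ]
        for label, pdf, grain in cases:
            # subTest reports each case separately if one of them fails.
            with self.subTest(label):
                with self.assertRaises(NotImplementedError):
                    get_timestamp_expr("tstamp", pdf, grain)
```

Covering both guard clauses in one parametrised test keeps the regression check short while still exercising each raise.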

@xiangfu0 xiangfu0 force-pushed the fixing_simple_date_format branch from 890f147 to 892f67d Compare February 18, 2021 09:17
@xiangfu0 xiangfu0 force-pushed the fixing_simple_date_format branch from 892f67d to 851aa5d Compare February 18, 2021 09:48
def test_invalid_get_time_expression_arguments(self):
with self.assertRaises(NotImplementedError) as context:
PinotEngineSpec.get_timestamp_expr(column("tstamp"), None, "P1M")
self.assertEqual("Empty date format for 'tstamp'", str(context.exception))
Member

Is this indented correctly? I'm fine removing the string comparison, testing for the raise is adequate IMO.
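
The indentation matters because any statement placed inside an `assertRaises` block after the raising call is never executed. The sketch below illustrates that general behaviour with a hypothetical `raise_empty_format` helper; it makes no claim about how the snippet under review was actually indented.

```python
import unittest


def raise_empty_format():
    # Hypothetical helper that always raises, like the guard under test.
    raise NotImplementedError("Empty date format for 'tstamp'")


case = unittest.TestCase()
reached = []

with case.assertRaises(NotImplementedError) as context:
    raise_empty_format()
    reached.append("inside block")  # skipped: the line above already raised

# Code after the block runs, and context.exception is available here.
reached.append("after block")
case.assertEqual("Empty date format for 'tstamp'", str(context.exception))
print(reached)
```

An assertion on the message therefore has to sit at the same level as the `with` statement, or be dropped if testing for the raise is enough.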

Contributor Author

I tried using the VS Code pylint plugin and black to format the Python file, but neither passes the Python lint check here :(
The black cmd I'm running is
black -t py37 tests/db_engine_specs/pinot_tests.py

Contributor Author

I also tried:

        self.assertRaises(
            NotImplementedError,
            PinotEngineSpec.get_timestamp_expr(column("tstamp"), None, "P1M"),
        )

but the test doesn't pass for this, hence I capture the exception and compare the string:

================================================================================================= FAILURES ==================================================================================================
_____________________________________________________________________ TestPinotDbEngineSpec.test_invalid_get_time_expression_arguments ______________________________________________________________________

self = <tests.db_engine_specs.pinot_tests.TestPinotDbEngineSpec testMethod=test_invalid_get_time_expression_arguments>

    def test_invalid_get_time_expression_arguments(self):
        self.assertRaises(
            NotImplementedError,
>           PinotEngineSpec.get_timestamp_expr(column("tstamp"), None, "P1M"),
        )

tests/db_engine_specs/pinot_tests.py:71:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cls = <class 'superset.db_engine_specs.pinot.PinotEngineSpec'>, col = <sqlalchemy.sql.elements.ColumnClause at 0x11f54a5e0; tstamp>, pdf = None, time_grain = 'P1M', type_ = None

    @classmethod
    def get_timestamp_expr(
        cls,
        col: ColumnClause,
        pdf: Optional[str],
        time_grain: Optional[str],
        type_: Optional[str] = None,
    ) -> TimestampExpression:
        if not pdf:
>           raise NotImplementedError(f"Empty date format for '{col}'")
E           NotImplementedError: Empty date format for 'tstamp'

superset/db_engine_specs/pinot.py:72: NotImplementedError
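
The failure above happens because Python evaluates the arguments before calling `assertRaises`, so `get_timestamp_expr(...)` raises before `assertRaises` ever runs. `assertRaises` needs either the callable and its arguments passed separately, or the context-manager form. The sketch below shows both, using a hypothetical stand-in function rather than the real `PinotEngineSpec`.

```python
import unittest


def get_timestamp_expr(col, pdf, time_grain):
    # Hypothetical stand-in reproducing only the guard shown in the traceback.
    if not pdf:
        raise NotImplementedError(f"Empty date format for '{col}'")
    return f"{col}:{pdf}:{time_grain}"


class TestAssertRaisesForms(unittest.TestCase):
    def test_callable_form(self):
        # Pass the function and its arguments separately; assertRaises
        # makes the call itself and catches the exception.
        self.assertRaises(
            NotImplementedError, get_timestamp_expr, "tstamp", None, "P1M"
        )

    def test_context_manager_form(self):
        # The call happens inside the block, so the exception is caught.
        with self.assertRaises(NotImplementedError):
            get_timestamp_expr("tstamp", None, "P1M")
```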

Member

I tweaked the tests slightly and pushed to the branch; I hope you don't mind (if you disagree with the changes, please feel free to git reset --hard HEAD~1!)

Comment on lines 77 to 79
self.assertEqual(
    "No pinot grain spec for 'invalid_grain'", str(context.exception)
)
Member

Same here

@villebro
Member

@fx19880617 let me know if my last change is ok by you - if so I can merge

@xiangfu0
Contributor Author

> @fx19880617 let me know if my last change is ok by you - if so I can merge

LGTM, thanks for fixing it @villebro

@villebro villebro merged commit 786c12d into apache:master Feb 20, 2021
@mistercrunch mistercrunch added the 🏷️ bot and 🚢 1.2.0 labels Mar 12, 2024
Labels: 🏷️ bot, data:connect:pinot, size/M, 🚢 1.2.0