Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bugfix] Druid run_query dimensions part 3 + Unit tests #3949

Merged
merged 2 commits into from
Dec 1, 2017

Conversation

Mogball
Copy link
Contributor

@Mogball Mogball commented Nov 29, 2017

#3920 (comment)

Hopefully this'll be the last one. Added some unit tests to try to save some face 🥇

@@ -898,7 +898,7 @@ def run_query( # noqa / druid
row_limit=None,
inner_from_dttm=None, inner_to_dttm=None,
orderby=None,
extras=None, # noqa
extras={}, # noqa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a python gotcha:
http://docs.python-guide.org/en/latest/writing/gotchas/#mutable-default-arguments,
just set it to None here and go with extras = extras or {} in the function body

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Mogball Mogball force-pushed the jeff/druid_dim_part3 branch from 8ba00fc to 8bed036 Compare November 29, 2017 01:43
@ghulands
Copy link

ghulands commented Dec 1, 2017

Just seeing if you tested this with a time series chart too? That too was still broken after #3920.

@Mogball
Copy link
Contributor Author

Mogball commented Dec 1, 2017

Yes I checked it with that. The test case is the one where timeseries_limit is specified

@ghulands
Copy link

ghulands commented Dec 1, 2017

I just tested this and it fails for the time series charts. I'm using a column in the group by having the same setup as I reported in #3943. Using a data table view works fine.

Traceback (most recent call last):
  File "/app/.heroku/python/lib/python2.7/site-packages/superset/viz.py", line 275, in get_payload
    df = self.get_df()
  File "/app/.heroku/python/lib/python2.7/site-packages/superset/viz.py", line 97, in get_df
    self.results = self.datasource.query(query_obj)
  File "/app/.heroku/python/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 1078, in query
    client=client, query_obj=query_obj, phase=2)
  File "/app/.heroku/python/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 865, in get_query_str
    return self.run_query(client=client, phase=phase, **query_obj)
  File "/app/.heroku/python/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 1056, in run_query
    filters,
  File "/app/.heroku/python/lib/python2.7/site-packages/superset/connectors/druid/models.py", line 874, in _add_filter_from_pre_query_data
    f = Dimension(dim) == row[dim]
  File "/app/.heroku/python/lib/python2.7/site-packages/pandas/core/series.py", line 642, in __getitem__
    return self._get_with(key)
  File "/app/.heroku/python/lib/python2.7/site-packages/pandas/core/series.py", line 683, in _get_with
    return self.loc[key]
  File "/app/.heroku/python/lib/python2.7/site-packages/pandas/core/indexing.py", line 1328, in __getitem__
    return self._getitem_axis(key, axis=0)
  File "/app/.heroku/python/lib/python2.7/site-packages/pandas/core/indexing.py", line 1541, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/app/.heroku/python/lib/python2.7/site-packages/pandas/core/indexing.py", line 1081, in _getitem_iterable
    self._has_valid_type(key, axis)
  File "/app/.heroku/python/lib/python2.7/site-packages/pandas/core/indexing.py", line 1418, in _has_valid_type
    (key, self.obj._get_axis_name(axis)))
KeyError: "None of [[u'outputName', u'outputType', u'type', u'extractionFn', u'dimension']] are in the [index]"

@Mogball
Copy link
Contributor Author

Mogball commented Dec 1, 2017

Looks like it's an issue with _add_filter_from_pre_query_data, which doesn't properly handle dimensions specs (and is called for timeseries). I think that's a bit outside the scope of this fix. And I'm not really familiar with PyDruid.

@mistercrunch mistercrunch merged commit 8f00e9e into apache:master Dec 1, 2017
michellethomas pushed a commit to michellethomas/panoramix that referenced this pull request May 24, 2018
* Fixed and added tests for druid run query

* Fixes style and python3
wenchma pushed a commit to wenchma/incubator-superset that referenced this pull request Nov 16, 2018
* Fixed and added tests for druid run query

* Fixes style and python3
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0 labels Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.21.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants