Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

date_trunc function when connect to hive #7270

Closed
3 tasks done
iannamkr opened this issue Apr 10, 2019 · 7 comments
Closed
3 tasks done

date_trunc function when connect to hive #7270

iannamkr opened this issue Apr 10, 2019 · 7 comments
Labels
inactive Inactive for >= 30 days

Comments

@iannamkr
Copy link

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Superset version

0.28.1

Expected results

Actual results

errorCode=10011, errorMessage="Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 6:9 Invalid function 'date_trunc'"), operationHandle=None)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/superset/connectors/sqla/models.py", line 783, in query
    df = self.database.get_df(sql, self.schema)
  File "/usr/lib/python3.6/site-packages/superset/models/core.py", line 789, in get_df
    self.db_engine_spec.execute(cursor, sqls[-1])
  File "/usr/lib/python3.6/site-packages/superset/db_engine_specs.py", line 1268, in execute
    cursor.execute(query, **kwargs)
  File "/usr/lib/python3.6/site-packages/pyhive/hive.py", line 365, in execute
    _check_status(response)
  File "/usr/lib/python3.6/site-packages/pyhive/hive.py", line 495, in _check_status
    raise OperationalError(response)

Steps to reproduce

When I create time series bar chart, Superset uses 'date_trunc' function in hiveql group by clause and it raise exception.

@jeepxiaozi
Copy link

jeepxiaozi commented May 10, 2019

Same here. I've checked the source code in incubator-superset/superset/db_engine_specs.py, from what I'm seeing, the reason this error happened is because they're using PrestoEngineSpec as HiveEngineSpec, as the code below:

class HiveEngineSpec(PrestoEngineSpec):

    """Reuses PrestoEngineSpec functionality."""

and hive doesn't support 'date_trunc' method.
So if u can build from the source code, u can just change the code into some thing like this:

class HiveEngineSpec(BaseEngineSpec):

    """Reuses PrestoEngineSpec functionality."""
    
    engine = 'hive'
    max_column_name_length = 767
    
    time_grain_functions = {
        None: '{col}',
        'PT1S': "from_unixtime(unix_timestamp({col}), 'yyyy-MM-dd HH:mm:ss')",
        'PT1M': "from_unixtime(unix_timestamp({col}), 'yyyy-MM-dd HH:mm:00')",
        'PT1H': "from_unixtime(unix_timestamp({col}), 'yyyy-MM-dd HH:00:00')",
        'P1D': "from_unixtime(unix_timestamp({col}), 'yyyy-MM-dd 00:00:00')",
        'P1W': "date_format(date_sub(nianxian, CAST(7-from_unixtime(unix_timestamp(nianxian),'u') as int)), 'yyyy-MM-dd 00:00:00')",
        'P1M': "from_unixtime(unix_timestamp({col}), 'yyyy-MM-01 00:00:00')",
        'P0.25Y': "date_format(add_months(trunc(nianxian, 'MM'), -(month(nianxian)-1)%3), 'yyyy-MM-dd 00:00:00')",
        'P1Y': "from_unixtime(unix_timestamp({col}), 'yyyy-01-01 00:00:00')",
        'P1W/1970-01-03T00:00:00Z': "date_format(date_add(nianxian, INT(6-from_unixtime(unix_timestamp(nianxian), 'u'))), 'yyyy-MM-dd 00:00:00')",
        '1969-12-28T00:00:00Z/P1W': "date_format(date_add(nianxian, -INT(from_unixtime(unix_timestamp(nianxian), 'u'))), 'yyyy-MM-dd 00:00:00')"
    }

this should do the work.

@iannamkr
Copy link
Author

@jeepxiaozi
thank for your reply :)

@jeepxiaozi
Copy link

@jeepxiaozi
thank for your reply :)

sorry about the code part, I've edited a little bit so that u can see it clearly

@oliviermichaelis
Copy link
Contributor

@jeepxiaozi thank you for the fix! Could you create a pull request to the apache project please? :)

@lpillmann
Copy link

+1 for this one ;)

@villebro
Copy link
Member

Remember you can override time grain functions in superset_config.py (see examples) without having to edit source and rebuild!

@stale
Copy link

stale bot commented Aug 12, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inactive Inactive for >= 30 days
Projects
None yet
5 participants