TApplicationException: Invalid method name: 'GetLog' #2954

Closed
3 tasks done
timfeirg opened this issue Jun 13, 2017 · 6 comments · Fixed by #2968

Comments

@timfeirg
Contributor

timfeirg commented Jun 13, 2017

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if any
  • I have reproduced the issue with at least the latest released version of superset
  • I have checked the issue tracker for the same issue and I haven't found one similar

Superset version

0.18.4

Expected results

Queries against hive:// backends run through pyhive without errors.

Actual results

When querying hive:// backends, the superset worker raises this error:

  File "/usr/lib/python3.6/site-packages/superset/db_engines/hive.py", line 18, in fetch_logs
    logs = self._connection.client.GetLog(req)
  File "/usr/lib/python3.6/site-packages/pythrifthiveapi/TCLIService/TCLIService.py", line 762, in GetLog
    return self.recv_GetLog()
  File "/usr/lib/python3.6/site-packages/pythrifthiveapi/TCLIService/TCLIService.py", line 779, in recv_GetLog
    raise x
thrift.Thrift.TApplicationException: Invalid method name: 'GetLog'

Steps to reproduce

  • Install pyhive v0.3.0 and pythrifthiveapi at commit 4e2e9935bdddbe2f30630ef22c5aa045b713478e. (At the time of writing, the pythrifthiveapi release on PyPI uses a Python 3 incompatible import style, so it must be installed from the GitHub repo.)
  • Add a database in superset using the hive:// prefix.
  • Run some HQL queries. In my case, simple ones such as select * from table_name limit 2 work, but complex queries that run for a long time in the Hive CLI raise this error.

Study

I see there's a fetch_logs method that looks like a patch for pyhive's original fetch_logs, but I couldn't find any code that actually applies the patch, only the method declaration. So, as far as I can tell, the fetch_logs run by the superset worker is still the original one from pyhive?

@mistercrunch
Member

Looks like it's monkey patched over here:
https://github.com/airbnb/superset/blob/38375be5c3062caf9f6cd4d6ac62109e1940bd10/superset/db_engine_specs.py#L651
@bkyryliuk did that a little while ago.
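
For readers unfamiliar with the term, monkey patching in Python comes down to assigning a replacement function onto a class at runtime. Below is a minimal sketch of that idea; the Cursor class and both function bodies are hypothetical stand-ins, not pyhive's or superset's actual code.

```python
class Cursor:
    """Hypothetical stand-in for a third-party cursor class."""

    def fetch_logs(self):
        return ["original implementation"]


def patched_fetch_logs(self):
    """Replacement that gets attached to Cursor at runtime."""
    return ["patched implementation"]


# The patch itself is a single assignment on the class. Without this line,
# patched_fetch_logs is only a declaration that nothing ever calls.
Cursor.fetch_logs = patched_fetch_logs

cursor = Cursor()
result = cursor.fetch_logs()
print(result)
```

Because the assignment can live in a different module from the replacement function, the declaration and the patching step are easy to miss when reading only one file.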

@timfeirg
Contributor Author

I should have realized that this is not about the monkey patch: the traceback clearly shows that the patched version of fetch_logs is being used.

So it's the Hive Thrift server that is reporting the method as not found. Any ideas? @mistercrunch

@timfeirg
Contributor Author

Should we consider removing the GetLog usage from superset?

@mistercrunch
Member

Wouldn't you just have to handle TApplicationException in that try statement?
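
A rough sketch of what that handling could look like, assuming the server's rejection arrives as a TApplicationException whose type is UNKNOWN_METHOD (which matches the "Invalid method name" message in the traceback). The helper get_log_or_none and the ServerWithoutGetLog class are made up for illustration, and the fallback exception class exists only so the snippet runs without thrift installed.

```python
try:
    from thrift.Thrift import TApplicationException
except ImportError:
    # Stand-in so the sketch runs without thrift installed. Assumption: it
    # mirrors only the parts of the real class that are used below.
    class TApplicationException(Exception):
        UNKNOWN_METHOD = 1

        def __init__(self, type=0, message=None):
            super().__init__(message)
            self.type = type
            self.message = message


def get_log_or_none(client, req):
    """Return the GetLog response, or None if the server lacks GetLog."""
    try:
        return client.GetLog(req)
    except TApplicationException as exc:
        # Swallow only the "method does not exist" case; re-raise the rest.
        if exc.type == TApplicationException.UNKNOWN_METHOD:
            return None
        raise


class ServerWithoutGetLog:
    """Hypothetical client whose server rejects GetLog, as in the traceback."""

    def GetLog(self, req):
        raise TApplicationException(
            TApplicationException.UNKNOWN_METHOD, "Invalid method name: 'GetLog'"
        )


result = get_log_or_none(ServerWithoutGetLog(), req=None)
print(result)
```

Checking the exception type rather than catching every TApplicationException keeps unrelated server-side failures visible instead of silently dropping them.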

@timfeirg
Contributor Author

timfeirg commented Jun 13, 2017

Yeah, I guess that could work, but it means pyhive queries won't be able to report progress on newer versions of Hive.

I see there's a GetOperationStatus API in TCLIService.thrift. I haven't tried it, but it looks like it might provide the same information as GetLog?
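
An untested sketch of what polling that API might look like. Everything here runs against a fake client: the operationState field name and the state names are my reading of TCLIService.thrift and should be treated as assumptions, the string constants are placeholders for the real enum values, and poll_until_done and FakeClient are hypothetical.

```python
from types import SimpleNamespace

# Placeholder state names (assumption: modelled on TOperationState).
RUNNING, FINISHED, ERROR = "RUNNING_STATE", "FINISHED_STATE", "ERROR_STATE"
TERMINAL_STATES = {FINISHED, ERROR}


class FakeClient:
    """Hypothetical client that replays a fixed sequence of states."""

    def __init__(self, states):
        self._states = iter(states)

    def GetOperationStatus(self, req):
        # Assumption: the real response exposes an operationState field.
        return SimpleNamespace(operationState=next(self._states))


def poll_until_done(client, req, on_state, wait=lambda: None):
    """Poll the operation status, reporting each state until a terminal one."""
    while True:
        state = client.GetOperationStatus(req).operationState
        on_state(state)
        if state in TERMINAL_STATES:
            return state
        wait()  # a real caller would sleep between polls


seen = []
final_state = poll_until_done(
    FakeClient([RUNNING, RUNNING, FINISHED]), req=None, on_state=seen.append
)
print(seen, final_state)
```

Note that this reports the operation's state rather than log text, so it may only partly replace what GetLog was used for.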
