BI & Data Science SIG Meeting Minutes 2017 02 20 2PM EST

Cupid Chan edited this page Feb 23, 2017 · 1 revision

Attendees

  • Cupid Chan (4C Decision)
  • Alan Gates (Hortonworks)
  • Tom Sabo (SAS)
  • Raj Desai (IBM)
  • Tanping Wang (IBM)
  • Ganesh Raju (Linaro)

Minutes

  • Introduction from all attendees
  • Overview of the background of SIG
  • Attendees shared their expectations of the SIG
  • There are multiple solutions in data science, and it is hard to recommend a single one.
  • The two tools discussed extensively on the call were Jupyter (on the cluster or not) and Zeppelin
  • Discussion on benchmarking to evaluate the effectiveness of BI/Data Science
  • Need a way to quantify the benchmark
  • Need to find ways to claim ODPi compliance. For now, most cases are feature compliant, not benchmark compliant.
  • TPC-DS has already been developed and is used by vendors for benchmarking SQL
  • Benchmarking data science notebooks is a very new field and may be too much work to complete within the first deliverable's timeframe
  • Best practices and standards on how to hook everything up
  • The goal is not to say which tool is right but how to connect them. Examples include:
  • How to set up Zeppelin on Spark?
  • How to configure Jupyter?
  • What is the best way to connect MicroStrategy to Hive, and how should Hive then be tuned for better performance?
  • How can data science be tied to visualization to show how an algorithm runs in real time?
  • Agreement on first deliverable
  • Standardize a guideline for vendors so that they know how to set up a data science notebook
  • Strengths and Weaknesses comparison
  • Recommendation of when to use what
  • So far, we have only 2 tools for comparison: Jupyter and Zeppelin
  • Communication
  • Most communication will be done by email
  • Concern that using the ODPi listserv ([email protected]) as the email for this SIG will hide some real communication this group should be aware of.
  • Minimize conference calls, given members' busy schedules and the difficulty of finding a common time across time zones
  • Action items
  • Tanping will share what IBM is currently doing regarding notebook best practices
  • Cupid will follow up with Roman and John about getting our own email alias.