BI & Data Science SIG Meeting Minutes 2017 02 20 2PM EST
Cupid Chan edited this page Feb 23, 2017 · 1 revision
Attendees
- Cupid Chan (4C Decision)
- Alan Gates (Hortonworks)
- Tom Sabo (SAS)
- Raj Desai (IBM)
- Tanping Wang (IBM)
- Ganesh Raju (Linaro)
Minutes
- Introduction from all attendees
- Overview of the background of SIG
- Attendees shared their expectations of the SIG
- There are multiple solutions in Data Science, and it is hard to provide one single solution.
- The two tools discussed extensively on the call were Jupyter (on the cluster or not) and Zeppelin
- Discussion on benchmarking to evaluate the effectiveness of BI/Data Science
- Need a way to quantify the benchmark
- Need to find ways to claim ODPi compliance. For now, most cases are feature compliant, not benchmark compliant.
- TPC-DS has already been developed and is used by vendors for benchmarking SQL
- Benchmarking Data Science notebooks is a very new field and may involve too much work to complete within the first deliverable's timeframe
- Best practices and standards on how to hook everything up
- The goal is not to say which tool is the right one, but to show how to connect them. Examples include:
- How to set up Zeppelin on Spark?
- How to configure Jupyter?
- What is the best way to connect MicroStrategy to Hive and then how to tune Hive for better performance?
- How can data science be related to visualization to show how an algorithm runs in real time?
- Agreement on first deliverable
- Standardize a guideline for vendors so that they know how to set up a data science notebook
- Strengths and Weaknesses comparison
- Recommendation of when to use what
- So far, we have only 2 tools for comparison: Jupyter and Zeppelin
- Communication
- Most communication will be done by email
- Concern that using the ODPi listserv ([email protected]) as the email for this SIG will hide some communication this group should be aware of.
- Minimize conference calls, given members' busy schedules and the difficulty of finding a common time across different time zones
- Action items
- Tanping will share what IBM is currently doing regarding notebook best practices
- Cupid will follow up with Roman and John about getting our own email alias.