This repository contains my Spark experiments, written while reading the book Learning Spark by Jules S. Damji, Brooke Wenig, Tathagata Das and Denny Lee (available as a PDF from Databricks).
I run the experiments in Python only. The book contains code snippets in Python, Scala and Java, but not every snippet or program is available in every language. I will try to provide all snippets as executable, tested Python code (mostly in Jupyter Notebooks).
This repository also contains the end-to-end examples from the book's GitHub repository, which were originally saved in a Databricks Notebook (.dbc file). I have converted the Python files into Jupyter Notebooks and saved them in the folder BookExamples.
Chapters 2-9 are done. Starting to work on Chapter 10 (MLlib).
Christoph Windheuser - May 28, 2022