Stars
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
MapReduce, Spark, Java, and Scala code for the Data Algorithms book.
A curated list of awesome Apache Spark packages and resources.