This repository contains a running series of Spark examples for comparison with a matching series of Dataflow examples.
Before diving in, you will likely want to read Dataflow/Beam & Spark: A Programming Model Comparison.
The equivalent Dataflow code lives in GoogleCloudPlatform/DataflowJavaSDK-examples and is documented in Mobile Gaming Pipeline Examples.
For details on running these examples on a Google Cloud Dataproc cluster, please see this README.