A great reason to jump into Spark on Mesos on Google Cloud Platform is that you can quickly spin up a development environment for working with Spark, Mesos, Google Cloud, and Marathon together. A good way to set this up is to follow the steps in Paco Nathan’s (@pacoid) blog post, Spark atop Mesos on Google Cloud Platform.
But what’s missing from this configuration is the ability to connect to Google Cloud Storage (GCS), so that you can run your Spark queries against persistent, elastic storage. As noted in the diagram below, you first install Spark onto the development Mesos cluster, which consists of a master node and three slave nodes. You then install the GCS connector so that Spark can communicate with GCS.
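To give a rough idea of what wiring the GCS connector into Spark can involve, here is a minimal sketch that generates the kind of entries you might add to `spark-defaults.conf`. It assumes the connector jar is already on Spark's classpath and that your connector version uses the Hadoop property names shown here, so check them against the connector's documentation. The helper functions and the project ID are hypothetical and only for illustration.

```python
# Sketch only: builds Spark properties that point the gs:// scheme at the
# GCS connector. Property names are assumed from the connector's Hadoop
# settings and may differ between connector versions.

# Hadoop FileSystem classes provided by the GCS connector jar (assumed names).
GCS_FS_IMPL = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
GCS_ABSTRACT_FS_IMPL = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"


def gcs_spark_properties(project_id):
    """Return Spark properties for the GCS connector (hypothetical helper).

    Spark copies any property prefixed with "spark.hadoop." into the
    Hadoop configuration, which is where the connector reads its settings.
    """
    return {
        "spark.hadoop.fs.gs.impl": GCS_FS_IMPL,
        "spark.hadoop.fs.AbstractFileSystem.gs.impl": GCS_ABSTRACT_FS_IMPL,
        "spark.hadoop.fs.gs.project.id": project_id,
    }


def to_spark_defaults(props):
    """Render properties as "key value" lines, the spark-defaults.conf format."""
    return "\n".join(f"{key} {value}" for key, value in sorted(props.items()))


if __name__ == "__main__":
    # "my-gcp-project" is a placeholder, not a real project ID.
    print(to_spark_defaults(gcs_spark_properties("my-gcp-project")))
```

With settings along these lines in place, Spark jobs could refer to data by `gs://` paths, provided the cluster nodes also have credentials that allow them to read the bucket.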
For more information, continue reading at Spark atop Mesos on Google Cloud Platform querying Google Cloud Storage.