A good reason to try Spark on Mesos on Google Cloud Platform is that you can quickly spin up a development environment that brings Spark, Mesos, Google Cloud, and Marathon together. An easy way to set this up is to follow the steps in Paco Nathan’s (@pacoid) blog post Spark atop Mesos on Google Cloud Platform.
What is missing from this configuration is the ability to connect to Google Cloud Storage (GCS), so that you can run your Spark queries against persistent, elastic storage. As noted in the diagram below, you first install Spark onto the development Mesos cluster, which consists of a master node and three slave nodes. Once the GCS connector is installed, Spark can communicate with GCS.
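To give a rough idea of what "installing the GCS connector" involves on the Spark side, here is a minimal sketch of the properties that typically register the connector with Spark. It assumes the connector jar is already on the Spark classpath and that credentials are configured separately. The helper names (`gcs_spark_conf`, `to_spark_defaults`, `gcs_uri`) and the project and bucket names are hypothetical, used for illustration only; the linked post remains the reference for the actual setup.

```python
# A sketch, not the setup from the linked post: build the Spark properties
# that commonly point the gs:// scheme at the GCS connector classes.
# Spark copies any "spark.hadoop.*" property into the Hadoop configuration.

GCS_FS_IMPL = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
GCS_ABSTRACT_FS_IMPL = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"


def gcs_spark_conf(project_id):
    """Return Spark properties for the GCS connector (hypothetical helper)."""
    return {
        "spark.hadoop.fs.gs.impl": GCS_FS_IMPL,
        "spark.hadoop.fs.AbstractFileSystem.gs.impl": GCS_ABSTRACT_FS_IMPL,
        "spark.hadoop.fs.gs.project.id": project_id,
    }


def to_spark_defaults(conf):
    """Render the properties as lines for conf/spark-defaults.conf."""
    return "\n".join(f"{key} {value}" for key, value in sorted(conf.items()))


def gcs_uri(bucket, path):
    """Build a gs:// URI that Spark can read once the connector is set up."""
    return f"gs://{bucket}/{path.lstrip('/')}"


if __name__ == "__main__":
    # "my-project" and "my-bucket" are placeholder names.
    print(to_spark_defaults(gcs_spark_conf("my-project")))
    print(gcs_uri("my-bucket", "/data/events.csv"))
```

With properties like these in place, a Spark job would refer to data by its `gs://` URI in the same way it would use an `hdfs://` path.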
For more information, continue reading at Spark atop Mesos on Google Cloud Platform querying Google Cloud Storage.
Beg your pardon, but what does this have to do with dim sum? 🙂
Doing this allows me to save money – that I need for dim sum 🙂
Was researching dim sum in SF recently, thought about you and that great place you took me to in Seattle for dim sum. Hope things are going well for you and yours. I trust that life away from the mothership has been ok?