Spark atop Mesos on Google Cloud Platform querying Google Cloud Storage

A great reason to jump into Spark on Mesos on Google Cloud Platform is that you can quickly spin up a development environment to work with Spark, Mesos, Google Cloud, and Marathon together. A good way to set this up is to follow the steps in Paco Nathan’s (@pacoid) blog post Spark atop Mesos on Google Cloud Platform. What is missing from this configuration is the ability to connect to Google Cloud Storage (GCS) so you can run your Spark queries against persistent, elastic storage. As noted in the diagram below, you will first install Spark…

Why use Blob Storage with HDInsight on Azure

By Brad Sarsfield and Denny Lee

One of the questions we are commonly asked about HDInsight, Azure, and Azure Blob Storage is why one should store data in Azure Blob Storage instead of HDFS on the HDInsight Azure compute nodes. After all, Hadoop is all about moving compute to data rather than the traditional approach of moving data to compute, as noted in Moving data to compute or compute to data? That is the Big Data question. The network is often the bottleneck, and making it performant can be expensive. Yet the practice for HDInsight on Azure is to place the data into…

Oh where, oh where did my S3N go? (in Windows Azure HDInsight) Oh where, Oh where, can it be?!

As noted in my previous post Connecting Hadoop on Azure to your Amazon S3 Blob storage, you could easily set up HDInsight on Azure to work against your Amazon S3 / S3N storage. With the updates to HDInsight, you’ll notice that the Manage Cluster dialog no longer includes the quick access to Set up S3. Yet there are times when you may want to connect your HDInsight cluster to your S3 storage. Note that this can be a tad expensive due to transfer costs. To get S3 set up on your Hadoop cluster, from the HDInsight dashboard click on the Remote Desktop tile so…

Connecting Hadoop on Azure to your Amazon S3 Blob storage

When working with Hadoop on Azure, you may be used to the idea of putting your data in the cloud. In addition to using Azure Blob Storage, another option is connecting your Hadoop on Azure cluster to query data in Amazon S3. Below are the steps to configure Hadoop on Azure to connect to it (with the presumption that you already have an Amazon AWS / S3 account and have uploaded data into it).

1) Log into your Amazon AWS account and click on Security Credentials
2) Obtain your access credentials – you’ll need both your Access Key…

Connecting PowerPivot to Hadoop on Azure – Self Service BI to Big Data in the Cloud

“I caught a fish thiiiiis biiig” – On stage with Ted Kummert during the PASS 2011 Keynote on Big Data (thanks to Karen Lopez @datachick for the pic)

During the PASS 2011 Keynote (back in October 2011), I had the honor of demoing Hadoop on Windows / Azure. One of the key showcases during that presentation was how to connect PowerPivot to Hadoop on Windows. In this post, I show the steps to connect PowerPivot to Hadoop on Azure.

Pre-requisites

PowerPivot for Excel (as of this post, using SQL Server 2012 RC1 version)…

An Azure Elephant Never Forgets…

All your HBase are belong to us

Friends, coders, beta testers – lend me your data!

We have more tag lines like that within the Hadoop on Azure portal! Click on the image to check out the Hadoop on Azure video; in addition to funky electronica, it showcases how you can integrate Hadoop, Hive, Pig Latin, Hadoop Javascript, Azure DataMarket, and Excel!

After the past eight months of working with the Isotope Development team led by Alexander Stojanovic (@stojanovic) – the Founder and General Manager of Hadoop on Windows…
