Spark atop Mesos on Google Cloud Platform querying Google Cloud Storage

A great reason to jump into Spark on Mesos on Google Cloud Platform is that you can quickly spin up a development environment to work with Spark, Mesos, Google Cloud, and Marathon together. A good way to set this up is to follow the steps in Paco Nathan’s (@pacoid) blog post Spark atop Mesos on Google Cloud Platform. What’s missing from this configuration is the ability to connect to Google Cloud Storage (GCS) so you can run your Spark queries against persistent, elastic storage. As noted in the diagram below, you will first install Spark…

Yes, you can connect Tableau to SparkSQL (Spark 1.1)

As a data scientist and engineer, I appreciate that Apache Spark has many components that make it easy to analyze data, gain insight, and generate recommendations. However, as noted in my previous presentation, one of the things missing is an easy way for analysts to visualize their data. The good news is that there is an easy way to visualize your data: connect Tableau to SparkSQL! As noted in my Tableau Data14 presentation (slides are embedded below), there is an unofficial method to connect Tableau to SparkSQL. For more information, please read on at An Absolutely…

The Future of Hadoop: A deeper look at Apache Spark

Understand why Apache Spark has experienced such wide adoption and learn about some Spark use cases today. There is also a technical deep dive into the architecture, along with our vision for the Hadoop ecosystem and why we believe Spark is the successor to MapReduce for Hadoop data processing. Here’s the link to the webinar, The Future of Hadoop: A deeper look at Apache Spark.

Simplifying Big Data: An Introductory Hadoop Primer

Back in July, I had the honor of speaking at the July 2014 AM webinar on Big Data, moderated by Michael Zeller. If you are interested in learning more about Big Data from a business / analyst perspective, here is our webinar on YouTube. Abstract: What’s the Big Deal with Big Data? And, more importantly, what is the business case for Big Data? In this session, we will focus on the fundamentals of Hadoop as it is the foundation for Big Data. We’ll talk about the technology but also the business cases for what you can and cannot do with…

How Concur uses Big Data to get you to Tableau Conference On Time

Last week, I was excited to meet old SQL BI community friends in the same venue (Washington State Convention Center), this time for the Tableau Conference Data14 (as opposed to SQLPASS). It was a lot of fun and I really appreciated the opportunity to speak about Tableau, BI, and Big Data … just like old times! A big shout out to Cloudera for sponsoring Concur and me for Tableau Conference – thanks! The session abstract was: Many of you attending the Tableau Customer Conference probably used Concur to book your flights. Did you know that at the heart of Concur is…

To Spark … and Beyond!

One of the very exciting things about Spark is the potential to have one ubiquitous tool to solve my aggregate, machine learning, graph, and other statistical / analytics problems. And while I am proud of my time with the SQL Server team, where we achieved some amazingly lofty goals (e.g. Yahoo! 24TB Analysis Services cube), I had been drawn back to my statistical roots. Statistical Roots? It may surprise you that I had been bouncing between the path of becoming a doctor (…you know, Asian parents) or a statistician (my father was a Statistics professor). …

Big Data and Legos

I was recently asked how to explain Big Data to an 8-year-old. Since the 4 Vs of Big Data barely make sense to non-marketing people (i.e. most of us), let alone to kids, I realized that the best construct would be to use Legos. When I was her age, the Lego blocks were only squares and rectangles – I could build a lot of buildings and boxes, which was great at that time (in data speak, relational databases). Instead, Big Data is a massive amount (e.g. volume of data) of Lego blocks of different shapes…
