The Future of Hadoop: A deeper look at Apache Spark

Understand why Apache Spark has experienced such wide adoption and learn about some Spark use cases today. There is also a technical deep dive into the architecture, along with our vision for the Hadoop ecosystem and why we believe Spark is the successor to MapReduce for Hadoop data processing. Here’s the link to The Future of Hadoop: A deeper look at Apache Spark webinar.

To Spark … and Beyond!

One of the very exciting things about Spark is the potential to have one ubiquitous tool to solve my aggregate, machine learning, graph, and other statistical / analytics problems. And while I am proud of my time with the SQL Server team, where we achieved some amazing lofty goals (e.g. Yahoo! 24TB Analysis Services cube), I had been drawn back to my statistical roots. Statistical Roots? It may surprise you that I had been bouncing between the path of becoming a doctor (…you know, Asian parents) or a statistician (my father was a Statistics professor). …

Seattle Spark Meetup Roundup: Summit, xPatterns, and Machine Learning – next is Interactive OLAP!

We’ve had some really exciting Spark sessions at the Seattle Spark Meetup, even with all of the great stuff announced during last week’s Spark Summit 2014. This post is a couple of months past due, so here’s the latest compiled together!

xPatterns on Spark, Shark, Mesos, & Tachyon: Claudiu Barbura showcased Atigeo’s xPatterns, a real-world customer architecture utilizing Spark, Shark, Mesos, and Tachyon! A lot of great demos along with lessons learned and tips & tricks!

xPatterns on Spark, Shark, Mesos, and Tachyon Session
xPatterns on Spark, Shark, Mesos, and Tachyon Slides

Fun Things You Can Do With…

Learnings from Running Spark at Twitter

As part of the Seattle Spark Meetup series, we had a great Learnings from Running Spark at Twitter session at the @TwitterSeattle Offices. We (Seattle Spark Meetup organizers) want to thank Sriram Krishnan (@krishnansriram) and Benjamin Hindman (@benh) for presenting, and Jeff Currier (@jeff_currier) and @TwitterSeattle for hosting us! We also raffled off Paco Nathan’s (@pacoid) Hands-On Apache Spark Workshop (Seattle); the winner is Monir Abu Hilal (@monirabuhilal)! If you want to learn more about Spark, I highly recommend his course! Below are the links to the two sessions: Spark at Twitter: Evaluation & Lessons Learnt by @krishnansriram and Mesos for Spark Users…

Build your own CDH5 QuickStart VM with Spark on CentOS

A great way to jump into CDH5 and Spark (with the latest version of Hue) is to build your own CDH5 setup on a VM. As of this writing, a CDH5 QuickStart VM is not available (though you can download the Cloudera QuickStart VM for CDH4.5). Below are the steps to build your own CDH5 / Spark setup on CentOS 6.5. Note that the installation of CDH5 through Cloudera Manager is actually quite straightforward. Instead, these instructions focus on the steps prior to installing Cloudera Manager 5 (and the express install of CDH5) to minimize the hiccups you may run…

Seattle Spark Meetup Kicks Off with DataBricks

I am very excited to announce that Matei Zaharia and Pat McDonough from DataBricks will be speaking at the Seattle Spark Meetup, and we’ve increased the room size to accommodate more people!

Seattle Spark Meetup Kick Off with DataBricks

They will come up and join us for pizza and to talk about Apache Spark! I highly encourage you to join the Seattle Spark Meetup for this and other exciting sessions! Below is the abstract of their session as well as their biographies.

Introduction to Apache Spark: Apache Spark has quickly grown to be one of the most active projects in big data,…
