Spark Ecosystem & Spark Streaming Fundamentals

This is a re-post of the Spark Ecosystem & Spark Streaming Fundamentals post on the Concur blog. The Seattle Spark Meetup group convened for our March 2015 monthly meeting. It was a special meeting since it was actually a joint Seattle Spark Meetup and Pacific Northwest Cloudera User Group session held at the Concur Technologies…

Feb Spark Events: Data Discovery, Dato & Spark, and Spark Camp

An awesomely busy February is coming up for those who are interested in all things Apache Spark! Concur Discovers the True Value of Data: a joint Cloudera and Concur webinar on February 2nd, 2015, where we will discuss the benefits of utilizing CDH5 within Concur’s modern Big Data architecture (including Spark, of course!). Better Together: Dato…

Quick Tip: Dropping Phantom Hive Databases (e.g. CDH5 Canary test dB)

While I’m a big fan of CDH5 and Hue, sometimes I will see some funkiness that’s a tad irritating. Specifically, there is a database with a name similar to cloudera_manager_metastore_canary_test_db_hive_hivemetastore_$guid$_2014_10_06_11_20_41. Even more irritating, there is a table called cm_test_table which cannot be deleted (or renamed or even described): hive> describe cm_test_table; FAILED: SemanticException [Error 10001]:…

Yes, you can connect Tableau to SparkSQL (Spark 1.1)

As a data scientist and engineer, I appreciate that Apache Spark has many components that make it easy to analyze my data, gain insight from it, and generate recommendations. However, as noted within my previous presentation, one of the things missing is an easy way for analysts to visualize their data. The good news is…

Simplifying Big Data: An Introductory Hadoop Primer

Back in July, I had the honor of speaking at the July 2014 AM webinar on Big Data, with Michael Zeller moderating. If you are interested in learning more about Big Data from a business / analyst perspective, here is our webinar on YouTube. Abstract: What’s the Big Deal with Big Data? And, more importantly,…

