2014 Flight Departure Performance via d3.js Crossfilter

As part of a quick analysis of flight departure data, and to better understand the impact of distance, date, and time of day on departure delays, I forked the Square Crossfilter and incorporated data from RITA BTS Flight Departure Statistics, using Great Circle Mapper to calculate airport distances. A screenshot is at the bottom, but you can interact with the data through the links directly below. Please note that it will take a few seconds to a few minutes to load because of the large files d3 has to process. Airline On-time Departure Performance (Top…

Big Data and Legos

I was recently asked how to explain Big Data to an 8-year-old. After realizing that the 4 Vs of Big Data barely make sense to non-marketing people (i.e. most of us), let alone to kids, I decided that the best construct would be Legos. When I was her age, Lego blocks were only squares and rectangles; I could build a lot of buildings and boxes, which was great at the time (in data speak, relational databases). Big Data, by contrast, is a massive amount (e.g. volume of data) of Lego blocks of different shapes…

SQL Server Reporting Services Disaster Recovery Slides and Webinar

Back in March 2014, Ayad Shammout (@aashamout) and I, hosted by Julie Koesmarno (@mssqlgirl), had the opportunity to present real-world SQL Server Reporting Services (SSRS) Disaster Recovery as part of the PASS DW/BI webinar series. In this session, we dove into lessons learned from the robust end-to-end DR solution at CareGroup Healthcare. This DR method involves both automatic and manual failover in the form of content switches, SQL Server failover clustering, and database mirroring. You can find the slides below. SQL Server Reporting Services Disaster Recovery Webinar from Ayad Shammout and Denny Lee As well, the webinar…

Learnings from Running Spark at Twitter

As part of the Seattle Spark Meetup series, we had a great Learnings from Running Spark at Twitter session at the @TwitterSeattle offices. We (the Seattle Spark Meetup organizers) want to thank Sriram Krishnan (@krishnansriram) and Benjamin Hindman (@benh) for presenting, and Jeff Currier (@jeff_currier) and @TwitterSeattle for hosting us! We also raffled off a seat in Paco Nathan’s (@pacoid) Hands-On Apache Spark Workshop (Seattle); the winner is Monir Abu Hilal (@monirabuhilal)! If you want to learn more about Spark, I highly recommend his course! Below are the links to the two sessions: Spark at Twitter: Evaluation & Lessons Learnt by @krishnansriram and Mesos for Spark Users…

Build your own CDH5 QuickStart VM with Spark on CentOS

A great way to jump into CDH5 and Spark (with the latest version of Hue) is to build your own CDH5 setup on a VM. As of this writing, a CDH5 QuickStart VM is not available (though you can download the Cloudera QuickStart VM for CDH4.5). Below are the steps to build your own CDH5 / Spark setup on CentOS 6.5. Note that the installation of CDH5 through Cloudera Manager is actually quite straightforward, so these instructions focus instead on the steps prior to installing Cloudera Manager 5 (and the express install of CDH5) to minimize the hiccups you may run…

Seattle Spark Meetup Kicks Off with DataBricks

I am very excited to announce that Matei Zaharia and Pat McDonough from DataBricks will be speaking at the Seattle Spark Meetup, and we’ve increased the room size to accommodate more people! Seattle Spark Meetup Kick Off with DataBricks They will come up and join us for pizza and to talk about Apache Spark! I highly encourage you to join the Seattle Spark Meetup for this and other exciting sessions! Below is the abstract of their session as well as their biographies. Introduction to Apache Spark Apache Spark has quickly grown to be one of the most active projects in big data,…

Quick Tips on Restoring a MySQL Full InnobackupEx Backup

As I recently started using MySQL (with many years of SQL Server under my belt), here are some quick tips on restoring an InnobackupEx backup. It was a context switch for me, but I’m enjoying the experience. Introduction There are many ways to back up / restore database(s) for MySQL, but the mechanism I’m referring to here is Percona’s InnobackupEx 2.1 for MySQL. The instructions for this backup can be found at Preparing a Full Backup with innobackupex and Making a Full Backup. What’s great about this type of backup is that it backs up all of the databases on the…
