Jump Start into Python and Apache Spark with Learning PySpark

For the last few years, I have had the opportunity to work with some of the coolest Apache Spark committers and contributors, and on some of the coolest projects.  As luck would have it, I got to meet my co-author Tomasz Drabas (author of the awesome Practical Data Analysis Cookbook) while we were working on some other cool Apache Spark projects.  In the process, we joined forces to share lessons learned that we hope will help you jump start your Python and Apache Spark projects with our book: Learning PySpark. And just to make sure, this book was reviewed by the incomparable Holden Karau, author of the…


On-Time Flight Performance with GraphFrames for Apache Spark

Feature Image: NASA Goddard Space Flight Center: City Lights of the United States 2012

This is an abridged version of the full blog post On-Time Flight Performance with GraphFrames. You can also reference the webinar GraphFrames: DataFrame-based graphs for Apache Spark and the On-Time Flight Performance with GraphFrames for Apache Spark notebook.

An intuitive approach to understanding flight departure delays is to use graph structures.

Why Graph?

Graph structures are a more intuitive approach to many classes of data problems: social networks, restaurant recommendations, or flight paths.  It is easier to understand these data problems…


Notebook Gallery

Here are some of the notebooks created to showcase various Apache Spark use cases. They all use Databricks Community Edition, which you can get at Try Databricks. You can also access the source at: https://github.com/dennyglee/databricks.

JSON Support, GLM in SparkR, Window Functions, Random Forests, DataFrame API, ML Operations, Decision Trees, Statistical Functions, Data Import, Data Exploration, Quick Start Python, Quick Start Scala, Ad-Tech Example, Flight Delays, Genomics, Mobile Sample, Pop vs. Price LR, Pop vs. Price DF, Salesforce Leads, Spark 1.6 (Multiple), Spark 1.6


In the context of quantum entanglement and time travel – Stargate may be more correct than Star Trek

Feature Image: Michael Bolognesi’s Diamonds in the Sky

As a follow-up to In the context of quantum entanglement and teleportation – Stargate may be more correct than Star Trek, I’m diving into one of SciFi’s persistent quandaries – time travel.  And before anyone gets started, I am a proud Trekkie, so this is not meant as a knock on Star Trek.  In fact, I’ve already purchased my tickets for Star Trek Into Darkness and, as a fan of BBC’s Sherlock, I have to admit I’m sort of rooting for the villain this time around!

Image source: Benedict Cumberbatch – Star…


Using Avro with HDInsight on Azure at 343 Industries

By Michael Wetzel, Tamir Melamed, Mark Vayman, Denny Lee
Reviewed by Pedro Urbina Escos, Brad Sarsfield, Rui Martins
Thanks to Krishnan Kaniappan, Che Chou, Jennifer Yi, and Rob Semsey

As noted in the Windows Azure Customer Solution Case Study, Halo 4 developer 343 Industries Gets New User Insights from Big Data in the Cloud, a critical component in achieving faster Hadoop query and processing performance while keeping file sizes small (and thus gaining Azure storage savings, faster query performance, and reduced network overhead) was the use of Avro sequence files. Avro was designed for Hadoop to help make Hadoop more interoperable with other…


An all too brief stop over in Tainan (台南)

For anyone who regularly visits Taiwan, the city of Tainan can easily be overlooked among the other three major cities of Taipei, Taichung, and Kaohsiung.  Yet that would be a grave mistake if you consider yourself a foodie.  A word of warning: driving in Tainan is atrocious – there are only two major roads going into the city off of Highway 1.  Yet if you brave the driving conditions and/or decide to take the train (HSR or TRA), make your way to the old historic district and you will be pleasantly satiated with all sorts of Taiwanese small eats (台灣小吃).  This…


Travel Tuesday: Sagrada Familia

For this Travel Tuesday post – allow me to share some personal photos from the beautiful Basílica i Temple Expiatori de la Sagrada Família in Barcelona, Catalonia, Spain.  This is one of Antoni Gaudí’s most amazing and famous works, merging Gothic and Art Nouveau styles with Catholic themes.  For more information, check out Wikipedia: http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia

Though it is an incomplete work of art that will take decades to complete, it is well worth visiting and touring this UNESCO site.  Important biblical events are depicted throughout the exterior walls of the basilica. It is a testament to Gaudí’s vision to have drawn…
