Updated HDInsight on Azure ASV paths for multiple storage accounts

If you’ve joined the HDInsight Preview, you will notice many changes, including the tight integration with Windows Azure and the fact that HDInsight defaults to ASV. As noted in Why use Blob Storage with HDInsight on Azure, there are some interesting technical (performance) and business reasons for utilizing Azure storage accounts. But if you had been playing with the HadoopOnAzure.com beta and switched over to the Windows Azure HDInsight Service Preview, you may have noticed a change in the way ASV paths work. Here’s a quick cheat sheet for you. In general, to access ASV sources: #ls asv://$container$@$storage_account$.blob.core.windows.net/$path$…
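The path pattern quoted in the excerpt above can be sketched as a small helper. This is only an illustration of the `asv://container@account.blob.core.windows.net/path` layout; the function name `asv_uri` and the sample container, account, and path values are hypothetical and do not come from the original post.

```python
def asv_uri(container: str, storage_account: str, path: str = "") -> str:
    """Build an ASV URI of the form quoted in the excerpt.

    All argument values are supplied by the caller; nothing here
    contacts Azure or checks that the container actually exists.
    """
    if not container or not storage_account:
        raise ValueError("container and storage_account must be non-empty")
    # Host portion follows the pattern shown in the excerpt.
    host = f"{storage_account}.blob.core.windows.net"
    # Drop any leading slash so the URI has exactly one separator.
    return f"asv://{container}@{host}/{path.lstrip('/')}"


# Hypothetical values, for illustration only.
print(asv_uri("logs", "myaccount", "/2013/03/data.txt"))
```

Because the storage account is part of the URI, the same helper can point at containers in different accounts, which is the multiple-storage-account case the post title refers to.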

Why use Blob Storage with HDInsight on Azure

By Brad Sarsfield and Denny Lee. One of the questions we are commonly asked concerning HDInsight, Azure, and Azure Blob Storage is why one should store data in Azure Blob Storage instead of HDFS on the HDInsight Azure Compute nodes. After all, Hadoop is all about moving compute to data rather than the traditional approach of moving data to compute, as noted in Moving data to compute or compute to data? That is the Big Data question. The network is often the bottleneck, and making it performant can be expensive. Yet the practice for HDInsight on Azure is to place the data into…

Using Avro with HDInsight on Azure at 343 Industries

By Michael Wetzel, Tamir Melamed, Mark Vayman, Denny Lee. Reviewed by Pedro Urbina Escos, Brad Sarsfield, Rui Martins. Thanks to Krishnan Kaniappan, Che Chou, Jennifer Yi, and Rob Semsey. As noted in the Windows Azure Customer Solution Case Study, Halo 4 developer 343 Industries Gets New User Insights from Big Data in the Cloud, a critical component to achieve faster Hadoop query and processing performance AND keep file sizes small (thus Azure storage savings, faster query performance, and reduced network overhead) was to utilize Avro sequence files. Avro was designed for Hadoop to help make Hadoop more interoperable with other…

Presentation: Yahoo!, Big Data, and Microsoft BI: Bigger and Better Together

About three and a half years ago, I had virtually joined the Yahoo! Targeting, Analytics, and Optimization (TAO) Engineering team, where we embarked on an incredible journey to create the largest single-instance Analysis Services cube. Mind you, that was not our actual goal; our actual goal was to create fast interactive analytics against a massive amount of display advertising data from Yahoo! sites. The requirements were staggering, as noted in the slide below. Ultimately, we took 2PB of data from one of Yahoo!’s large Hadoop clusters and created a 24TB Analysis Services cube so users could do…

Getting Hadoop and protobufs up and running with Elephant Bird on Mac OSX Mountain Lion

“No, not Angry Bird – Elephant Bird!” – said no one. In a few of my customer projects, we started diving into using protocol buffers (protobufs) as our sequence file to be stored within our Hadoop infrastructure. While these were HDInsight on Azure projects, most of the native Hadoop code is written originally for Linux and has implied assumptions such as directory paths, etc. Therefore, one of the first things I usually do is try to install said software on my handy MacBook Air so that if and when I run into issues getting the code to…

#PASSBAC – Ensuring Compliance of Patient Data with Big Data and BI

Over the past seven years, Ayad Shammout (@aashamout), Principal Business Intelligence Consultant at Beth Israel Deaconess Medical Center (a teaching hospital of Harvard Medical School), and I have worked on a variety of very exciting SQL Server projects including (but not limited to) Healthcare Group Upgrading to SQL Server 2008 to Better Protect 2 Terabytes of Data, Healthcare Group Improves Availability and Security of Mission-Critical Databases, Healthcare Group to Enhance Information Access with Powerful Business Intelligence Tools, and SQL Server Reporting Services Disaster Recovery Case Study. We’ve worked on some pretty hinky ones including the infamous PowerPivot for SharePoint / Windows Authenticated Users…

A touch of Shanghai cuisine (in this case, in Taichung 台中)

When I’m in Taiwan, it’s all about the Taiwanese small eats (台灣小吃). These little shops (you cannot even call them restaurants) that line alleyways and markets have some of the most delicious food (though lacking amenities). And if you’re in Taichung (台中), you typically will continue that trend, if for no other reason than that it has the largest night market in Taiwan (Fengjia Night Market – 逢甲夜市). Yet through all the food chaos, sometimes (albeit only very occasionally) a sit-down restaurant is in order. You still want good food, but with service, a place to actually sit down,…
