<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Denny Lee &#187; dennyglee</title>
	<atom:link href="http://dennyglee.com/author/dennyglee/feed/" rel="self" type="application/rss+xml" />
	<link>http://dennyglee.com</link>
	<description>Ramblings of a data dork: from BI and Big Data to Travel and Food</description>
	<lastBuildDate>Fri, 03 Feb 2012 15:01:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='dennyglee.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Denny Lee &#187; dennyglee</title>
		<link>http://dennyglee.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://dennyglee.com/osd.xml" title="Denny Lee" />
	<atom:link rel='hub' href='http://dennyglee.com/?pushpress=hub'/>
		<item>
		<title>Foodie Friday: Vancouver&#8217;s Banana Leaf Malaysian Cuisine Restaurant</title>
		<link>http://dennyglee.com/2012/02/03/foodie-friday-vancouvers-banana-leaf-malaysian-cuisine-restaurant/</link>
		<comments>http://dennyglee.com/2012/02/03/foodie-friday-vancouvers-banana-leaf-malaysian-cuisine-restaurant/#comments</comments>
		<pubDate>Fri, 03 Feb 2012 15:00:00 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[Foodie Friday]]></category>
		<category><![CDATA[Malaysian]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=841</guid>
		<description><![CDATA[Braised Lamb Shank in Cumin and Star Anise Photo credit goes to Steph L. on Yelp Vancouver regularly ranks near the top of liveable-city lists (for a while, it was #1) &#8211; the temperate environment, beautiful scenery, and general aura of the city are just amazing.  Just as awesome is the city’s quality and diversity [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dennyglee.com&amp;blog=10400510&amp;post=841&amp;subd=dennyglee&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://dennyglee.files.wordpress.com/2012/01/braised-lamb-shank.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;float:left;padding-top:0;border-width:0;" title="braised lamb shank" src="http://dennyglee.files.wordpress.com/2012/01/braised-lamb-shank_thumb.png?w=337&#038;h=279" alt="braised lamb shank" width="337" height="279" align="left" border="0" /></a></p>
<blockquote><p>Braised Lamb Shank in Cumin and Star Anise</p>
<p><a href="http://www.yelp.ca/biz_photos/eUf08fT2qf-l8Ofi--sETw?select=4opehz8O29a1YM3_iz0BpQ" target="_blank">Photo credit goes to Steph L. on Yelp</a></p></blockquote>
<p>Vancouver regularly ranks near the top of liveable-city lists (for a while, it was #1) &#8211; the temperate environment, beautiful scenery, and general aura of the city are just amazing.  Just as awesome is the city’s quality and diversity of great food.</p>
<h3>Amazing Malaysian Cuisine</h3>
<p>One of those places is most certainly <a href="http://bananaleaf-vancouver.com/" target="_blank"><strong>Banana Leaf Malaysian Cuisine</strong></a>.  As of this post, they have four restaurants in the Vancouver area – I have personally been to both the one on Denman and the one in Kits – both are amazingly good.</p>
<p>Normally, I’m one to try all sorts of food – I rather pride myself on my willingness to eat all sorts of things that people would find … weird.  In this, I am definitely Chinese.   <a href="http://dennyglee.files.wordpress.com/2012/01/thecorruptor.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;float:right;padding-top:0;border:0;" title="thecorruptor" src="http://dennyglee.files.wordpress.com/2012/01/thecorruptor_thumb.png?w=188&#038;h=260" alt="thecorruptor" width="188" height="260" align="right" border="0" /></a></p>
<p>As noted in the movie “The Corruptor” (not the best movie even with the great actors Chow Yun-Fat and Mark Wahlberg):</p>
<blockquote><p>&#8220;You wanna be Chinese, you gotta eat the nasty stuff&#8221;</p></blockquote>
<p>Fortunately, this is certainly NOT the case for Banana Leaf – <strong>aromatic, bold, delectable, and flavourful</strong> (not salty – and yes, I did spell flavour the Canadian way) are the words that come to mind.</p>
<p>While there are many great dishes – my personal call outs are:</p>
<p><strong>Braised Lamb Shank in Cumin &amp; Star Anise</strong>: Just as the menu describes it, the lamb shank is so tender it literally falls off the bone.  Amazingly good with just the right amount of spices to bring out the flavour of the lamb instead of overpowering it.  The lamb shank is cooked just right to ensure that the meat is amazingly tender (overcook lamb, and you’ve got yourself one awesome piece of rubber).  <em>If you order nothing else, this IS the dish you order. Period.</em></p>
<p><strong>Pineapple Fried Rice with Seafood &amp; Chicken: </strong>I almost never, ever, ever, ever, ever, …, ever order fried rice from an Asian restaurant…ever.  Often it’s too much soy sauce, too much MSG, added ketchup (seriously WTF, fried rice made with ketchup!!!), etc.  This is seriously not the case here.  Served in a scooped out pineapple – the fried rice is lightly sweet with solid helpings of seafood and chicken.  Quite filling – in a good way!</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/pisang.jpg"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;float:left;padding-top:0;border:0;" title="pisang" src="http://dennyglee.files.wordpress.com/2012/01/pisang_thumb.jpg?w=195&#038;h=241" alt="pisang" width="195" height="241" align="left" border="0" /></a></p>
<p>And the final call out of course goes to dessert.  I certainly have what one would call a “sweet tooth”.  And the <strong>Pisang Goreng</strong> is my current dessert choice du jour.  Crispy fried banana, vanilla ice cream, crushed peanuts, and gula melaka (which I found out from Wikipedia is palm sugar) – what’s not to love!</p>
<p>So the next time you’re in Vancouver and you’re up for some good Malaysian cuisine – or just good food in general – check it out.  I leave you with the wisdom of George Bernard Shaw:</p>
<blockquote><p>There is no love sincerer than the love of food.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/02/03/foodie-friday-vancouvers-banana-leaf-malaysian-cuisine-restaurant/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/braised-lamb-shank_thumb.png" medium="image">
			<media:title type="html">braised lamb shank</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/thecorruptor_thumb.png" medium="image">
			<media:title type="html">thecorruptor</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/pisang_thumb.jpg" medium="image">
			<media:title type="html">pisang</media:title>
		</media:content>
	</item>
		<item>
		<title>Moving data to compute or compute to data? That is the Big Data question</title>
		<link>http://dennyglee.com/2012/01/31/moving-data-to-compute-or-compute-to-data-that-is-the-big-data-question/</link>
		<comments>http://dennyglee.com/2012/01/31/moving-data-to-compute-or-compute-to-data-that-is-the-big-data-question/#comments</comments>
		<pubDate>Tue, 31 Jan 2012 15:00:09 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[BigData]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Chemistry]]></category>
		<category><![CDATA[Scale-Out]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=883</guid>
		<description><![CDATA[Dorky attempts at geek Shakespeare aside; as the volume, complexity, and variability of your data systems increase in … entropy …, this becomes a fundamental question in whether one scales up or scales out their data problem. Apologies for the nerdy chemistry references in advance – which starts with this picture of Dr. Arthur Grosser [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dennyglee.com&amp;blog=10400510&amp;post=883&amp;subd=dennyglee&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://assassinscreed.wikia.com/wiki/Arthur_Grosser"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;float:right;padding-top:0;border-width:0;" title="Arthur_Grosser" src="http://dennyglee.files.wordpress.com/2012/01/arthur_grosser.png?w=150&#038;h=187" alt="Arthur_Grosser" width="150" height="187" align="right" border="0" /></a>Dorky attempts at geek Shakespeare aside; as the volume, complexity, and variability of your data systems increase in … entropy …, this becomes a fundamental question in whether one scales up or scales out their data problem.</p>
<blockquote><p>Apologies for the nerdy chemistry references in advance – which starts with this picture of Dr. Arthur Grosser (more later)</p></blockquote>
<p>As noted in the previous post <a href="http://wp.me/pHDEa-dM" target="_blank">Scale Up or Scale Out your Data Problems? A Space Analogy</a>, the decision to scale up or scale out your data problem is a key facet in your Big Data problem.  But just as important as the ability to distribute the data across commoditized hardware, another key facet is the <strong>movement of data</strong>.</p>
<p>Latencies (i.e. slower performance) are introduced when you need to move data from one location to another.  Within the data world, you can address this either by moving the data faster (e.g. compression, delta transfer, faster connectivity, etc.) or by designing a system that reduces the need to move the data in the first place (i.e. choosing between moving data to compute and moving compute to data).</p>
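<p>To put rough numbers on the "move it faster" option, here is a back-of-the-envelope sketch. The function name, the 1 TB size, the 10 Gbit/s link, and the 4x compression ratio are all made-up assumptions for illustration, and the estimate ignores protocol overhead and the CPU cost of compressing.</p>

```python
def transfer_seconds(size_gb, bandwidth_gbit_per_s, compression_ratio=1.0):
    """Estimate the seconds needed to push size_gb gigabytes over a link.

    Hypothetical helper: ignores protocol overhead and compression CPU time.
    """
    size_gbit = size_gb * 8 / compression_ratio  # gigabytes to gigabits, then shrink
    return size_gbit / bandwidth_gbit_per_s


# Assumed example: 1 TB (1000 GB) over a 10 Gbit/s link.
raw = transfer_seconds(1000, 10)
compressed = transfer_seconds(1000, 10, compression_ratio=4.0)
print(raw)         # 800.0
print(compressed)  # 200.0
```

<p>Even with generous compression you are still waiting minutes per terabyte, which is why the second option (not moving the data at all) becomes attractive as volumes grow.</p>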
<h3>Scaling Up the Problem / Moving Data to Compute</h3>
<p>To help describe the problem, the diagram below is a representation of a scale up traditional RDBMS.  The silver database boxes on the left represent the database servers (each with blue platters representing local disks), the box with 9 blue platters represents a disk array (e.g. SAN, DAS, etc.), the blue arrows represent fiber channel connections (between the server and disk array), and the green arrows represent the network connectivity.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image18.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb18.png?w=412&#038;h=298" alt="image" width="412" height="298" border="0" /></a></p>
<p>In an optimized scale up RDBMS, we often will set up DAS or SANs to quickly transfer data from the disk array to the RDBMS server or compute node (often allocating the local disk for the compute node to hold temp/backup/cache files).  This setup works great provided that you can <strong>ensure low latencies</strong>.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image19.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb19.png?w=406&#038;h=289" alt="image" width="406" height="289" border="0" /></a></p>
<p>And this is where things can get complicated, because if you were to lose disks on the array and/or fiber channel connectivity to the disk array – the RDBMS would go offline.  But as described in the above diagram, perhaps you set up active clustering so the secondary RDBMS can take over.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image20.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb20.png?w=406&#038;h=295" alt="image" width="406" height="295" border="0" /></a></p>
<p>Yet, if you were to lose network connectivity (e.g. the secondary RDBMS is not aware the primary is offline) or lose fiber channel connectivity, you would also lose the secondary.</p>
<p><strong><span style="text-decoration:underline;">The Importance of ACID</span></strong></p>
<p>It is important to note that many RDBMS systems have features or designs that work around these problems.  But to ensure <em>availability</em> and <em>redundancy</em>, it often requires more expensive hardware to work around the problematic network and disk failure points.</p>
<p>As well, this is not to say that RDBMSs are badly designed – they are designed with <strong>ACID</strong> in mind – atomicity, consistency, isolation, and durability – to <span style="text-decoration:underline;">guarantee the reliability and robustness of database transactions</span> (for more info, check out the Wikipedia entry: <a href="http://en.wikipedia.org/wiki/ACID" target="_blank">ACID</a>).</p>
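<p>For a feel of what the "A" in ACID buys you, here is a minimal sketch using Python's built-in <code>sqlite3</code> module. The <code>accounts</code> table, the names, and the <code>transfer</code> helper are hypothetical, and SQLite stands in for whichever RDBMS you actually run.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one transaction: both happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so nothing changes
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

<p>The second transfer credits bob before the debit fails, yet bob's balance is untouched afterwards. That all-or-nothing behaviour is the guarantee the scale up world is built around.</p>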
<h3>Scaling Out the Problem / Moving Compute to Data</h3>
<p>In a scale out or distributed solution, the idea is to have many commodity servers; there are many points of failure but there are also many paths for success.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image21.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb21.png?w=469&#038;h=255" alt="image" width="469" height="255" border="0" /></a></p>
<p>Key to a distributed system is that as data comes in (the blue file icon on the right represents data such as web logs), the data is <span style="text-decoration:underline;">distributed and replicated</span> in chunks to many nodes within the cluster.  In the case of Hadoop, files are broken into 64MB / 128MB chunks and each of these chunks is placed into three different locations (if you set the replication factor to 3).</p>
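<p>The chunk-and-replicate idea can be sketched in a few lines. Note the assumptions: <code>place_chunks</code> and the node names are invented for illustration, and the round-robin placement is a simplification, not Hadoop's actual placement policy.</p>

```python
import math


def place_chunks(file_size_mb, nodes, chunk_mb=64, replication=3):
    """Split a file into fixed-size chunks and give each chunk `replication` distinct nodes.

    Round-robin placement is an illustrative simplification, not what HDFS really does.
    """
    if replication not in range(1, len(nodes) + 1):
        raise ValueError("replication must be between 1 and the number of nodes")
    chunk_count = math.ceil(file_size_mb / chunk_mb)
    placement = {}
    for chunk_id in range(chunk_count):
        # Take `replication` consecutive nodes, wrapping around the end of the list.
        placement[chunk_id] = [
            nodes[(chunk_id + offset) % len(nodes)] for offset in range(replication)
        ]
    return placement


nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_chunks(200, nodes)  # a 200 MB file becomes 4 chunks of up to 64 MB
for chunk_id, replicas in placement.items():
    print(chunk_id, replicas)
```

<p>Every chunk ends up on three different nodes, so the 200 MB file occupies roughly 600 MB of raw disk across the cluster.</p>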
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image22.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb22.png?w=474&#038;h=251" alt="image" width="474" height="251" border="0" /></a></p>
<p>While you are using more disk space to replicate the data, now that you have placed the data into the system, you have ensured <em>redundancy</em> by replicating the data within it.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image23.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb23.png?w=474&#038;h=247" alt="image" width="474" height="247" border="0" /></a></p>
<p>What is great about these types of distributed systems is that they are designed right from the beginning to handle latency issues, whether they be disk or network connectivity problems or outright losing a node.  In the above diagram, a user is requesting data, but there is a loss to some disks and some network connections.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image24.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb24.png?w=476&#038;h=250" alt="image" width="476" height="250" border="0" /></a></p>
<p>Nevertheless, there are other nodes that do have network connectivity and the data has been replicated so it is available.    Systems that are designed to scale out and distribute like Hadoop can ensure <em>availability</em> of the data and will complete the query just as long as the data exists (it may take longer if nodes are lost, but the query will be completed).</p>
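<p>The "query completes as long as the data exists" claim can be checked with a toy model. The placement map and node names below are assumptions for illustration; a real cluster tracks this in its metadata service.</p>

```python
def readable_chunks(placement, failed_nodes):
    """Map each chunk to its surviving replicas; an empty list means the chunk is lost."""
    failed = set(failed_nodes)
    return {
        chunk_id: [node for node in replicas if node not in failed]
        for chunk_id, replicas in placement.items()
    }


def query_can_complete(placement, failed_nodes):
    """A full scan can finish only if every chunk still has at least one live replica."""
    return all(readable_chunks(placement, failed_nodes).values())


# Hypothetical cluster: three chunks, each replicated on three nodes.
placement = {
    0: ["node1", "node2", "node3"],
    1: ["node2", "node3", "node4"],
    2: ["node3", "node4", "node5"],
}

print(query_can_complete(placement, ["node2", "node3"]))           # True
print(query_can_complete(placement, ["node1", "node2", "node3"]))  # False
```

<p>Losing two nodes still leaves a copy of every chunk, so the query finishes (just with fewer nodes doing the work). Lose all three replicas of a chunk and no amount of cleverness brings it back.</p>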
<p><strong><span style="text-decoration:underline;">The importance of BASE</span></strong></p>
<p>By using many commodity boxes, you distribute and replicate your data to multiple systems.  But as there are many moving parts, distributed systems like these <span style="text-decoration:underline;">cannot</span> ensure the reliability and robustness of database transactions.  Instead, they fall under the domain of <strong>eventual consistency</strong> where over a period of time (i.e. eventually) the data within the entire system will be consistent (e.g. all data modifications will be replicated throughout the cluster).  This concept is also known as BASE (as opposed to ACID) – Basically Available, Soft State, Eventually Consistent.  For more information, check out the Wikipedia reference: <a href="http://en.wikipedia.org/wiki/Eventual_consistency">Eventual Consistency</a>.</p>
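<p>Here is a toy illustration of eventual consistency. The <code>Replica</code> class and <code>sync</code> function are hypothetical, and the last-writer-wins rule with a single version counter is a deliberate simplification that assumes all writes land on one replica; real systems need much more care with concurrent writes.</p>

```python
class Replica:
    """One copy of a single value, tagged with a version counter."""

    def __init__(self):
        self.value = None
        self.version = 0

    def write(self, value):
        self.version += 1
        self.value = value


def sync(replicas):
    """Background replication: every replica adopts the newest version (last writer wins)."""
    newest = max(replicas, key=lambda replica: replica.version)
    for replica in replicas:
        replica.value, replica.version = newest.value, newest.version


replicas = [Replica(), Replica(), Replica()]
replicas[0].write("clicks=42")  # the write lands on one replica first

before = [replica.value for replica in replicas]
sync(replicas)                  # some time later, replication catches up
after = [replica.value for replica in replicas]

print(before)  # ['clicks=42', None, None]
print(after)   # ['clicks=42', 'clicks=42', 'clicks=42']
```

<p>Between the write and the sync, a reader hitting the second or third replica sees stale data. That window is exactly the trade-off BASE accepts in exchange for availability.</p>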
<h3>Discussion</h3>
<p>Similar to the post <a href="http://wp.me/pHDEa-dM" target="_blank">Scale Up or Scale Out your Data Problems? A Space Analogy</a>, choosing whether ACID or BASE works for you is not a matter of which one to use – but which one to use when.  For example, as noted in the post <a href="http://blogs.msdn.com/b/sqlcat/archive/2011/11/15/what-s-so-big-about-big-data.aspx">What’s so BIG about “Big Data”?</a>, the Yahoo! Analysis Services cube is <strong>24 TB</strong> (certainly a case of moving data to compute, given my obsession with random IO in SSAS) and the source of this cube is <strong>2 PB</strong> of data from a huge Hadoop cluster (moving compute to data).</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/yahoo-hadoop-to-cube.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="Yahoo Hadoop to Cube" src="http://dennyglee.files.wordpress.com/2012/01/yahoo-hadoop-to-cube_thumb.png?w=431&#038;h=195" alt="Yahoo Hadoop to Cube" width="431" height="195" border="0" /></a></p>
<p>Each one has its own set of issues – scaling out increases the complexity of maintaining so many nodes, scaling up becomes more expensive to ensure availability and reliability, etc.   It will be important to understand the pros/cons of each type – often it will be a combination of these two.   Another great example can be seen in Dave Mariani (@mariani)’s post: <a href="http://corp.klout.com/blog/2011/11/big-data-bigger-brains/">Big Data, Bigger Brains</a> at Klout’s blog.</p>
<blockquote><p>ACID and BASE each have their own set of problems; the good news is that mixing them together often neutralizes the problems.</p></blockquote>
<p>&#8212;</p>
<h3>Okay, what’s with the picture of Dr. Arthur Grosser?</h3>
<table width="497" border="0" cellspacing="0" cellpadding="10">
<tbody>
<tr>
<td valign="top" width="132"><a href="http://dennyglee.files.wordpress.com/2012/01/dork.jpg"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="dork" src="http://dennyglee.files.wordpress.com/2012/01/dork_thumb.jpg?w=114&#038;h=130" alt="dork" width="114" height="130" border="0" /></a></td>
<td valign="top" width="363">Oh, Dr. Arthur Grosser is an actor whose filmography includes Assassin’s Creed II, Splinter Cell, and the 90s TV show Urban Angel. But more importantly – to me anyway – he was my <strong>chemistry</strong> professor at McGill University. He was a great professor, able to balance deep academic research and learning with making chemistry fun and entertaining. He also showed me (and I think many other students) that dorky and nerdy could still be cool.</td>
</tr>
</tbody>
</table>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/31/moving-data-to-compute-or-compute-to-data-that-is-the-big-data-question/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/arthur_grosser.png" medium="image">
			<media:title type="html">Arthur_Grosser</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb18.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb19.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb20.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb21.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb22.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb23.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb24.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/yahoo-hadoop-to-cube_thumb.png" medium="image">
			<media:title type="html">Yahoo Hadoop to Cube</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/dork_thumb.jpg" medium="image">
			<media:title type="html">dork</media:title>
		</media:content>
	</item>
		<item>
		<title>Cool Hadoop on Azure How To Videos by @bradoop</title>
		<link>http://dennyglee.com/2012/01/26/cool-hadoop-on-azure-how-to-videos-by-bradoop/</link>
		<comments>http://dennyglee.com/2012/01/26/cool-hadoop-on-azure-how-to-videos-by-bradoop/#comments</comments>
		<pubDate>Thu, 26 Jan 2012 07:13:31 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[BigData]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Azure]]></category>
		<category><![CDATA[Hadoop]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=892</guid>
		<description><![CDATA[A big shout out to Brad Sarsfield (@bradoop) for creating these great How-To videos for Hadoop on Azure. &#160; How To: Upload Data and Use the WordCount Sample with Hadoop Services for Windows Azure (video) &#160; &#160; Run the Pi Estimator Sample on Hadoop on Windows Azure (video)<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dennyglee.com&amp;blog=10400510&amp;post=892&amp;subd=dennyglee&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>A big shout out to Brad Sarsfield (@bradoop) for creating these great How-To videos for Hadoop on Azure.</p>
<p>&nbsp;</p>
<p><strong>How To: Upload Data and Use the WordCount Sample with Hadoop Services for Windows Azure (video)</strong></p>
<div id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:35eb250e-1560-432e-ada6-59e1996fb25f" class="wlWriterEditableSmartContent" style="display:inline;float:none;margin:0;padding:0;">
<div><span style="text-align:center; display: block;"><a href="http://dennyglee.com/2012/01/26/cool-hadoop-on-azure-how-to-videos-by-bradoop/"><img src="http://img.youtube.com/vi/BvssSmYJBoc/2.jpg" alt="" /></a></span></div>
</div>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><strong>Run the Pi Estimator Sample on Hadoop on Windows Azure (video)</strong></p>
<div id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:36dc539a-fd73-40d7-90f8-05aed6f74861" class="wlWriterEditableSmartContent" style="display:inline;float:none;margin:0;padding:0;">
<div><span style="text-align:center; display: block;"><a href="http://dennyglee.com/2012/01/26/cool-hadoop-on-azure-how-to-videos-by-bradoop/"><img src="http://img.youtube.com/vi/w0BpLawwmKI/2.jpg" alt="" /></a></span></div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/26/cool-hadoop-on-azure-how-to-videos-by-bradoop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>
	</item>
		<item>
		<title>Scale Up or Scale Out your Data Problems? A Space Analogy</title>
		<link>http://dennyglee.com/2012/01/24/scale-up-or-scale-out-your-data-problems-a-space-analogy/</link>
		<comments>http://dennyglee.com/2012/01/24/scale-up-or-scale-out-your-data-problems-a-space-analogy/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 15:00:01 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[BigData]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Scale-Out]]></category>
		<category><![CDATA[Space]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=854</guid>
		<description><![CDATA[As I am writing more about Big Data, I’ve been asked whether we need to have traditional relational or cube systems now that we have Big Data / NoSQL / Hadoop.  My responses are to note that these are different systems that serve different purposes even though both are used to better understand data. But [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dennyglee.com&amp;blog=10400510&amp;post=854&amp;subd=dennyglee&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>As I am writing more about Big Data, I’ve been asked whether we need to have traditional relational or cube systems now that we have Big Data / NoSQL / Hadoop.  My responses are to note that these are different systems that serve different purposes even though both are used to better understand data.</p>
<p>But before we dive into the specifics surrounding relational databases compared to Hadoop / Big Data, we need to first talk about the differences between solving a data problem by <strong>scaling up the problem or scaling it out</strong>.</p>
<p>One way to understand the difference is to use a space analogy.  (If you’re part of the SQL Twitter community, you’ll notice a prevalence of NASA and Space tweets)</p>
<h3>Scale Up Space Analogy</h3>
<p><a href="http://hubblesite.org/gallery/album/galaxy/pr2006046a/xlarge_web/" target="_blank"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="Antennae Galaxies" src="http://dennyglee.files.wordpress.com/2012/01/antennae-galaxies.png?w=544&#038;h=265" alt="Antennae Galaxies" width="544" height="265" border="0" /></a></p>
<p>The amazing image above is the <em>Antennae Galaxies / NGC 4038-4039</em> from the Hubble Telescope. Click on the image to see the full extra-large version; it is <em>magnificent</em>.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/database.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;float:right;padding-top:0;border:0;" title="Database" src="http://dennyglee.files.wordpress.com/2012/01/database_thumb.png?w=101&#038;h=105" alt="Database" width="101" height="105" align="right" border="0" /></a></p>
<p>In terms of space analogies, the Hubble Telescope is analogous to a <strong>scale up</strong> technology such as your traditional relational database.</p>
<ul>
<li>It often utilizes <span style="text-decoration:underline;">non-commodity</span> hardware (or if it is commodity, it’s <em>enterprise commodity</em> hardware).</li>
<li><span style="text-decoration:underline;">Specialized equipment</span> was designed and built for the Hubble telescope.  While not as astronomic (!) in price, enterprise database performance requires more specialized (and expensive) hardware.</li>
<li>The Hubble telescope in itself is a <span style="text-decoration:underline;">single point of failure</span> in that if we were to lose the telescope (or a lens), we would lose the ability to get all of these amazing images.</li>
</ul>
<p>This is <strong>NOT</strong> to say scale up is a bad thing. After all, we get these amazing images from the Hubble telescope precisely because we (well, NASA scientists) have focused on creating Hubble with non-commodity specialized hardware.  Following the same analogy, relational database systems (or cube systems) also benefit from this scale up approach because it becomes possible to provide query results quickly.</p>
<h3>Scale Out Space Analogy</h3>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/galaxies-to-seti.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="Galaxies to SETI" src="http://dennyglee.files.wordpress.com/2012/01/galaxies-to-seti_thumb.png?w=519&#038;h=246" alt="Galaxies to SETI" width="519" height="246" border="0" /></a></p>
<p>The flow of the above image is my representation of the Search for Extraterrestrial Intelligence (SETI) project.  SETI itself has the ultimate example of scale out distribution with its <a href="http://setiathome.ssl.berkeley.edu/" target="_blank"><strong>SETI@Home</strong></a> project.  The search for extraterrestrial intelligence:</p>
<ul>
<li>Begins with the search for <em>radio waves</em> throughout the galaxies.  The image above is that of the <a href="http://hubblesite.org/hubble_discoveries/10th/photos/graphics/slide29high.jpg" target="_blank">Giant Galactic Nebula NGC 3603</a> (also from the Hubble telescope).</li>
<li>Those radio waves are detected by radio observatories at various locations on the planet.</li>
<li>All of this data is stored and initially crunched by supercomputers like those at the Barcelona Supercomputing Center.</li>
<li>Yet a lot of this crunching is done by <a href="http://boinc.berkeley.edu/" target="_blank">BOINC-based projects</a> like <a href="http://setiathome.ssl.berkeley.edu/" target="_blank">SETI@Home</a> – i.e. making use of screensavers on one’s home machine.</li>
</ul>
<p>That is, a sizeable chunk of the search for radio waves in the astronomical heavens is being done by home computer screensavers – around <em>5.2 million participants</em> delivering 769 teraFLOPS of processing power (as of 11/14/2009)!</p>
<p>The key facets of commoditized distributed computing are then:</p>
<ul>
<li>The problems can be broken down into small enough chunks so they can be <span style="text-decoration:underline;">distributed and calculated locally</span>.</li>
<li>The system is designed &#8211; right from the beginning &#8211; to <span style="text-decoration:underline;">engage with hundreds or thousands of machines</span>.</li>
<li>The system can easily handle <span style="text-decoration:underline;">many points of failure transparently</span>.  Auto-replication is one of the ways to prevent this from being a problem.</li>
</ul>
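<p>To make the replication point a little more concrete, here is a minimal Python sketch. It is an illustration only, not how Hadoop or BOINC actually place data: it assumes a simple round-robin placement, and the names place_replicas and readable_blocks, the node labels, and the replication factor of 3 are all hypothetical.</p>

```python
from itertools import cycle, islice


def place_replicas(block_ids, nodes, replication_factor=3):
    """Assign each block to `replication_factor` distinct nodes, round-robin.

    Assumption: a naive round-robin layout chosen for illustration; real
    systems use smarter (e.g. topology-aware) placement policies.
    """
    if replication_factor > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for index, block_id in enumerate(block_ids):
        start = index % len(nodes)
        # Take consecutive nodes starting at `start`, wrapping around the list.
        holders = islice(cycle(nodes), start, start + replication_factor)
        placement[block_id] = list(holders)
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have a replica on a healthy node."""
    failed = set(failed_nodes)
    return {
        block_id
        for block_id, holders in placement.items()
        if set(holders) - failed
    }
```

<p>With three copies of each block, any two machines can fail and every block is still readable somewhere; that is the sense in which failures are handled transparently.</p>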
<p><a href="http://dennyglee.files.wordpress.com/2012/01/distro.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;float:right;padding-top:0;border:0;" title="distro" src="http://dennyglee.files.wordpress.com/2012/01/distro_thumb.png?w=135&#038;h=125" alt="distro" width="135" height="125" align="right" border="0" /></a></p>
<p>BOINC projects like <a href="http://setiathome.ssl.berkeley.edu/" target="_blank">SETI@Home</a> are designed to handle the problems associated with distributed computing – network latency, loss of connectivity, tracking tasks, tracking jobs, etc. The fundamental requirement is the ability to break the problem down into small enough chunks so data can be easily transferred, processed, and transferred back – while keeping track of all of those chunks to ensure that data processing has been completed.</p>
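<p>The chunk-and-track idea can be sketched in a few lines of Python. This is a simplified, single-machine illustration that uses a thread pool as a stand-in for remote volunteer machines; it is not the BOINC API, and names such as run_distributed and analyze_chunk (and the "strongest sample" analysis) are hypothetical.</p>

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def split_into_chunks(samples, chunk_size):
    """Break one large job into small, independently processable work units."""
    return {
        chunk_id: samples[start:start + chunk_size]
        for chunk_id, start in enumerate(range(0, len(samples), chunk_size))
    }


def analyze_chunk(chunk):
    """Stand-in for the real analysis: return the strongest sample."""
    return max(chunk)


def run_distributed(samples, chunk_size=4, workers=3, max_attempts=3,
                    process=analyze_chunk):
    """Process every chunk, re-sending any chunk whose worker failed."""
    chunks = split_into_chunks(samples, chunk_size)
    attempts = {chunk_id: 0 for chunk_id in chunks}
    pending = set(chunks)   # chunks we have not yet received a result for
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            futures = {}
            for chunk_id in pending:
                attempts[chunk_id] += 1
                futures[pool.submit(process, chunks[chunk_id])] = chunk_id
            for future in as_completed(futures):
                chunk_id = futures[future]
                try:
                    results[chunk_id] = future.result()
                    pending.discard(chunk_id)
                except Exception:
                    # A worker dropped out: keep the chunk pending so it is
                    # handed out again, unless it has failed too many times.
                    if attempts[chunk_id] == max_attempts:
                        raise
    return results
```

<p>The important part is the bookkeeping: every chunk stays in the pending set until its result comes back, so a lost connection just means the chunk is handed out again.</p>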
<p><a href="http://dennyglee.files.wordpress.com/2012/01/elephant.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;float:left;padding-top:0;border:0;" title="Elephant" src="http://dennyglee.files.wordpress.com/2012/01/elephant_thumb.png?w=102&#038;h=85" alt="Elephant" width="102" height="85" align="left" border="0" /></a></p>
<p>Bringing this back to data systems, Hadoop and distributed data systems are able to take the problem and distribute it across tens / hundreds / thousands of machines with ease.  This is because they were designed with the idea of distributed processing in the first place (e.g. replication, fault tolerance, task restart-ability, etc.).</p>
<h3>Discussion</h3>
<p>There are many more concepts that need to be covered when we really dive into relational databases compared to Hadoop / Big Data systems.  But the fundamental concept to start with is that of “scale up” and “scale out”.</p>
<p>As you can see with the Hubble Telescope / SETI projects analogy – both are important and both solve their respective problems in different ways.  This doesn’t mean one is right and one is wrong – this is really more about the adage of “Use the right tool for the right problem”.   After all, it would be really hard to get the amazing images of the Hubble telescope by using hundreds or thousands of <a href="http://www.costco.com/Common/Category.aspx?whse=BC&amp;Ne=4000000&amp;eCat=BC|111|22172|1564&amp;N=4007953&amp;pos=1&amp;Nr=P_CatalogName:BC&amp;cat=1564&amp;Ns=P_Price|1||P_SignDesc1&amp;lang=en-US&amp;ec=BC-EC11006-Cat22172&amp;topnav=" target="_blank">smaller commodity telescopes from Costco</a>.   Nor would it be possible for a single powerful telescope to examine all of the radio waves from the observatories on Earth.</p>
<p>So when it comes to scaling up or scaling out for data problems – it’s not about which one to use, it&#8217;s about which one to use <span style="text-decoration:underline;">when</span>.</p>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/24/scale-up-or-scale-out-your-data-problems-a-space-analogy/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/antennae-galaxies.png" medium="image">
			<media:title type="html">Antennae Galaxies</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/database_thumb.png" medium="image">
			<media:title type="html">Database</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/galaxies-to-seti_thumb.png" medium="image">
			<media:title type="html">Galaxies to SETI</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/distro_thumb.png" medium="image">
			<media:title type="html">distro</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/elephant_thumb.png" medium="image">
			<media:title type="html">Elephant</media:title>
		</media:content>
	</item>
		<item>
		<title>Sunny Sunday: Tofino</title>
		<link>http://dennyglee.com/2012/01/23/sunny-sunday-tofino/</link>
		<comments>http://dennyglee.com/2012/01/23/sunny-sunday-tofino/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 09:43:40 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[Foodie Friday]]></category>
		<category><![CDATA[Sunny Sunday]]></category>
		<category><![CDATA[Tofino]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=862</guid>
		<description><![CDATA[This is a picture of an isolated beach in the wonderful town of Tofino (yes, I actually took it!).&#160; Located on the west coast of Vancouver Island – if you are a surfer, camper, hiker, or just a plain old nature lover – this is a beautiful place to hang out.&#160; Drive up from Victoria to Nanaimo [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://dennyglee.files.wordpress.com/2012/01/tofinoisolatedbeach.jpg"><img style="background-image:none;border-bottom:0;border-left:0;padding-left:0;padding-right:0;display:inline;border-top:0;border-right:0;padding-top:0;" title="TofinoIsolatedBeach" border="0" alt="TofinoIsolatedBeach" src="http://dennyglee.files.wordpress.com/2012/01/tofinoisolatedbeach_thumb.jpg?w=325&#038;h=499" width="325" height="499" /></a></p>
<p>This is a picture of an isolated beach in the wonderful town of <strong>Tofino</strong> (yes, I actually took it!).&#160; Located on the west coast of Vancouver Island – if you are a surfer, camper, hiker, or just a plain old nature lover – this is a beautiful place to hang out.&#160; Drive up from Victoria to Nanaimo (yes, of the famed Nanaimo bars) and then cut through Vancouver Island to its west coast – it is a wonderfully scenic drive (this from a person who doesn’t like driving).&#160; </p>
<p>For more information on Tofino, check out <a href="http://en.wikipedia.org/wiki/Tofino,_British_Columbia" target="_blank">Tofino’s Wikipedia page</a> – and check out <a href="http://www.bing.com/images/search?q=tofino+british+columbia&amp;qpvt=tofino+british+columbia&amp;FORM=IGRE" target="_blank">Bing’s images of Tofino</a>.&#160; Oh, and if you go there – I highly suggest the local seafood joint <a href="http://www.schoonerrestaurant.ca/" target="_blank">The Schooner Restaurant</a>.&#160; </p>
<p>—</p>
<p>About “Sunny Sunday”: The Sunny Sunday blog posts are photos from various travel and/or outdoor (hiking) trips.</p>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/23/sunny-sunday-tofino/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/tofinoisolatedbeach_thumb.jpg" medium="image">
			<media:title type="html">TofinoIsolatedBeach</media:title>
		</media:content>
	</item>
		<item>
		<title>Connecting PowerPivot to Hadoop on Azure &#8211; Self Service BI to Big Data in the Cloud</title>
		<link>http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/</link>
		<comments>http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/#comments</comments>
		<pubDate>Sat, 21 Jan 2012 05:57:49 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[BigData]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[PowerPivot]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Excel]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[hive]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=823</guid>
		<description><![CDATA[“I caught a fish thiiiiis biiig” &#8211; On stage with Ted Kummert during the PASS 2011 Keynote on Big Data (thanks to Karen Lopez @datachick for the pic). During the PASS 2011 Keynote (back in October 2011), I had the honor to demo Hadoop on Windows / Azure.   One of the key [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://dennyglee.files.wordpress.com/2012/01/pass-2011-keynote-isotope.jpg"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-color:initial;border-style:initial;border-width:0;" title="PASS 2011 Keynote Isotope" src="http://dennyglee.files.wordpress.com/2012/01/pass-2011-keynote-isotope_thumb.jpg?w=353&#038;h=277" alt="PASS 2011 Keynote Isotope" width="353" height="277" align="left" border="0" /></a><br />
<span style="color:#ffffff;">.</span></p>
<blockquote><p>“I caught a fish thiiiiis biiig”</p>
<p>&#8211; On stage with Ted Kummert during the PASS 2011 Keynote on Big Data (thanks to Karen Lopez @datachick for the pic)</p></blockquote>
<p><span style="color:#ffffff;">.</span><br />
<span style="color:#ffffff;">.</span><br />
During the <a href="http://www.sqlpass.org/summit/2011/Live/LiveStreaming/LiveStreamingWednesday.aspx" target="_blank">PASS 2011 Keynote</a> (back in October 2011), I had the honor to demo Hadoop on Windows / Azure.   One of the key showcases during that presentation was connecting PowerPivot to Hadoop on Windows.  In this post, I walk through the steps to connect PowerPivot to Hadoop on Azure.</p>
<h3>Pre-requisites</h3>
<ul>
<li><a href="http://powerpivot.com" target="_blank">PowerPivot for Excel</a> (as of this post, using SQL Server 2012 RC1 version)</li>
<li>Access to <a href="http://hadooponazure.com" target="_blank">Hadoop on Azure CTP</a></li>
</ul>
<p><span style="color:#ffffff;">.</span></p>
<h3>Configuration Steps</h3>
<p><strong>1) Reference the following steps from <a href="http://social.technet.microsoft.com/wiki/contents/articles/how-to-connect-excel-to-hadoop-on-azure-via-hiveodbc.aspx" target="_blank">How To Connect Excel to Hadoop on Azure via HiveODBC</a></strong></p>
<p>The steps to follow are:</p>
<ul>
<li><span style="text-decoration:underline;">Install the HiveODBC Driver</span> (we will configure the DSN later)</li>
<li>Steps 1 – 3 from <span style="text-decoration:underline;">Using the Excel Hive Add-In</span> to open the ports in Hadoop on Azure</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image7.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb7.png?w=428&#038;h=343" alt="image" width="428" height="343" border="0" /></a><br />
<span style="color:#ffffff;">.</span><br />
<span style="color:#ffffff;">.</span><br />
<strong>2) Create a Hive ODBC Data Source &gt; File DSN</strong></p>
<p>Here, we will go about creating a File DSN Hive ODBC Data Source.</p>
<blockquote><p>Thanks to <strong>Andrew Brust (@andrewbrust)</strong>, the better way to make a connection from PowerPivot to Hadoop on Azure is to create a File DSN.  This allows the full connection string to be stored directly within the PowerPivot workbook instead of relying on an existing DSN.</p></blockquote>
<p>To do this:</p>
<ul>
<li>Go to the <strong>ODBC Data Sources Administrator</strong> and click on the <strong>File DSN</strong> tab.</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image8.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb8.png?w=325&#038;h=297" alt="image" width="325" height="297" border="0" /></a></p>
<ul>
<li>Click <strong>Add</strong>, choose <strong>HIVE</strong>, click <strong>Next</strong>, click <strong>Browse</strong> to choose a location for the file, and then click <strong>Finish</strong>.</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image9.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb9.png?w=325&#038;h=258" alt="image" width="325" height="258" border="0" /></a></p>
<ul>
<li>Open the File DSN you just created and click <strong>Configure</strong>.  In the <strong>ODBC Hive Setup</strong> dialog, configure the host (e.g. [clustername].cloudapp.net) and authentication information (the username is the one you specified when you created the cluster).</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image10.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb10.png?w=323&#038;h=388" alt="image" width="323" height="388" border="0" /></a><br />
<span style="color:#ffffff;">.</span><br />
<span style="color:#ffffff;">.</span><br />
<strong>3) Connect PowerPivot to Hadoop on Azure via the HiveODBC File DSN</strong></p>
<ul>
<li>Open up the PowerPivot ribbon and click on <strong>Get External Data from Other Sources</strong>.</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image11.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb11.png?w=287&#038;h=160" alt="image" width="287" height="160" border="0" /></a></p>
<ul>
<li>From the <strong>Table Import Wizard</strong>, click on <strong>Others (OLEDB/ODBC)</strong> and click <strong>Next</strong>.</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image12.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb12.png?w=387&#038;h=320" alt="image" width="387" height="320" border="0" /></a></p>
<ul>
<li>From here, click <strong>Build</strong>. In the <strong>Data Link Properties</strong> dialog, click on <strong>Provider</strong> and ensure the <strong>Microsoft OLEDB Provider for ODBC Drivers</strong> is selected. Click <strong>Next</strong>.</li>
</ul>
<ul>
<li>In the <strong>Data Link Properties</strong> dialog, choose “<strong>Use connection string</strong>”, click <strong>Build</strong>, and choose the File DSN you created in Step #2.  Enter the password to your Hadoop on Azure cluster.  Click OK.</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image13.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb13.png?w=355&#038;h=412" alt="image" width="355" height="412" border="0" /></a></p>
<ul>
<li>The <strong>Data Link Properties</strong> dialog now contains a <em>connection string</em> to the Hadoop on Azure cluster.</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image14.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb14.png?w=264&#038;h=343" alt="image" width="264" height="343" border="0" /></a></p>
<blockquote><p>Note: after this dialog, verify that the password has been entered into the connection string that has been built into the <strong>Table Import Wizard</strong>.  In the screenshot below, the blue arrow points to a missing PWD=&lt;password&gt; clause.  If the password isn’t specified, make sure to add it back in.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image15.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb15.png?w=406&#038;h=289" alt="image" width="406" height="289" border="0" /></a></p></blockquote>
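<p>If you want to double-check a connection string outside of the wizard, a small Python sketch like the one below can test for a non-empty PWD value. Treat it as an illustration only: the sample strings are hypothetical (your actual connection string will look different), the helper name has_password_clause is made up for this example, and the pattern is a simple text check rather than a full ODBC connection string parser.</p>

```python
import re

# Matches a PWD key (any letter case) at the start of the string or right
# after a semicolon, double quote, or whitespace, followed by a non-empty value.
PWD_CLAUSE = re.compile(r'(^|[;"\s])PWD\s*=\s*[^;"\s]', re.IGNORECASE)


def has_password_clause(connection_string):
    """Return True if the string contains a PWD=... clause with a value."""
    return PWD_CLAUSE.search(connection_string) is not None


# Hypothetical sample strings, for illustration only.
WITH_PASSWORD = 'Provider=MSDASQL.1;Extended Properties="UID=demo;PWD=secret"'
WITHOUT_PASSWORD = 'Provider=MSDASQL.1;Extended Properties="UID=demo"'

if __name__ == "__main__":
    for sample in (WITH_PASSWORD, WITHOUT_PASSWORD):
        status = "has a password" if has_password_clause(sample) else "is missing PWD"
        print(status)
```

<p>An empty clause such as PWD= with nothing after it is treated as missing, since that is just as much of a problem as no clause at all.</p>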
<ul>
<li>Click OK, click Next.  From here you will get the <strong>Table Import Wizard</strong> and we are back to the usual PowerPivot steps.</li>
</ul>
<ul>
<li>Click on “Select from a list of tables and views to choose the data to import”</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image16.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb16.png?w=258&#038;h=308" alt="image" width="258" height="308" border="0" /></a></p>
<ul>
<li>Choose your table (e.g. hivesampletable) and import the data in.</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image17.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb17.png?w=437&#038;h=338" alt="image" width="437" height="338" border="0" /></a></p>
<p>It looks like a lot of steps, but once you get into the flow of things, it’s actually relatively easy.</p>
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/pass-2011-keynote-isotope_thumb.jpg" medium="image">
			<media:title type="html">PASS 2011 Keynote Isotope</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb7.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb8.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb9.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb10.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb11.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb12.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb13.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb14.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb15.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb16.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb17.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>Travel Tuesday: Top 5 reasons to go to Costco in Taiwan (from a US expat)</title>
		<link>http://dennyglee.com/2012/01/17/travel-tuesday-top-5-reasons-to-go-to-costco-in-taiwan-from-a-us-expat/</link>
		<comments>http://dennyglee.com/2012/01/17/travel-tuesday-top-5-reasons-to-go-to-costco-in-taiwan-from-a-us-expat/#comments</comments>
		<pubDate>Tue, 17 Jan 2012 15:00:33 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[Travel Tuesday]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=745</guid>
		<description><![CDATA[I love Pittsburgh, they put fries on nachos here. &#8211; Pete Lattimer, Warehouse 13. For those unfamiliar with the reference, Warehouse 13 is an awesome Syfy show…and Costco is a warehouse store – yeah, weak connection here. Yet another themed blog series Starting with the recent Foodie Friday blog [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://dennyglee.files.wordpress.com/2012/01/warehouse-13.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;float:left;padding-top:0;border:0;" title="Warehouse 13" src="http://dennyglee.files.wordpress.com/2012/01/warehouse-13_thumb.png?w=395&#038;h=283" alt="Warehouse 13" width="395" height="283" align="left" border="0" /></a></p>
<p><font color="#ffffff">.</font></p>
<p><font color="#ffffff">.</font></p>
<p><font color="#ffffff">.</font></p>
<blockquote><p>I love Pittsburgh, they put fries on nachos here.</p>
<p>&#8211; Pete Lattimer, Warehouse 13</p></blockquote>
<p><font color="#ffffff">.</font></p>
<p><font color="#ffffff">.</font></p>
<p><font color="#ffffff">.</font><br />
<em>For those unfamiliar with the reference, Warehouse 13 is an awesome Syfy show…and Costco is a warehouse store – yeah, weak connection here.</em></p>
<h3>Yet another themed blog series</h3>
<p>Starting with the recent Foodie Friday blog post (<a href="http://dennyglee.com/2012/01/13/foodie-friday-taiwanese-dessert-%e8%8a%8b%e5%9c%93/" target="_blank">Foodie Friday: Taiwanese dessert 芋圓</a>), I figured I should add another non-geek themed blog post series – Travel Tuesday – probably every week or two, eh?!</p>
<h3>Okay…so what’s this about Costco?</h3>
<p>Yeah, right – so back to the title of this post – what are the top 5 reasons to go to Costco in Taiwan (from a US ex-pat)?  The background is that my family and I are hanging out in Taiwan for the next few months – so expect lots of weird tips and foodie tips from Taiwan over the next few months, eh?!</p>
<p><strong>5) Well, it’s Costco after all!</strong></p>
<p>Where else can I buy enough toilet paper to survive the next ice age?  Or have such easy returns?  Or have actually good customer service?  But what’s great is that this is the same here in Taiwan too!</p>
<p>It’s still a huge warehouse with tonnes of the stuff that us ex-pats recognize but also plenty of stuff that’s made for the local market such as great Korean pears, Japanese quality fruit, etc.</p>
<p><strong>4) OMG &#8211; Spacious Parking!!</strong></p>
<p>If you drive around in Taiwan – almost anywhere in Taiwan that isn’t a highway – you are absolutely surrounded by a million scooters.  And if you don’t get a migraine from avoiding running over any of the scooters, parking in Taiwan…well, parking just sucks.  Small parking spaces, vehicles parking in places that just …aren’t parking spaces (e.g. in the middle of the road), …, ugh!</p>
<p>And at Costco – that’s just nice.  Wide lanes so two cars can actually comfortably fit, plenty of spots, and most importantly – spacious parking spots so I can actually park, get out of the car, and not worry that another car will block me from getting back into the car.</p>
<p><strong>3) “Reminds me” of Seattle</strong></p>
<p>And as many of you know, I’m based out of Seattle… and so is Costco! Costco’s home office is in Issaquah (suburb of Seattle). Costco’s Kirkland brand name is in homage to their original headquarters which was in Kirkland, WA (another suburb of Seattle).</p>
<p>So whenever we’re missing home – we just head off to Costco and we’re good to go!  Sort of reminiscent of Garbage’s Only Happy When it Rains!</p>
<div id="scid:5737277B-5D6D-4f48-ABFC-DD9C333F4C5D:39a7c3b9-73d6-49b3-a5e0-7a3a7a37e7b9" class="wlWriterEditableSmartContent" style="display:inline;float:none;margin:0;padding:0;">
<div><span style="text-align:center; display: block;"><a href="http://dennyglee.com/2012/01/17/travel-tuesday-top-5-reasons-to-go-to-costco-in-taiwan-from-a-us-expat/"><img src="http://img.youtube.com/vi/-aWcXlG1sgY/2.jpg" alt="" /></a></span></div>
</div>
<p><strong>2) The ability to bulk order scotch</strong></p>
<p>Yeah, that’s just cool!  ‘nuff said!</p>
<p>…</p>
<p>But the primary reason you want to go to Costco when in Taiwan (from a US-expat)</p>
<p><strong>1) Toilet Seat Covers!</strong></p>
<p>I don’t think I need (nor do you want me) to explain this one!</p>
<blockquote><p>Confucius says: Man who stand on toilet is high on pot!</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/17/travel-tuesday-top-5-reasons-to-go-to-costco-in-taiwan-from-a-us-expat/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/warehouse-13_thumb.png" medium="image">
			<media:title type="html">Warehouse 13</media:title>
		</media:content>
	</item>
		<item>
		<title>Sunny Sunday: Rattlesnake Ledge</title>
		<link>http://dennyglee.com/2012/01/15/sunny-sunday-rattlesnake-ledge/</link>
		<comments>http://dennyglee.com/2012/01/15/sunny-sunday-rattlesnake-ledge/#comments</comments>
		<pubDate>Sun, 15 Jan 2012 15:00:48 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[Sunny Sunday]]></category>
		<category><![CDATA[Outdoor]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=761</guid>
		<description><![CDATA[When you hike up to Rattlesnake Ridge, you get a nice 270° view of the Rattlesnake Ridge area.  For more info, check out: http://www.wta.org/go-hiking/hikes/rattle-snake-ledge &#8212; About “Sunny Sunday”: The Sunny Sunday blog posts are photos from various travel and/or outdoor (hiking) trips.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dennyglee.com&amp;blog=10400510&amp;post=761&amp;subd=dennyglee&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://dennyglee.files.wordpress.com/2012/01/026_23a.jpg"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="026_23A" src="http://dennyglee.files.wordpress.com/2012/01/026_23a_thumb.jpg?w=491&#038;h=349" alt="026_23A" width="491" height="349" border="0" /></a></p>
<p>When you hike up to Rattlesnake Ridge, you get a nice 270° view of the Rattlesnake Ridge area.  For more info, check out: <a title="http://www.wta.org/go-hiking/hikes/rattle-snake-ledge" href="http://www.wta.org/go-hiking/hikes/rattle-snake-ledge">http://www.wta.org/go-hiking/hikes/rattle-snake-ledge</a></p>
<p>&#8212;</p>
<p>About “Sunny Sunday”: The Sunny Sunday blog posts are photos from various travel and/or outdoor (hiking) trips.</p>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/15/sunny-sunday-rattlesnake-ledge/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/026_23a_thumb.jpg" medium="image">
			<media:title type="html">026_23A</media:title>
		</media:content>
	</item>
		<item>
		<title>Hadoop on Azure: HiveQL query against Azure Blob Storage</title>
		<link>http://dennyglee.com/2012/01/15/hadoop-on-azure-hiveql-query-against-azure-blob-storage/</link>
		<comments>http://dennyglee.com/2012/01/15/hadoop-on-azure-hiveql-query-against-azure-blob-storage/#comments</comments>
		<pubDate>Sun, 15 Jan 2012 11:44:22 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[BigData]]></category>
		<category><![CDATA[Azure]]></category>
		<category><![CDATA[Blobstore]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[hive]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=789</guid>
		<description><![CDATA[The posting Setup Azure Blob Store for Hadoop on Azure CTP provides a quick way to upload files to your Azure Blob storage account and connect Hadoop on Azure CTP to it.  Now that you have done that, one of the first things you may want to do is to interact with the data. To [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dennyglee.com&amp;blog=10400510&amp;post=789&amp;subd=dennyglee&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The posting Setup <a href="http://dennyglee.com/2012/01/15/setup-azure-blob-store-for-hadoop-on-azure-ctp/">Azure Blob Store for Hadoop on Azure CTP</a> provides a quick way to upload files to your Azure Blob storage account and connect Hadoop on Azure CTP to it.  Now that you have done that, one of the first things you may want to do is to interact with the data.</p>
<p>To do this, let’s create a Hive table within Hadoop on Azure CTP that is connected to the files you uploaded to your Azure Blob storage account and query it.  We will be referencing the scenario noted at: <a href="http://social.technet.microsoft.com/wiki/contents/articles/6628.aspx">Hadoop on Azure Scenario: Query a web log via HiveQL</a></p>
<p>The tasks we will be performing are:</p>
<ol>
<li>Setup Azure Blob Store for Hadoop on Azure CTP</li>
<li>Create a Hive table referencing the files in the Azure Blob Storage account</li>
<li>Execute a simple query</li>
</ol>
<p><strong>1) Setup Azure Blob Store for Hadoop on Azure CTP</strong></p>
<p>To do this, please refer to <a href="http://dennyglee.com/2012/01/15/setup-azure-blob-store-for-hadoop-on-azure-ctp/" target="_blank">Setup Azure Blob Store for Hadoop on Azure CTP</a></p>
<p><font color="#ffffff">.</font></p>
<p><strong>2) Create a Hive table referencing the files in the Azure Blob Storage account</strong></p>
<p>Following the <a href="http://social.technet.microsoft.com/wiki/contents/articles/6628.aspx">Hadoop on Azure Scenario: Query a web log via HiveQL</a> scenario</p>
<ul>
<li>Go to the Hadoop on Azure Interactive Hive Console</li>
<li>Create a Hive table using the statement below</li>
</ul>
<p><span style="font-family:Courier New;">CREATE EXTERNAL TABLE weblog_sample_asv (<br />
evtdate STRING,<br />
evttime STRING,<br />
svrsitename STRING,<br />
svrip STRING,<br />
csmethod STRING,<br />
csuristem STRING,<br />
csuriquery STRING,<br />
svrport INT,<br />
csusername STRING,<br />
cip STRING,<br />
UserAgent STRING,<br />
Referer STRING,<br />
scstatus STRING,<br />
scsubstatus STRING,<br />
scwin32status STRING,<br />
scbytes STRING,<br />
csbytes STRING,<br />
timetaken STRING<br />
)<br />
COMMENT 'This is a web log sample ASV'<br />
ROW FORMAT DELIMITED FIELDS TERMINATED BY '32'<br />
STORED AS TEXTFILE<br />
LOCATION<span style="color:#0000ff;"><strong> 'asv://weblog/sample'</strong></span>;</span></p>
<p>Note that the only difference between the original HiveQL script (which points to HDFS) and the one that points to Azure Blob storage is the highlighted LOCATION clause using the <em>asv</em> protocol.</p>
<blockquote><p>NOTE: As noted in <a href="http://dennyglee.com/2012/01/15/setup-azure-blob-store-for-hadoop-on-azure-ctp/" target="_blank">Setup Azure Blob Store for Hadoop on Azure CTP</a>, we are using the protocol of asv://&lt;container&gt;/&lt;folder&gt; so that it’s possible for Hadoop to view any and all files uploaded to the sample folder.</p></blockquote>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image5.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb5.png?w=541&#038;h=355" alt="image" width="541" height="355" border="0" /></a></p>
<p>&nbsp;</p>
<p><strong>3) Execute a simple query</strong></p>
<p>Now that you have created a Hive EXTERNAL table that points to the files located in the weblog/sample folder of your Azure Blob storage account, you can now query it.</p>
<p>The screenshot below shows the result of the query:</p>
<p><span style="font-family:Courier New;">select * from weblog_sample_asv limit 10;</span></p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image6.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb6.png?w=539&#038;h=354" alt="image" width="539" height="354" border="0" /></a></p>
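<p>As a follow-on sketch (not part of the original scenario), an aggregate query can be run against the same table.  The example below assumes that the <em>scstatus</em> column holds the HTTP status code of each request, which the column name suggests but this post does not confirm; the <em>hits</em> alias is purely illustrative.</p>
<p><span style="font-family:Courier New;">-- Count the requests per status code in the ASV-backed table<br />
SELECT scstatus, COUNT(*) AS hits<br />
FROM weblog_sample_asv<br />
GROUP BY scstatus;</span></p>
<p>Because the table’s LOCATION points at asv://weblog/sample, a query like this reads the files from the weblog/sample folder of your Azure Blob storage account rather than from HDFS.</p>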
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/15/hadoop-on-azure-hiveql-query-against-azure-blob-storage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb5.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb6.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
		<item>
		<title>Setup Azure Blob Store for Hadoop on Azure CTP</title>
		<link>http://dennyglee.com/2012/01/15/setup-azure-blob-store-for-hadoop-on-azure-ctp/</link>
		<comments>http://dennyglee.com/2012/01/15/setup-azure-blob-store-for-hadoop-on-azure-ctp/#comments</comments>
		<pubDate>Sun, 15 Jan 2012 11:16:01 +0000</pubDate>
		<dc:creator>dennyglee</dc:creator>
				<category><![CDATA[BigData]]></category>
		<category><![CDATA[Azure]]></category>
		<category><![CDATA[Blobstore]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Javascript]]></category>

		<guid isPermaLink="false">https://dennyglee.wordpress.com/?p=774</guid>
		<description><![CDATA[One of the cool ways to run Hadoop on Azure is to have it connect to Azure Blob storage via your Windows Azure Storage account.  To set up your Azure storage account, please refer to http://windows.azure.com. The tasks below will allow you to set up your Hadoop on Azure CTP account to connect to an existing Azure [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=dennyglee.com&amp;blog=10400510&amp;post=774&amp;subd=dennyglee&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>One of the cool ways to run Hadoop on Azure is to have it connect to Azure Blob storage via your Windows Azure Storage account.  To set up your Azure storage account, please refer to <a href="http://windows.azure.com">http://windows.azure.com</a>. The tasks below will allow you to set up your Hadoop on Azure CTP account to connect to an existing Azure Blob Storage account using the <em>asv</em> protocol.  For example, within Hadoop, you normally would get a listing of files within HDFS using the command line interface:</p>
<p><span style="font-family:Courier New;">hadoop fs -ls /</span></p>
<p>In the case of accessing files within Azure Blob storage, you can run the command:</p>
<p><span style="font-family:Courier New;">hadoop fs -ls asv://&lt;container&gt;/&lt;folder&gt;</span></p>
<p>The basic steps are:</p>
<ol>
<li>Obtain the Azure Blobstore Storage Account Name and Access Key.</li>
<li>Set up ASV connection between Hadoop on Azure CTP and your Windows Azure Blob Storage account.</li>
<li>Upload files to your Azure Blob Storage account</li>
</ol>
<p><strong>1) Obtain the Azure Blobstore Storage Account Name and Access Key</strong></p>
<p>Access your Azure Blobstore Storage account through the Windows Azure Platform dashboard via <a href="http://windows.azure.com/">http://windows.azure.com/</a>.  From here, the navigation path is [Hosted Services, Storage Accounts &amp; CDN] (bottom left) –&gt; [Storage Accounts] (mid-top left).</p>
<ul>
<li>The blobstore account name is the Storage Account listed under the subscription in the middle pane.  In this case, I have a storage account called isocatstore.</li>
<li>To get the access key, click on the storage account in question and then click the [View] button in the properties pane on the right.</li>
</ul>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image1.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb1.png?w=476&#038;h=462" alt="image" width="476" height="462" border="0" /></a></p>
<p>&nbsp;</p>
<p><strong>2)</strong> <strong>Set up ASV connection between Hadoop on Azure CTP and your Windows Azure Blob Storage account.</strong></p>
<p>From the Hadoop on Azure CTP portal page, click on the [Manage Data] tile.  From here, click on the [Set up ASV] button on the right.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/manage-data.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="Manage Data" src="http://dennyglee.files.wordpress.com/2012/01/manage-data_thumb.png?w=444&#038;h=217" alt="Manage Data" width="444" height="217" border="0" /></a></p>
<p>From here, you can supply the credentials of your Azure Blob Storage account that you had obtained in Step 1.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image2.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border-width:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb2.png?w=436&#038;h=186" alt="image" width="436" height="186" border="0" /></a></p>
<p>Click on [Save Settings] and you are good to go.</p>
<p>&nbsp;</p>
<p><strong>3) Upload files to your Azure Blob Storage account</strong></p>
<p>A great way to upload files to your Azure Blob Storage account is to use CloudXplorer – you can download it from here: <a title="http://clumsyleaf.com/products/cloudxplorer" href="http://clumsyleaf.com/products/cloudxplorer">http://clumsyleaf.com/products/cloudxplorer</a></p>
<blockquote><p><strong>NOTE: </strong>When you upload the files, please make sure to place them in a folder within a container of your blobstore account.  It is important to do this so that Hadoop will be able to list all of the files within the folder instead of you needing to access each file individually (which is what would happen if you placed the files directly within the container).</p></blockquote>
<p>From CloudXplorer, you can quickly create a container and a folder; in this case, I had created the <strong>weblog</strong> container and the <strong>sample</strong> folder.</p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image3.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb3.png?w=428&#038;h=258" alt="image" width="428" height="258" border="0" /></a></p>
<p>Using the intuitive UI, copy your files from your local box to the Azure Blob Storage account.</p>
<p>By doing it in this fashion, you will be able to get a listing of your files from the Hadoop command line interface using the command:</p>
<p><span style="font-family:Courier New;">hadoop fs -ls asv://weblog/sample</span></p>
<p>As well, from the Hadoop on Azure JavaScript Interface, you can view a listing of files using the command</p>
<p><span style="font-family:Courier New;">#ls asv://weblog/sample</span></p>
<p><a href="http://dennyglee.files.wordpress.com/2012/01/image4.png"><img style="background-image:none;padding-left:0;padding-right:0;display:inline;padding-top:0;border:0;" title="image" src="http://dennyglee.files.wordpress.com/2012/01/image_thumb4.png?w=502&#038;h=203" alt="image" width="502" height="203" border="0" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://dennyglee.com/2012/01/15/setup-azure-blob-store-for-hadoop-on-azure-ctp/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/250a51c7bd19002fe660942b887c149d?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">dennyglee</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb1.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/manage-data_thumb.png" medium="image">
			<media:title type="html">Manage Data</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb2.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb3.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>

		<media:content url="http://dennyglee.files.wordpress.com/2012/01/image_thumb4.png" medium="image">
			<media:title type="html">image</media:title>
		</media:content>
	</item>
	</channel>
</rss>
