Quick Tech Tip: SETting Cloudera Hue Beeswax to create a compressed Hive table

I’m currently playing with CDH 4.1 and having fun with Hue, specifically Beeswax, which lets you execute Hive queries from a nice web UI. As noted in Hadoop compression codecs and optimizing Hive joins (and using compression to do it), compression saves storage space and in many cases can improve query performance. Yet to my dismay, when I tried to execute a bunch of SET statements, I ended up getting an “OK FAILED” parse exception.

[Screenshot: Beeswax UI ParseException error]

Of course, this is what happens when you haven’t played with a particular tech in a while and don’t bother to do the tutorials! On the left side of Beeswax, there is a Settings panel that lets you add whatever key-value settings you see fit (with autofill for various, but not all, settings).

set mapred.compress.map.output=true;
set mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec; 
set hive.exec.compress.output=true;

In this case, I entered the settings directly into the Settings panel and then ran my Hive queries to create my compressed table (don’t forget to create the table as a SEQUENCEFILE).
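For illustration, here is a minimal sketch of what such a query might look like once the three settings are in the Settings panel. The table names weblogs and weblogs_compressed are hypothetical placeholders, not the tables I actually used, so swap in your own.

```sql
-- Hypothetical names: weblogs is an existing uncompressed table,
-- weblogs_compressed is the new table to create.
-- Assumes the compression settings listed above are already
-- entered as key-value pairs in the Beeswax Settings panel.
CREATE TABLE weblogs_compressed
STORED AS SEQUENCEFILE
AS
SELECT * FROM weblogs;
```

The STORED AS SEQUENCEFILE clause is the part that is easy to forget. Without it, the new table falls back to the default file format.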

[Screenshot: Beeswax UI q]

Hope this helps!
