Updated HDInsight on Azure ASV paths for multiple storage accounts

If you’ve joined the HDInsight Preview, you will notice many changes, including the tight integration with Windows Azure and the fact that HDInsight defaults to ASV.  As noted in Why use Blob Storage with HDInsight on Azure, there are some interesting technical (performance) and business reasons for utilizing Azure storage accounts. But if you had been playing with the HadoopOnAzure.com beta and switched over to the Windows Azure HDInsight Service Preview, you may have noticed a change in the way ASV paths work.  Here’s a quick cheat sheet for you.

In general, to access ASV sources, use the full path form:

#ls asv://$container$@$storage_account$.blob.core.windows.net/$path$

The exception is the default container, which was created when you originally set up your cluster.  For example, my storage account is “doctorwho” and the container (which is the name of my HDInsight cluster) is “caprica” (Yes, I’m mixing Battlestar Galactica and Doctor Who; deal with it!):

#ls asv://caprica@doctorwho.blob.core.windows.net/

Yet because this is also the default container / storage account, you can also just go:

#ls /

If you want to access another container in the same storage account, you’ll have to specify the full path.  For example, if I wanted to access the muir folder of the rainier container in my doctorwho account:

#ls asv://rainier@doctorwho.blob.core.windows.net/muir

As well, if you want to access a completely separate storage account, you can follow the same path format, provided you have specified the account information within core-site.xml (more info below).  For example, if I wanted to access the frisbee folder of the ultimate container in my riversong account:

#ls asv://ultimate@riversong.blob.core.windows.net/frisbee
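If you end up building these paths in scripts, the pattern above can be sketched as a small Python helper. This is just an illustration: the function name `asv_uri` is my own invention and not part of any HDInsight tooling; it only does string formatting following the template shown earlier.

```python
def asv_uri(container: str, account: str, path: str = "") -> str:
    """Build an ASV path: asv://<container>@<account>.blob.core.windows.net/<path>.

    Hypothetical helper for illustration only; it formats a string and does
    not check that the container or storage account actually exists.
    """
    # Drop any leading slash so we never emit a double slash after the host.
    return f"asv://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# The three examples from this post:
print(asv_uri("caprica", "doctorwho"))             # default container root
print(asv_uri("rainier", "doctorwho", "muir"))     # other container, same account
print(asv_uri("ultimate", "riversong", "frisbee")) # separate storage account
```

The printed strings are the same paths passed to #ls above.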

Note, for the above to work, you will need to modify your core-site.xml and add an fs.azure.account.key.$full account path$ property.  The template would look like:

<property>
  <name>fs.azure.account.key.$account$.blob.core.windows.net</name>
  <value>$account-key$</value>
</property>

For my riversong account, it would look like:

<property>
  <name>fs.azure.account.key.riversong.blob.core.windows.net</name>
  <value>$riversong-account-key$</value>
</property>
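If you have several storage accounts to register, you could generate these property fragments rather than typing them by hand. Here is a minimal sketch using Python’s standard xml.etree.ElementTree module; the function name `account_key_property` is hypothetical, and you would still paste the result inside the existing configuration element of core-site.xml yourself.

```python
import xml.etree.ElementTree as ET


def account_key_property(account: str, key: str) -> str:
    """Return a core-site.xml <property> fragment for one storage account.

    Hypothetical helper for illustration; it builds the XML text only and
    does not touch core-site.xml or validate the key.
    """
    prop = ET.Element("property")
    # Property name follows the template: fs.azure.account.key.<account>.blob.core.windows.net
    ET.SubElement(prop, "name").text = (
        f"fs.azure.account.key.{account}.blob.core.windows.net"
    )
    ET.SubElement(prop, "value").text = key
    return ET.tostring(prop, encoding="unicode")


# Placeholder key, as in the example above; substitute your real account key.
print(account_key_property("riversong", "$riversong-account-key$"))
```

The output is the same property as the riversong example, just on a single line.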

Enjoy!
