If you’ve joined the HDInsight Preview, you will notice many changes, including the tight integration with Windows Azure and the fact that HDInsight now defaults to ASV. As noted in Why use Blob Storage with HDInsight on Azure, there are some interesting technical (performance) and business reasons for using Azure storage accounts. But if you had been playing with the HadoopOnAzure.com beta and switched over to the Windows Azure HDInsight Service Preview, you may have noticed a change in the way asv paths work. Here’s a quick cheat sheet for you.
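In general, to access ASV sources, you spell out the container, the storage account, and the path in the URI, in the form asv://$container$@$storage account$.blob.core.windows.net/$path$.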
The exception is the default container, which was created when you originally set up your cluster. For example, my storage account is “doctorwho” and the container (which is the name of my HDInsight cluster) is “caprica” (yes, I’m mixing Battlestar Galactica and Doctor Who; deal with it!), so the fully qualified path to the root of the default container is asv://caprica@doctorwho.blob.core.windows.net/.
Yet because this is also the default container and storage account, you can skip the prefix and just use a plain path from the root, such as / (for example, hadoop fs -ls /).
If you want to access another container in the same storage account, you’ll have to specify the full URI. For example, to access the rainier container, muir folder in my doctorwho account, the path is asv://rainier@doctorwho.blob.core.windows.net/muir.
Likewise, if you want to access a completely separate storage account, you can use the same URI form, provided you have specified the account information in core-site.xml (more info below). For example, to access the ultimate container, frisbee folder in my riversong account, the path is asv://ultimate@riversong.blob.core.windows.net/frisbee.
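To keep the three cases above straight, here is a minimal Python sketch that assembles the fully qualified paths. The helper names `asv_uri` and `hadoop_ls` are hypothetical (they are not part of HDInsight or Hadoop), and the URI layout assumes the asv://container@account.blob.core.windows.net/path form described in this post.

```python
def asv_uri(container: str, account: str, path: str = "") -> str:
    """Build a fully qualified ASV URI (hypothetical helper, for illustration only)."""
    # Assumption: storage accounts are addressed as <account>.blob.core.windows.net
    host = f"{account}.blob.core.windows.net"
    # Strip any leading slash so we never emit a double slash after the host
    return f"asv://{container}@{host}/{path.lstrip('/')}"


def hadoop_ls(uri: str) -> str:
    """Return the 'hadoop fs -ls' command line for a URI (it is not executed here)."""
    return f"hadoop fs -ls {uri}"


# Default container (caprica) in the default storage account (doctorwho)
default_root = asv_uri("caprica", "doctorwho")

# Another container in the same storage account: rainier container, muir folder
rainier_muir = asv_uri("rainier", "doctorwho", "muir")

# A completely separate storage account: ultimate container, frisbee folder
ultimate_frisbee = asv_uri("ultimate", "riversong", "frisbee")

if __name__ == "__main__":
    for uri in (default_root, rainier_muir, ultimate_frisbee):
        print(hadoop_ls(uri))
```

The printed commands can be pasted into the Hadoop command line on the cluster; the last one will only work once the riversong account key has been registered in core-site.xml.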
Note that for the above to work, you will need to modify your core-site.xml and add a property named fs.azure.account.key.$full account path$, whose value is the access key for that storage account.
For my riversong account, the property name would be fs.azure.account.key.riversong.blob.core.windows.net.
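As a rough sketch of what that core-site.xml entry could look like, the Python snippet below generates the property element for a given account. The function name `account_key_property` is hypothetical, the key is a placeholder rather than a real value, and the `<property>`/`<name>`/`<value>` layout assumes the standard Hadoop configuration file format.

```python
import xml.etree.ElementTree as ET


def account_key_property(account: str, key: str) -> str:
    """Render a core-site.xml <property> for a storage account key (illustrative only)."""
    prop = ET.Element("property")
    # Assumption: the "full account path" is <account>.blob.core.windows.net
    name = f"fs.azure.account.key.{account}.blob.core.windows.net"
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = key
    # encoding="unicode" makes tostring return a str instead of bytes
    return ET.tostring(prop, encoding="unicode")


# Placeholder key: substitute the real access key for the storage account
snippet = account_key_property("riversong", "YOUR_STORAGE_ACCOUNT_KEY")

if __name__ == "__main__":
    print(snippet)
```

The resulting element would go inside the existing configuration element of core-site.xml, alongside the entry for the default storage account.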