Note, the title of this post (and for that matter, most of this post) is a direct quote from Kovas Boguta’s post: Hadoop & Startups: Where Open Source Meets Business Data.
Amy: Please tell me you have a plan.
The Doctor: No, I have a thing. It’s like a plan, but with more greatness
For starters, I highly suggest you read this insightful article about Hadoop and Big Data by Kovas Boguta: Hadoop & Startups: Where Open Source Meets Business Data.
It is a great article, and I would also like to call out that this new way of thinking is not just for startups. It makes just as much sense for established enterprises that want to do the same thing: make sense of a massive volume of data using a cluster of commodity-class hardware.
While Hadoop may be the technology du jour, it is actually quite powerful, and it is impressive how robustly map/reduce solves some pretty complex problems. Even if you feel there are better algorithms to solve these problems, I will borrow Werner Vogels’s (@Werner) tweet:
scaling data systems in real life has humbled me. I would not dare criticize an architecture that holds the social graphs of 750M and works
It’s not just about the technology
Yet with all of Hadoop’s capabilities, the movement isn’t just about the technology itself. With students, engineers and companies building up their Hadoop know-how, there are some key facets that Boguta covers:
- It’s all about the data – as noted in my own post “The potential of Big Data”, it’s becoming ever more apparent that the real IP isn’t the technology but the data.
- The Big Data movement in general (which is synonymous with Hadoop) is the golden child of OSS achievement – getting many different engineers with very different backgrounds working together to solve some of the most complex problems.
- There are so many amazing offshoots from this Big Data movement – Cassandra, CouchDB, MongoDB, Membase – just to name a few. Heck, there is a great presentation on how Netflix migrated from its datacenter Oracle to global Cassandra (Slides).
Again, Boguta sums up this movement best:
In short, Hadoop has the potential to make the enterprise compatible with the entire rest of the open-source and startup world…
“It’s like a plan, but with more greatness”
I want to add more, but Boguta’s post says it more eloquently than I ever would:
This illustrates a new thesis or collective wisdom emerging from the Valley: If a technology is not your core value-add, it should be open-sourced because then others can improve it, and potential future employees can learn it. This rising tide has lifted all boats, and is just getting started.
Why am I excited about Hadoop and Big Data even though I have been a Microsoft BI person for most of my career? Because first and foremost, BI is all about making sense of the information. And the greatness of Big Data isn’t just about exploring, understanding, and asking even more questions of this information, but about doing it in a distributed fashion (vs. in silos) and putting more emphasis on the data (i.e. this is where the real IP is).
The quote “It’s like a plan, but with more greatness” is from the Doctor Who (Season 5) episode “Vincent and the Doctor”. For those of you who are Vincent van Gogh fans (or art fans in general), the episode offers an enjoyable and creative interpretive peek into Van Gogh’s depiction of a starry night and the world around him.
We’re so lucky we’re still alive to see this beautiful world. Look at the sky. It’s not dark and black and without character. The black is in fact deep blue. And over there! Lighter blue. And blowing through the blueness and the blackness, the winds swirling through the air. And there shining, burning, bursting through, the stars! Can you see how they roll their light? Everywhere we look, complex magic of nature blazes before our eyes!