This chapter explores some of the challenges a company faces in adopting Hadoop.
Hadoop is a relatively new technology, and as with adopting any new technology, finding people who know it is difficult!
Hadoop was designed to solve the Big Data problems encountered by web and social media companies. In the process, many features that enterprises need or want were put on the back burner. For example, HDFS does not offer native support for security and authentication; out of the box, it simply trusts whatever user identity the client asserts.
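To see what this means in practice, here is a minimal sketch using Hadoop's Java client API. With Kerberos disabled (the default 'simple' authentication mode), HDFS accepts whatever username the client presents, so the code below can act as any user. The NameNode address and the username "alice" are hypothetical.

import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class SimpleAuthDemo {
    public static void main(String[] args) throws Exception {
        // In 'simple' auth mode, HDFS trusts whatever identity the
        // client asserts -- no password or ticket is ever checked.
        UserGroupInformation ugi =
            UserGroupInformation.createRemoteUser("alice"); // hypothetical user

        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            Configuration conf = new Configuration();
            // hypothetical NameNode address
            conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

            FileSystem fs = FileSystem.get(conf);
            // We can now list "alice"'s files without ever having
            // proven that we are alice.
            for (FileStatus status : fs.listStatus(new Path("/user/alice"))) {
                System.out.println(status.getPath());
            }
            return null;
        });
    }
}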
The development and administration tools for Hadoop are still fairly new. Companies like Cloudera, Hortonworks, MapR and Karmasphere have been working on this issue. However, the tooling may not be as mature as what enterprises are used to (say, Oracle's administration tools).
Hadoop runs on 'commodity' hardware. But these are not cheap, throwaway machines; they are server-grade hardware. For more details, see Chapter 14, Hardware and Software for Hadoop.
So standing up a reasonably large Hadoop cluster, say 100 nodes, will cost a significant amount of money. For example, if a Hadoop node costs $5,000, a 100-node cluster would cost $500,000 for hardware alone.
Solving problems using MapReduce requires a different way of thinking. Engineering teams generally need additional training to take full advantage of Hadoop.
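As an illustration of that different way of thinking, here is a minimal sketch of the classic word-count job written against Hadoop's Java MapReduce API. Instead of a simple loop over the data, the logic is decomposed into a map phase that emits (word, 1) pairs and a reduce phase that sums them; the framework handles distribution, sorting and grouping in between.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: for each input line, emit (word, 1) for every word.
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reduce phase: sum the counts emitted for each word.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}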
Hadoop version 1 had a single point of failure: the NameNode. There was only one NameNode per cluster, and if it went down, the whole Hadoop cluster became inoperable. This has prevented the use of Hadoop for mission-critical, always-up applications.
This problem is more pronounced on paper than in practice. Yahoo did a study of its Hadoop cluster failures and found that only a tiny fraction were caused by the NameNode.
TODO: link
However, this problem is being addressed by various Hadoop providers. See the ??? chapter for more details.
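As one concrete example of the direction these fixes take, HDFS High Availability (introduced in Hadoop 2) runs an active/standby pair of NameNodes behind a single logical cluster name, so clients fail over automatically. Below is a minimal client-side sketch of that configuration in Java; the host names and the cluster name "mycluster" are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Clients refer to a logical cluster name instead of a
        // single NameNode host.
        conf.set("fs.defaultFS", "hdfs://mycluster");
        conf.set("dfs.nameservices", "mycluster");
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        // hypothetical host names for the active/standby pair
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "nn1.example.com:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "nn2.example.com:8020");
        // Proxy provider that fails over between the two NameNodes.
        conf.set("dfs.client.failover.proxy.provider.mycluster",
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

        // If nn1 dies, the client transparently retries against nn2.
        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.getUri());
    }
}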