This section is a quick 'fact sheet' in a Q&A format.
Hadoop is an open source software stack that runs on a cluster of machines. Hadoop provides distributed storage and distributed processing for very large data sets.
Sure, Hadoop and Big Data are all the rage now. But Hadoop solves a real problem, and it is a safe bet that it is here to stay.
Below is a graph of Hadoop job trends from Indeed.com. As you can see, demand for Hadoop skills has been rising steadily since 2009. So Hadoop is a good skill to have!
A hands-on developer or admin can learn Hadoop. The following list is a start, in no particular order:
The following should give you an idea of the kinds of technical roles in Hadoop.
Table 6.1. Hadoop Roles
| Job Type | Job functions | Skills |
|---|---|---|
| Hadoop Developer | Develops MapReduce jobs, designs data warehouses | Java, scripting, Linux |
| Hadoop Admin | Manages Hadoop clusters, designs data pipelines | Linux administration, network management, experience managing large clusters of machines |
| Data Scientist | Data mining and uncovering hidden knowledge in data | Math, data mining algorithms |
| Business Analyst | Analyzes data! | Pig, Hive, strong SQL, familiarity with other BI tools |
Yes, you don't need to write Java MapReduce code to extract data out of Hadoop. You can use Pig or Hive instead. Both offer a 'high level' abstraction over MapReduce. For example, you can query Hadoop using SQL-like statements in Hive.
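For instance, a Hive query reads almost exactly like plain SQL, and Hive compiles it into MapReduce jobs behind the scenes. This is only a sketch: the `web_logs` table and its columns are hypothetical, assumed here just for illustration.

```sql
-- Hypothetical table: web_logs(page STRING, user_id STRING, ts BIGINT)
-- Find the ten most-visited pages.
-- Hive turns this GROUP BY / ORDER BY into one or more MapReduce jobs.
SELECT page, COUNT(*) AS hits
FROM web_logs
GROUP BY page
ORDER BY hits DESC
LIMIT 10;
```

The same idea applies to Pig, where the query would be written as a short Pig Latin script instead of SQL; in both cases no Java is required.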
Hadoop development tools are still evolving. Here are a few: