Chapter 13.  Business Intelligence Tools For Hadoop and Big Data

Table of Contents

13.1. The case for BI Tools
13.2. BI Tools Feature Matrix Comparison
13.3. Glossary of terms

13.1.  The case for BI Tools

Analytics for Hadoop can be done by the following:

  • Writing custom MapReduce code in Java, Python, R, etc.
  • Writing high-level Pig scripts
  • Writing SQL queries with Hive
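
To give a sense of the first option, here is a minimal sketch in Python of the map and reduce steps for counting page views per user. It runs locally on an in-memory list rather than on a Hadoop cluster, and the log format, function names and sample data are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical input: one "user<TAB>url" record per line.
SAMPLE_LOG = [
    "alice\t/home",
    "bob\t/home",
    "alice\t/cart",
]


def map_record(line):
    """Map step: emit a (user, 1) pair for each log record."""
    user, _url = line.split("\t")
    yield user, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each user."""
    totals = defaultdict(int)
    for user, count in pairs:
        totals[user] += count
    return dict(totals)


def count_views(lines):
    """Run the map step over every line, then reduce the results."""
    pairs = (pair for line in lines for pair in map_record(line))
    return reduce_counts(pairs)


print(count_views(SAMPLE_LOG))
```

Even for a question this simple, the mapping and reducing logic has to be written by hand. In Hive the same question would typically be a single GROUP BY query.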

However, doing analytics this way can be tedious and time consuming. Business intelligence tools (BI tools for short) can address this problem.

BI tools have been around since before Hadoop. Some are generic; others are specific to a certain domain (e.g. telecom, health care, etc.). BI tools provide a rich, user-friendly environment to slice and dice data, and most of them have nice GUI environments as well.

13.2.  BI Tools Feature Matrix Comparison

Since Hadoop is gaining popularity as a data silo, a lot of BI tools have added support for Hadoop. In this chapter we will look into some BI tools that work with Hadoop.

We are trying to present the capabilities of BI tools in an easy-to-compare feature matrix format. This is a 'living' document. We will keep it updated as new versions and new features surface.

This matrix is under construction.

How to read the matrix?
Y - feature is supported
N - feature is NOT supported
? or empty - unknown

Read the legend for feature descriptions.

Table 13.1. BI Tools Comparison : Data Access and Management

BI tool | Access raw data on Hadoop | Manage data on HDFS | Import/Export data into/out of HDFS | Transparent compression | Data retention | Data validation
Datameer Y Y Y Y Y
Tableau Y
Pentaho Y

Table 13.2. BI Tools Comparison : Analytics

BI tool | Pre-built analytics | Predictive analytics | Time series forecasting | Recommendation engine | Analytics app store
Datameer Y Y Y
Tableau Y
Pentaho Y Y Y

Table 13.3. BI Tools Comparison : Visualizing

BI tool | Visual query designer | Rich widgets | Multiple platforms (web, mobile) | Interactive dashboards | Share with others | Local rendering
Datameer Y Y Y Y
Tableau Y Y Y Y Y Y
Pentaho Y Y Y Y

Table 13.4. BI Tools Comparison : Connectivity

BI tool | Hadoop | HBase | Cloudera Impala | Cassandra | MongoDB | Relational databases | Vertica | Teradata / Aster | Netezza | SAP HANA | Amazon Redshift
Datameer Y Y Y Y Y
Tableau Y Y
Pentaho Y Y Y Y Y Y Y Y

Table 13.5. BI Tools Comparison : Misc

BI tool | Security | Role based permissions | Supports multiple Hadoop distributions | Supports Hadoop on cloud | Hosted analytics | Free evaluation
Datameer | Y (LDAP, Active Directory, Kerberos) | Y | Y | Y (Amazon) | N | Y
Tableau | Y (LDAP, Active Directory, Kerberos) | | | | |
Pentaho | Y (LDAP, Active Directory, Kerberos) | | | | |

13.3.  Glossary of terms

Data validation

Can validate that data conforms to certain limits, and can do cleansing and de-duping.
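
As a rough illustration of what such a feature does, the sketch below keeps only records whose value falls inside a given range and drops exact repeats. It is not how any particular BI tool implements validation; the record layout (`id`, `value`) and the function name are assumptions made for this example.

```python
def validate_and_dedupe(records, low, high):
    """Keep records whose 'value' lies in [low, high], dropping duplicates."""
    seen = set()
    clean = []
    for record in records:
        value = record.get("value")
        # Validation: reject missing or out-of-range values.
        if value is None or not (low <= value <= high):
            continue
        # De-duping: skip a record if the same (id, value) was already kept.
        key = (record["id"], value)
        if key in seen:
            continue
        seen.add(key)
        clean.append(record)
    return clean


# Hypothetical sample data: one duplicate, one out of range, one missing.
readings = [
    {"id": 1, "value": 20},
    {"id": 1, "value": 20},
    {"id": 2, "value": 500},
    {"id": 3, "value": None},
]

print(validate_and_dedupe(readings, 0, 100))
```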

Share with others

Can share the results easily with others inside or outside the organization (think of sharing a document on Dropbox or Google Drive).

Local rendering

You can slice and dice data locally on a computer or tablet. This uses the CPU power of the device and doesn't need a round trip to a 'server' to process results. This can speed up ad-hoc data exploration.

Analytics 'app store'

The platform allows customers to buy third-party analytics apps, much like the Apple App Store.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.