Dashboard Engine for Hadoop

Think Big Introduces Dashboard Engine for Fast Access to Hadoop Data


Today, Think Big, a Teradata company, introduces Dashboard Engine for Hadoop.  As the name suggests, it is a backend component for building dashboards from data in your Hadoop cluster.  The Dashboard Engine gives you fast access to Hadoop data so that you can make quicker business decisions.  We saw the need for such a product after watching many of our customers try to build dashboards using the SQL-on-Hadoop technologies available in the Hadoop ecosystem, such as Impala, Presto and Spark SQL.  These products are terrific, and in many situations can answer questions in a second or two.  But as “engines” for deploying a dashboard, they fall woefully short.


Impala, Hive, Presto and Spark SQL achieve their performance by keeping as much of the underlying source data in RAM as possible and by distributing both data and computation to nodes throughout the cluster.  However, the data volumes of today’s enterprises are growing far faster than RAM prices are dropping, so holding that data in memory becomes steadily harder to afford.


Why You Need the Dashboard Engine for Hadoop


Consider for a moment the example of a high-traffic website. Let’s suppose you want to know how much traffic has grown year over year.  You may want your dashboard to present the number of unique users, or the number of page views, for this year compared with last year.


Performing this analysis requires scanning two years’ worth of data.  A sufficiently large Hadoop cluster may be able to handle one such query with acceptable latency, but having multiple people fire off similar queries at the same time would quickly bring your dashboard to its knees.


The Dashboard Engine for Hadoop takes a different approach.  Instead of answering questions at query time in a compute-intensive manner, it pre-computes, at regular intervals (e.g. hourly, or some other defined timeframe), the values of all possible aggregates that may be used to answer these questions.
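To make the idea of pre-computing aggregates concrete, here is a minimal sketch in Python. It is not the Dashboard Engine's implementation. The event layout, the two dimensions (country and device) and the function name precompute_hourly are all assumptions chosen for illustration. It reads the events once and, for each hour, counts page views for every combination of dimensions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical page-view events: (hour, country, device).
EVENTS = [
    ("2016-03-01T10", "US", "mobile"),
    ("2016-03-01T10", "US", "desktop"),
    ("2016-03-01T10", "DE", "mobile"),
    ("2016-03-01T11", "US", "mobile"),
]
# Illustrative dimensions; a real deployment would define its own.
DIMENSIONS = ("country", "device")


def precompute_hourly(events):
    """Scan the events once and count page views per hour for every
    combination of dimensions, including the overall total."""
    counts = defaultdict(int)
    for hour, country, device in events:
        values = {"country": country, "device": device}
        # Each event contributes to one aggregate per subset of dimensions.
        for size in range(len(DIMENSIONS) + 1):
            for dims in combinations(DIMENSIONS, size):
                key = (hour, tuple((d, values[d]) for d in dims))
                counts[key] += 1
    return dict(counts)


aggregates = precompute_hourly(EVENTS)
```

Every question a dashboard might ask about these two dimensions at hourly granularity then becomes a key lookup rather than a scan. For example, aggregates[("2016-03-01T10", (("country", "US"),))] holds the page views from the US in that hour.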


Pre-computed answers are stored in HBase, where they can be rapidly retrieved and served via an API.  If the user wants to know the value of a metric for a longer period, like a day or a year, the API can retrieve the stored values that fall within that period and combine them into the value for the longer period, all in milliseconds.
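The roll-up from stored interval values to a longer period can be sketched as follows. This is an illustration under stated assumptions, not the product's API. A plain Python dict stands in for the HBase table, and the row-key layout "metric|hour" and the function name rollup are hypothetical.

```python
# Stand-in for the HBase table: row key -> pre-computed hourly value.
# The "metric|YYYY-MM-DDTHH" key layout is an assumption for this sketch.
STORE = {
    "page_views|2016-03-01T22": 120,
    "page_views|2016-03-01T23": 80,
    "page_views|2016-03-02T00": 50,
    "page_views|2016-03-02T01": 70,
}


def rollup(store, metric, start_hour, end_hour):
    """Sum the hourly values of `metric` over [start_hour, end_hour).

    Keys sort lexicographically by time, so a period corresponds to a
    contiguous key range, much like a range scan over sorted row keys.
    """
    lo = f"{metric}|{start_hour}"
    hi = f"{metric}|{end_hour}"
    return sum(value for key, value in store.items() if lo <= key < hi)


# Page views for the whole of 2016-03-01, built from hourly values.
daily = rollup(STORE, "page_views", "2016-03-01T00", "2016-03-02T00")
```

A full year at hourly granularity is only 8,760 values per aggregate, so this kind of roll-up touches very little data compared with rescanning the source. Note that plain summation only suits additive metrics such as page views. Unique-user counts cannot simply be added across hours, and how the Dashboard Engine handles them is not covered here.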


Dashboard Engine for Hadoop can compute and store billions of pre-computed values in minutes on a small cluster.  No matter how many aggregates are being pre-computed, the Dashboard Engine for Hadoop only has to scan the source data once.  Dashboard users can now get almost immediate answers to questions involving deep dimensional drill-down, even when there are hundreds of simultaneous users.


Dashboard Engine for Hadoop currently supports dashboards built with either JavaScript or Tableau, with other BI tools to follow soon.  When using Tableau with the Dashboard Engine, there is no middle-tier server required and no need to ever do an extract.  Instead, queries go directly to HBase using a live connection.  This approach provides a significant advantage: there is no limit on the amount of data that the Tableau dashboard can query!


With Dashboard Engine for Hadoop, enterprises can report on big data sets that don’t reside in the corporate data warehouse. They can do so quickly, at scale, and with very few limitations.


To find out more about the new Think Big Dashboard Engine for Hadoop, you can read the announcement under Think Big industry news.
