By Harpreet Saini, TGP Technology Programmer
TGP Associates is working with TxVia, a prepaid card processing company, to
develop reporting solutions. TGP reviewed a couple of reporting platforms
before deciding on Jaspersoft, a platform that can connect to a wide variety
of data sources. Choosing the right data source for your data is very
important.
OLAP Data Source
An OLAP data source is a wise choice for business data that is expected to
grow very rapidly. A report loses its value if a sales manager has to wait 15
minutes for a single number computed from, say, 1 million rows of data.
Mondrian - A BI Engine
Jaspersoft has a built-in Business Intelligence engine in the form of
Mondrian. Mondrian is a ROLAP engine: the first time data is requested, it
processes a large amount of data and stores the result in cache. When the same
data is requested later, it pulls the result from cache and displays it in a
fraction of a second. It follows that Mondrian can take a long time to display
data the first time if it has to scan a huge dataset to answer a request. This
is where aggregate tables (also known as summary tables) come into the
picture.
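The cache-on-first-access behavior described above can be sketched in a few
lines of Python. This is only an illustration of the idea, not Mondrian's
actual implementation; the class name CachingEngine, the method total_for, and
the sample (region, amount) rows are all hypothetical.

```python
class CachingEngine:
    """Toy stand-in for an engine that caches aggregated results.

    Hypothetical illustration only; Mondrian's real cache is more elaborate.
    """

    def __init__(self, rows):
        self.rows = rows    # list of (region, amount) tuples
        self.cache = {}     # region -> previously computed total
        self.scans = 0      # number of full scans performed so far

    def total_for(self, region):
        # Later requests for the same region are answered from the cache.
        if region in self.cache:
            return self.cache[region]
        # First request: scan every row, then remember the result.
        self.scans += 1
        total = sum(amount for r, amount in self.rows if r == region)
        self.cache[region] = total
        return total


# One million made-up rows, mirroring the example size used in this article.
rows = [("east" if i % 2 == 0 else "west", i % 10) for i in range(1_000_000)]
engine = CachingEngine(rows)

first = engine.total_for("east")   # slow path: scans all rows
second = engine.total_for("east")  # fast path: served from the cache
```

Only the first call pays for the full scan, which is why the size of the
dataset behind that first request matters so much.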
Aggregate Tables
Aggregate tables pre-aggregate the measures, so that when Mondrian would
otherwise have to scan a huge dataset to compute something, it can take a
quicker route and read the data from the aggregate tables instead. An
aggregate table may be 1/50th the size of the original data source. However,
choosing your aggregate tables is not a small decision. It takes some work to
study the kind of data that is expected to be pulled from the OLAP data source
and to figure out which level you want to aggregate your data to. Some tools
can suggest aggregate tables based on two factors: how much time you have
available for the tables to be updated, and the limit on the number of
aggregate tables that you want.
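As a rough sketch of what pre-aggregation means, the example below uses
Python's built-in sqlite3 module to build a small fact table and a summary
table rolled up to the region level. The table and column names (sales_fact,
agg_sales_by_region, fact_count) and the data are made up for illustration;
how the aggregate table is registered with Mondrian is outside the scope of
this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical fact table: one row per sale.
conn.execute(
    "CREATE TABLE sales_fact (sale_date TEXT, region TEXT, amount INTEGER)"
)
rows = [
    (f"2024-01-{(i % 28) + 1:02d}", "east" if i % 2 == 0 else "west", i % 10)
    for i in range(10_000)
]
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)

# Aggregate (summary) table: the measure is pre-summed per region.
# fact_count records how many fact rows were rolled into each summary row.
conn.execute(
    """
    CREATE TABLE agg_sales_by_region AS
    SELECT region, SUM(amount) AS amount, COUNT(*) AS fact_count
    FROM sales_fact
    GROUP BY region
    """
)

# The same question answered from the fact table and from the summary table.
fact_total = conn.execute(
    "SELECT SUM(amount) FROM sales_fact WHERE region = 'east'"
).fetchone()[0]
agg_total = conn.execute(
    "SELECT amount FROM agg_sales_by_region WHERE region = 'east'"
).fetchone()[0]

fact_rows = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
agg_rows = conn.execute("SELECT COUNT(*) FROM agg_sales_by_region").fetchone()[0]
```

Both queries return the same total, but the second reads a table with one row
per region instead of one row per sale. The trade-off is that the summary
table only answers questions at the region level or coarser, which is why the
choice of aggregation level deserves care.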
Reports and dashboards built on an OLAP data source with appropriate
aggregate tables and cache size are very quick and handy. They are the first
thing every manager wants to see about his or her business in the morning.