OLAP and Data Warehousing - Data in Jail

Wednesday May 15th 2002 by Alexzander Nepomnjashiy

Alexzander Nepomnjashiy continues his analysis of OLAP and Data Warehousing with a discussion of 'Data In Jail' crises.

This is the follow-up to the first part of my OLAP and Data Warehousing (The Problem and Solution) series, which is dedicated to the discussion of OLAP and Data Warehousing technologies.

The lack of global economic stability over the last few years has increased competition and raised the importance of making correct decisions for a company to operate successfully. Modern business requires, at a minimum, the following from the top managers responsible for decision making:

  • Extensive knowledge of the company's most significant clients;
  • Knowledge of:
    • Last 2... 5... 10...;
    • Most important strategies;
    • Most successful solutions;
    • Unsuccessful bargains;
  • Analysis of income levels;
  • Analysis of sales dynamics.
The modern manager who wants to achieve success needs to understand completely the nature of the processes he or she controls. By understand, I mean understand in time: not after financial results have stubbornly demonstrated an inefficiency, but while it is still possible to make the process effective. The only way to control and analyze company activity efficiently (based on available data and information rather than on suppositions that are frequently far from the truth) is to ask about the values of particular metrics and to try to answer the question: « Why are those so important? ».

None of this knowledge can be obtained without analyzing the large volumes of raw data accumulated over the company's history. Of course, you might ask whether every organization actually has such data to analyze. The answer is most likely YES!

In my experience, almost all organizations have, for historical reasons, ended up using a variety of record-keeping systems that differ in:

  • Technologies, databases, and systems;
  • Specializations:
    • Accounting;
    • Economic;
    • Administrative systems;
    • . . .
However, the use of such separate systems was, as a rule, limited to describing company activity in isolated fields, with each record kept in its own separate system. Such systems belong to the class of online transaction processing (OLTP) systems. You may also find them under another name, operating analysis, which underlines their main role: storing raw data. Raw in this case means data at a very low level of abstraction.

Obtaining information from such poorly integrated systems and then analyzing it would demand huge and completely unjustified amounts of time from the very people whose time and compensation are most expensive (i.e. top managers). I dare say that none of them would risk undertaking this task.

So what we observe is a crisis of the operating analysis, the so-called DIJ (Data In Jail) crisis. Despite the abundance of data, the people responsible for decision-making are unable to extract information from it and lack the knowledge they need about the processes around them, mainly because they deal with overly detailed, disconnected data that represents isolated company activities.

Thus, at the current technological level, the degree of automation of the analysis and decision-making process is extremely low. No wonder, then, that some trends remain either unnoticed or misunderstood.

So the operating analysis, which should in fact be a dynamic, iterative, and continuous process of critically examining company activity, is in reality carried out only from time to time (and most likely by the company's IT staff).

Correct administrative decisions are impossible without the information, usually quantitative, on which they depend. This is the purpose of Data Warehouse (DW) creation: the process of collecting, sifting, and preprocessing raw data into information suitable for statistical analysis (and frequently also for analytical reporting).
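The three steps named above (collecting, sifting, preprocessing) can be illustrated with a minimal Python sketch. It is not taken from any particular DW product: the two source systems, their field names, and all values are hypothetical, and a real pipeline would read from actual databases rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical raw records from two isolated systems (all names and values are made up).
accounting_rows = [
    {"client": "Acme ", "amount": "1200.50", "date": "2002-04-03"},
    {"client": "Globex", "amount": "n/a", "date": "2002-04-10"},  # unusable amount
    {"client": "acme", "amount": "300", "date": "2002-05-01"},
]
sales_rows = [
    {"customer_name": "ACME", "total": 450.0, "sold_on": "2002-05-07"},
    {"customer_name": "Globex", "total": 980.0, "sold_on": "2002-05-09"},
]


def collect():
    """Collect: bring both sources into one common record shape."""
    for row in accounting_rows:
        yield {"client": row["client"], "amount": row["amount"], "date": row["date"]}
    for row in sales_rows:
        yield {"client": row["customer_name"], "amount": row["total"], "date": row["sold_on"]}


def sift(records):
    """Sift: drop records whose amount cannot be parsed as a number."""
    for rec in records:
        try:
            amount = float(rec["amount"])
        except (TypeError, ValueError):
            continue
        yield {**rec, "amount": amount}


def preprocess(records):
    """Preprocess: normalize client names and total the amounts per client and month."""
    totals = defaultdict(float)
    for rec in records:
        client = rec["client"].strip().title()
        month = rec["date"][:7]  # "YYYY-MM"
        totals[(client, month)] += rec["amount"]
    return dict(totals)


monthly_income = preprocess(sift(collect()))
```

Here `monthly_income` maps a (client, month) pair to a total, which is the kind of higher-level figure a decision maker can read directly, unlike the individual raw records it was built from.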

Ralph Kimball, one of the pioneers of DW concepts, describes DW as "a place where people may gain access to the data" (according to Ralph Kimball, "The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses", John Wiley and Sons, 1996 and "The Data Webhouse Toolkit: Building the Web-Enabled Data Warehouse", John Wiley and Sons, 2000). Kimball also formulates the main requirements of DW:

  • High speed data acquisition support (from the DW storage);
  • Internal data consistency checking support;
  • Ability to obtain and compare so-called "cuts of data" ("slice and dice" procedure);
  • Availability of convenient utilities for reviewing the data in DW storage;
  • Completeness and reliability of the stored data;
  • Support for a high-quality process of adding data.
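To make the "slice and dice" requirement more concrete, here is a small Python sketch using the terms in their common OLAP sense: a slice fixes one dimension to a single value, and a dice restricts several dimensions to sets of values. The dimensions (region, product, quarter), the helper names, and the figures are all assumptions made for illustration.

```python
# Hypothetical fact rows: (region, product, quarter, sales). Values are made up.
facts = [
    ("North", "Widgets", "Q1", 100),
    ("North", "Gadgets", "Q1", 150),
    ("South", "Widgets", "Q1", 80),
    ("North", "Widgets", "Q2", 120),
    ("South", "Gadgets", "Q2", 200),
]
DIMENSIONS = ("region", "product", "quarter")


def slice_cube(rows, dimension, value):
    """Slice: fix a single dimension to one value."""
    idx = DIMENSIONS.index(dimension)
    return [row for row in rows if row[idx] == value]


def dice_cube(rows, **allowed):
    """Dice: keep rows whose dimensions fall within the allowed sets of values."""
    def keep(row):
        return all(row[DIMENSIONS.index(dim)] in values for dim, values in allowed.items())
    return [row for row in rows if keep(row)]


def total_sales(rows):
    """Sum the sales measure (the last column) over the given rows."""
    return sum(row[3] for row in rows)


q1_slice = slice_cube(facts, "quarter", "Q1")
north_widgets = dice_cube(facts, region={"North"}, product={"Widgets"})
```

Comparing `total_sales(q1_slice)` with `total_sales(north_widgets)` is a toy version of matching two cuts of the same data.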
A typical DW, as a rule, differs from an ordinary relational OLTP database in the following ways:
  • OLTP databases are intended to help users perform daily business operations, whereas DW is intended for decision making. For example, a sale of goods is recorded in a database intended for transaction processing, whereas the analysis of sales dynamics is carried out with the help of DW;
  • OLTP databases are subject to constant changes during daily user operation, but DW is rather stable: DW data is usually updated according to a schedule (for example, weekly, daily or hourly, depending on business requirements). Ideally, adding data to DW should be a simple addition of new data (from different operational systems / sources) for a defined period of time, without changing the information already placed in storage;
  • OLTP databases serve as data sources for DW.
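The scheduled, append-only loading described in the list above might look roughly like the following Python sketch. It uses two in-memory SQLite databases as stand-ins for a real OLTP source and a real DW; the table names, column names, and the `load_period` helper are hypothetical, and the weekly schedule is only one of the options the text mentions.

```python
import sqlite3

# In-memory databases stand in for a real OLTP source and a real DW (assumption).
oltp = sqlite3.connect(":memory:")
dw = sqlite3.connect(":memory:")

oltp.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, sold_on TEXT, amount REAL)")
oltp.executemany(
    "INSERT INTO sales (sold_on, amount) VALUES (?, ?)",
    [("2002-05-06", 100.0), ("2002-05-08", 250.0), ("2002-05-14", 75.0)],
)
dw.execute("CREATE TABLE sales_fact (source_id INTEGER PRIMARY KEY, sold_on TEXT, amount REAL)")


def load_period(start, end):
    """Append OLTP rows with start <= sold_on < end to the DW; never update or delete."""
    rows = oltp.execute(
        "SELECT id, sold_on, amount FROM sales WHERE sold_on >= ? AND sold_on < ?",
        (start, end),
    ).fetchall()
    # INSERT OR IGNORE keeps a repeated load of the same period from duplicating rows.
    dw.executemany("INSERT OR IGNORE INTO sales_fact VALUES (?, ?, ?)", rows)
    dw.commit()
    return len(rows)  # number of source rows read for the period


load_period("2002-05-06", "2002-05-13")  # first weekly load
load_period("2002-05-13", "2002-05-20")  # next weekly load
load_period("2002-05-13", "2002-05-20")  # accidental rerun adds nothing new

row_count, total_amount = dw.execute(
    "SELECT COUNT(*), SUM(amount) FROM sales_fact"
).fetchone()
```

The OLTP table keeps changing with daily operation, while the DW table only ever receives new rows for a defined period; analysis queries such as the final `SELECT` run against the DW rather than the transactional source.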

