Understanding Multidimensional Databases Skip Navigation
Essbase® Analytic Services Database Administrator's Guide | Update Contents | Previous | Next | Print | ? |
Information Map

Understanding Multidimensional Databases


Essbase Analytic Services contains multidimensional databases that support analysis and management reporting applications. This chapter discusses multidimensional concepts and terminology.

Note: The information in this chapter is designed for block storage databases. Some of the information is not relevant to aggregate storage databases. For detailed information on the differences between aggregate and block storage, see Comparison of Aggregate and Block Storage.

This chapter contains the following topics:

OLAP and Multidimensional Databases

Online analytical processing (OLAP) is a multidimensional, multi-user, client-server computing environment for users who need to analyze enterprise data. OLAP applications span a variety of organizational functions. Finance departments use OLAP for applications such as budgeting, activity-based costing (allocations), financial performance analysis, and financial modeling. Sales departments use OLAP for sales analysis and forecasting. Among other applications, marketing departments use OLAP for market research analysis, sales forecasting, promotions analysis, customer analysis, and market/customer segmentation. Typical manufacturing OLAP applications include production planning and defect analysis.

Important to all of the applications mentioned in the previous paragraph is the ability to provide managers with the information they need to make effective decisions about an organization's strategic directions. A successful OLAP application provides information as needed, that is, it provides "just-in-time" information for effective decision-making.

Providing "just-in-time" information requires more than a base level of detailed data. "Just-in-time" information is computed data that usually reflects complex relationships and is often calculated on the fly. Analyzing and modeling complex relationships are practical only if response times are consistently short. In addition, because the nature of data relationships may not be known in advance, the data model must be flexible. A truly flexible data model ensures that OLAP systems can respond to changing business requirements as needed for effective decision making.

Although OLAP applications are found in widely divergent functional areas, they all require the following key features:

Key to OLAP systems are multidimensional databases. Multidimensional databases not only consolidate and calculate data; they also provide retrieval and calculation of a variety of data subsets. A multidimensional database supports multiple views of data sets for users who need to analyze the relationships between data categories. For example, a marketing analyst might want answers to the following questions:

With a multidimensional database, the number of data views is limited only by the database outline, the structure that defines all elements of the database. Users can pivot the data to see information from a different viewpoint, drill down to find more detailed information, or drill up to see an overview.

Dimensions and Members

This section introduces the concepts of outlines, dimensions and members within a multidimensional database. If you understand dimensions and members, you are well on your way to understanding the power of a multidimensional database.

A dimension represents the highest consolidation level in the database outline. The database outline presents dimensions and members in a tree structure to indicate a consolidation relationship. For example, in Figure 2, Time is a dimension and Qtr1 is a member.

Analytic Services has two types of dimensions: standard dimensions and attribute dimensions.

Standard dimensions represent the core components of a business plan and often relate to departmental functions. Typical standard dimensions are Time, Accounts, Product Line, Market, and Division. Dimensions change less frequently than members.

Attribute dimensions are a special type of dimension that are associated with standard dimensions. Through attribute dimensions, you group and analyze members of standard dimensions based on the member attributes (characteristics). For example, you can compare the profitability of non-caffeinated products that are packaged in glass to the profitability of non-caffeinated products that are packaged in cans.

Members are the individual components of a dimension. For example, Product A, Product B, and Product C might be members of the Product dimension. Each member has a unique name. A dimension can contain an unlimited number of members. Analytic Services can store the data associated with a member (referred to as a stored member in this chapter) or it can dynamically calculate the data when a user retrieves it.

Outline Hierarchies

All Analytic Services database development begins with creating a database outline. A database outline accomplishes the following:

Analytic Services uses the concept of members to represent data hierarchies. Each dimension consists of one or more members. The members, in turn, may consist of other members. When you create a dimension, you tell Analytic Services how to consolidate the values of its individual members. Within the tree structure of the database outline, a consolidation is a group of members in a branch of the tree.

For example, many businesses summarize their data monthly, roll up the monthly data to obtain quarterly figures, and roll up the quarterly data to obtain annual figures. Businesses may also summarize data by zip code, by city, state, and country. Any dimension can be used to consolidate data for reporting purposes.

In the Sample Basic database included with Analytic Server, for example, the Year dimension consists of five members: Qtr1, Qtr2, Qtr3, and Qtr4, each storing data for an individual quarter, plus Year, storing summary data for the entire year. Qtr1 consists of four members: Jan, Feb, and Mar, each storing data for an individual month, plus Qtr1, storing summary data for the entire quarter. Likewise, Qtr2, Qtr3, and Qtr4 consist of the members that represent the individual months plus the member that stores the quarterly totals.

The database outline in Figure 2 uses a hierarchical structure to represent the data consolidations and relationships in Qtr1.

Figure 2: Hierarchical Structure

Some dimensions consist of relatively few members, while others may have hundreds or even thousands of members. Analytic Services does not limit the number of members within a dimension and enables the addition of new members as needed.

Dimension and Member Relationships

Analytic Services uses the terms defined in the following sections to describe a database outline. These terms are used throughout Analytic Services documentation.

Analytic Services uses hierarchical and family history terms to describe the roles and relationships of the members in an outline. You can describe the position of the members of the branches in Figure 3 in several ways.

Figure 3: Member Generation and Level Numbers

Parents, Children, and Siblings

Figure 3 illustrates the following parent, child, and sibling relationships:

Descendants and Ancestors

Figure 3 illustrates the following descendant and ancestral relationships:

Roots and Leaves

Figure 3 illustrates the following root and leaf member relationships:

Generations and Levels

Figure 3 illustrates the following generations levels:

Generation and Level Names

To make your reports easier to maintain, you can assign a name to a generation or level and then use the name as a shorthand for all members in that generation or level. Because changes to an outline are automatically reflected in a report, when you use generation and level names, you do not need to change the report if a member name is changed or deleted from the database outline.

Standard Dimensions and Attribute Dimensions

Analytic Services has two types of dimensions: standard dimensions and attribute dimensions. This chapter primarily considers standard dimensions because Analytic Services does not allocate storage for attribute dimension members. Instead it dynamically calculates the members when the user requests data associated with them.

An attribute dimension is a special type of dimension that is associated with a standard dimension. For comprehensive discussion of attribute dimensions, see Working with Attributes.

Sparse and Dense Dimensions

Most data sets of multidimensional applications have two characteristics:

Analytic Services maximizes performance by dividing the standard dimensions of an application into two types: dense dimensions and sparse dimensions. This division allows Analytic Services to cope with data that is not smoothly distributed. Analytic Services speeds up data retrieval while minimizing the memory and disk requirements.

Most multidimensional databases are inherently sparse: they lack data values for the majority of member combinations. A sparse dimension is a dimension with a low percentage of available data positions filled.

For example, the Sample Basic database shown in Figure 6 includes the Year, Product, Market, Measures, and Scenario dimensions. Product represents the product units, Market represents the geographical regions in which the products are sold, and Measures represents the accounts data. Because not every product is sold in every market, Market and Product are chosen as sparse dimensions.

Most multidimensional databases also contain dense dimensions. A dense dimension is a dimension with a high probability that one or more data points is occupied in every combination of dimensions. For example, in the Sample Basic database, accounts data exists for almost all products in all markets, so Measures is chosen as a dense dimension. Year and Scenario are also chosen as dense dimensions. Year represents time in months, and Scenario represents whether the accounts values are budget or actual values.

In Sample Basic Database Outline, Caffeinated, Intro Date, Ounces, and Pkg Type are attribute dimensions that are associated with the Product dimension. Population is an attribute dimension that is associated with the Market dimension. Members of attribute dimensions describe characteristics of the members of the dimensions with which they are associated. For example, each product has a size in ounces. Attribute dimensions are always sparse dimensions and must be associated with a sparse standard dimension. Analytic Services does not store the data for attribute dimensions, Analytic Services dynamically calculates the data when a user retrieves it. For a comprehensive discussion about attribute dimensions, see Working with Attributes.

Figure 6: Sample Basic Database Outline

Selection of Dense and Sparse Dimensions

In most data sets, existing data tends to follow predictable patterns of density and sparsity. If you match patterns correctly, you can store the existing data in a reasonable number of fairly dense data blocks, rather than in many highly sparse data blocks.

Analytic Services can make recommendations for the sparse-dense configuration of dimensions based on the following factors:

You can apply a recommended configuration or you can turn off automatic configuration and manually set the sparse or dense property for each dimension. Attribute dimensions are always sparse dimensions. Keep in mind that you can associate attribute dimensions only with sparse standard dimensions.

Note: The automatic configuration of dense and sparse dimensions provides only an estimate. It cannot take into account the nature of the data you will load into your database or multiple user considerations.

Dense-Sparse Configuration for Sample Basic

Consider the Sample Basic database that is provided with Analytic Services. The Sample Basic database represents data for The Beverage Company (TBC).

TBC does not sell every product in every market; therefore, the data set is reasonably sparse. Data values do not exist for many combinations of members in the Product and Market dimensions. For example, if Caffeine Free Cola is not sold in Florida, then data values do not exist for the combination Caffeine Free Cola (100-30)->Florida.

However, consider combinations of members in the Year, Measures, and Scenario dimensions. Data values almost always exist for some member combinations on these dimensions. For example, data values exist for the member combination Sales->January->Actual because at least some products are sold in January.

The sparse-dense configuration of the standard dimensions in the Sample Basic database may be summarized as follows:

Analytic Services creates a data block for each unique combination of members in the Product and Market dimensions (for more information on data blocks, see Data Storage). Each data block represents data from the dense dimensions. The data blocks are likely to have few empty cells.

For example, consider the sparse member combination Caffeine Free Cola (100-30), New York, illustrated by Figure 7:

Dense and Sparse Selection Scenario

Consider a database with four standard dimensions: Time, Accounts, Region, and Product. In the following example, Time and Accounts are dense dimensions, and Region and Product are sparse dimensions.

The two-dimensional data blocks shown in Figure 8 represent data values from the dense dimensions: Time and Accounts. The members in the Time dimension are J, F, M, and Q1. The members in the Accounts dimension are Rev, Exp, and Net.

Figure 8: Two-dimensional Data Block for Time and Accounts

Analytic Services creates data blocks for combinations of members in the sparse standard dimensions (providing at least one data value exists for the member combination). The sparse dimensions are Region and Product. The members of the Region dimension are East, West, South, and Total US. The members in the Product dimension are Product A, Product B, Product C, and Total Product.

Figure 9 shows 11 data blocks. No data values exist for Product A in the West and South, for Product B in the East and West, and for Product C in the East. Therefore, Analytic Services has not created data blocks for these member combinations. The data blocks that Analytic Services has created have very few empty cells.

Figure 9: Data Blocks Created for Sparse Members on Region and Product

This example effectively concentrates all the sparseness into the index and concentrates all the data into fully utilized blocks. This configuration provides efficient data storage and retrieval.

Now consider a reversal of the dense and sparse dimension selections. In the following example, Region and Product are dense dimensions, and Time and Accounts are sparse dimensions.

As shown in Figure 10, the two-dimensional data blocks represent data values from the dense dimensions: Region and Product.

Figure 10: Two-Dimensional Data Block for Region and Product

Analytic Services creates data blocks for combinations of members in the sparse standard dimensions (providing at least one data value exists for the member combination). The sparse standard dimensions are Time and Accounts.

Figure 11 shows 12 data blocks. Data values exist for all combinations of members in the Time and Accounts dimensions; therefore, Analytic Services creates data blocks for all the member combinations. Because data values do not exist for all products in all regions, the data blocks have many empty cells. Data blocks with many empty cells store data inefficiently.

Figure 11: Data Blocks Created for Sparse Members on Time and Accounts

Data Storage

This topic describes how data is stored in a multidimensional database. Each data value is stored in a single cell in the database. You refer to a particular data value by specifying its coordinates along each standard dimension.

Note: Analytic Services does not store data for attribute dimensions. Analytic Services dynamically calculates attribute dimension data when a user retrieves the data.

Consider the simplified database shown in Figure 12.

Figure 12: A Multidimensional Database Outline

This database has three dimensions: Accounts, Time, and Scenario:

Data Values

The intersection of one member from one dimension with one member from each of the other dimensions represents a data value. The example in Figure 13 has three dimensions; thus, the dimensions and data values in the database can be represented in a cube.

Figure 13: Three-Dimensional Database

The shaded cells in Figure 14 illustrate that when you specify Sales, you are specifying the portion of the database containing eight Sales values.

Figure 14: Sales Slice of the Database

Slicing a database amounts to fixing one or more dimensions at a constant value while allowing the other dimensions to vary.

When you specify Actual Sales, you are specifying the four Sales values where Actual and Sales intersect as shown by the shaded area in Figure 15.

Figure 15: Actual, Sales Slice of the Database

A data value is stored in a single cell in the database. To refer to a specific data value in a multidimensional database, you specify its member on each dimension. In Figure 16, the cell containing the data value for Sales, Jan, Actual is shaded. The data value can also be expressed using the cross-dimensional operator (->) as Sales -> Actual -> Jan.

Figure 16: Sales ->  Jan ->  Actual Slice of the Database

Data Blocks and the Index System

Analytic Services uses two types of internal structures to store and access data: data blocks and the index system.

Analytic Services creates a data block for each unique combination of sparse standard dimension members (providing that at least one data value exists for the sparse dimension member combination). The data block represents all the dense dimension members for its combination of sparse dimension members.

Analytic Services creates an index entry for each data block. The index represents the combinations of sparse standard dimension members. It contains an entry for each unique combination of sparse standard dimension members for which at least one data value exists.

For example, in the Sample Basic database outline shown in Figure 17, Product and Market are sparse dimensions.

Figure 17: Product and Market Dimensions from the Sample Basic Database

If data exists for Caffeine Free Cola in New York, then Analytic Services creates a data block and an index entry for the sparse member combination of Caffeine Free Cola (100-30) -> New York. If Caffeine Free Cola is not sold in Florida, then Analytic Services does not create a data block or an index entry for the sparse member combination of Caffeine Free Cola (100-30) -> Florida.

The data block Caffeine Free Cola (100-30) -> New York represents all the Year, Measures, and Scenario dimensions for Caffeine Free Cola (100-30) -> New York.

Each unique data value can be considered to exist in a cell in a data block. When Analytic Services searches for a data value, it uses the index to locate the appropriate data block.Then, within the data block, it locates the cell containing the data value. The index entry provides a pointer to the data block. The index handles sparse data efficiently because it includes only pointers to existing data blocks.

Figure 18 shows part of a data block for the Sample Basic database. Each dimension of the block represents a dense dimension in the Sample Basic database: Time, Measures, and Scenario. A data block exists for each unique combination of members of the Product and Market sparse dimensions (providing that at least one data value exists for the combination).

Figure 18: Part of a Data Block for the Sample Basic Database

Each data block is a multidimensional array that contains a fixed, ordered location for each possible combination of dense dimension members. Accessing a cell in the block does not involve sequential or index searches. The search is almost instantaneous, resulting in optimal retrieval and calculation speed.

Analytic Services orders the cells in a data block according to the order of the members in the dense dimensions of the database outline.

A (Dense)
   a1
   a2
B (Dense)
   b1
       b11
       b12
   b2
       b21
       b22
C (Dense)
   c1
   c2
   c3
D (Sparse)
   d1
   d2
       d21
       d22
E (Sparse)
   e1
   e2
   e3 
 

The block in Figure 19 represents the three dense dimensions from within the combination of the sparse members d22 and e3 in the preceding database outline. In Analytic Services, member combinations are denoted by the cross-dimensional operator. The symbol for the cross-dimensional operator is ->. So d22, e3 is written d22 -> e3. A, b21, c3 is written A -> b21 -> c3.

Figure 19: Data Block Representing Dense Dimensions for d22 -> e3

Analytic Services creates a data block for every unique combination of the members of the sparse dimensions D and E (providing that at least one data value exists for the combination).

Data blocks, such as the one shown in Figure 19, may include cells that do not contain data values. A data block is created if at least one data value exists in the block. Analytic Services compresses data blocks with missing values on disk, expanding each block fully as it brings the block into memory. Data compression is optional, but is enabled by default. For more information, see Data Compression.

By carefully selecting dense and sparse standard dimensions, you can ensure that data blocks do not contain many empty cells, minimizing disk storage requirements and improving performance.

Multiple Data Views

A multidimensional database supports multiple views of data sets for users who need to analyze the relationships between data categories. Slicing the database in different ways gives you different perspectives of the data. The slice of January in Figure 20, for example, examines all data values for which the Year dimension is fixed at Jan.

Figure 20: Data for January

The slice in Figure 21 shows data for the month of February:

Figure 21: Data for February

The slice in Figure 22 shows data for profit margin:

Figure 22: Data for Profit Margin



Hyperion Solutions Corporation link