Understanding Data Loading and Dimension Building Skip Navigation
Essbase® Analytic Services Database Administrator's Guide | Update Contents | Previous | Next | Print | ? |
Information Map

Understanding Data Loading and Dimension Building


An Analytic Services database contains dimensions, members, and data values.

This chapter describes the components involved in loading data values and loading dimensions and members-data sources and rules files. This chapter contains the following sections:

Some rules file options and data source requirements vary for aggregate storage databases. See Preparing Aggregate Storage Databases for information about aggregate storage differences.

Process for Data Loading and Dimension Building

To load data values or dimensions and members into an Analytic Services database, follow these steps:

  1. Set up the data source.

    If you are not using a rules file, you must set up the data source outside Analytic Services. For information on data sources, see Data Sources.

  2. If necessary, set up the rules file.

    For a definition and discussion of rules files, see Rules Files.

  3. Perform the data load or dimension build.

    For a comprehensive discussion of how to load data and members, see Performing and Debugging Data Loads or Dimension Builds.

Data Sources

Data sources contain the information that you want to load into the Analytic Services database. A data source can contain data values; information about members, such as member names, member aliases, formulas and consolidation properties; generation and level names; currency name and category; data storage properties; attributes; and UDAs (user-defined attributes).

The following sections describe the components of any kind of data source.

Supported Data Sources

Analytic Services supports the following types of data sources:

Note: If you are using Administration Services Console to load data or build an outline, spreadsheet files are not supported if the Administration Server is installed on a computer with a UNIX operating system.

Items in a Data Source

As illustrated in Figure 85, a data source is composed of records, fields, and field delimiters. A record is a structured row of related fields. A field is an individual value. A delimiter indicates that a field is complete and that the next item in the record is another field.

Analytic Services reads data sources starting at the top and proceeding from left to right.

Figure 85: Records and Fields

As illustrated in Figure 86, data sources can contain dimension fields, member fields, member combination fields, and data fields.

Figure 86: Kinds of Fields

Data fields are used only for data loading; dimension builds ignore data fields. The following sections describe each item in a data source:

Valid Dimension Fields

In a data load, every dimension in the Analytic Services database must be specified in the either the data source or the rules file. If the data source does not identify every dimension in the database, you must identify the missing dimensions in a rules file. For example, the Sample Basic database has a dimension for Year. If several data sources arrive with monthly numbers from different regions, the month itself may not be specified in the data sources. You must specify the month in the data source header or the rules file. For information on setting header records, see Defining Header Records.

A dimension field must contain a valid dimension name. If you are not performing a dimension build, the dimension must already exist in the database. If you are performing a dimension build, the dimension name can be new, but the new name must be specified in the rules file.

Valid Member Fields

A member field can contain the name of a valid member or an alias. In Figure 86, for example, Texas and Ohio are valid members of the Market dimension. Analytic Services must know how to map each member field of the data source to a member of the database.

In order to be valid, a member field must meet the following criteria:

Valid Data Fields

If you are performing a dimension build, you can skip this section. Data fields are ignored during a dimension build.

Either the data source or the rules file must contain enough information for Analytic Services to determine where to put each data value. A data field contains the data value for its intersection in the database. In Figure 86, for example, 42 is a data field. It is the dollar sales of 100-10 (Cola) in Texas in January.

In a data field, Analytic Services accepts numbers and their modifiers, with no spaces or separators between them, and the text strings #MI and #MISSING.


Valid Modifiers
Examples

Currency symbols:

  • Dollar $

  • Euro

  • Yen ¥

$12 is a valid value.

$ 12 is not a valid value because there is a space between the dollar sign and the 12.

Parentheses around numbers to indicate a negative number

(12)

Minus sign before numbers. Minus signs after numbers are not valid.

-12

Decimal point

12.3

Large numbers with or without commas

Both 1,345,218 and 1345218 are valid values.

#MI or #MISSING to represent missing or unknown values

You must insert #MI or #MISSING into any data field that has no value. If you do not, the data source may not load correctly. For instructions on how to replace a blank field with #MI or #MISSING, see Replacing an Empty Field with Text.



If the data source contains a member field for every dimension and one field that contains data values, you must define the field that contains data values as a data field in the rules file. To read Figure 88 into the Sample Basic database, for example, define the last field as a data field.

Figure 88: Setting Data Fields

Jan    Cola    East    Sales    Actual    100
Feb    Cola    East    Sales    Actual    200 
 

To define a data field, see "Defining a Column as a Data Field" in Essbase Administration Services Online Help.

If there is no value in the data field (or the value is #MISSING), Analytic Services does not change the existing data value in the database. Analytic Services does not replace current values with empty values.

Note: If the data source contains blank fields for data values, replace them with #MI or #MISSING. Otherwise, the data may not load correctly. For instructions on how to replace a blank field with #MI or #MISSING, see Replacing an Empty Field with Text.

Valid Delimiters

You must separate fields from each other with delimiters. If you are loading data without a rules file, you must uses spaces to delimit fields.

If you are using a rules file, delimiters can be any of the following:

Extra Delimiters Without a Rules File

In data sources that are loaded without a rules file, Analytic Services ignores extra delimiters. In Figure 89, for example, the fields are separated by spaces. Analytic Services ignores the extra spaces between the fields.

Figure 89: File Delimiters

East      Cola   Actual   Jan   Sales   10
East   Cola   Actual   Feb   Sales   21
East   Cola   Actual   Mar   Sales   30
 
 

Extra Delimiters with a Rules File

In data sources that are loaded with a rules file, Analytic Services reads extra delimiters as empty fields. For example, if you try to use a rules file to load the file in Figure 90 into the Sample Basic database, the load fails. Analytic Services reads the extra comma between East and Cola in the first record as an extra field. Analytic Services then puts Cola into Field 3. In the next record, however, Cola is in Field 2. Analytic Services expects Cola to be in Field 3 and stops the data load.

Figure 90: File Delimiters

East,,Cola,Actual,Jan,Sales,10
East,Cola,Actual,Feb,Sales,21
East,Cola,Actual,Mar,Sales,30 
 

To solve the problem, delete the extra delimiter from the data source.

Valid Formatting Characters

Analytic Services views some characters in the data source as formatting characters only. For that reason, Analytic Services ignores the following characters:

==

Two or more equal signs, such as for double underlining

--

Two or more minus signs, such as for single underlining

_ _

Two or more underscores

==

Two or more IBM PC graphic double underlines (ASCII character 205)

_ _

Two or more IBM PC graphic single underlines (ASCII character 196)



Ignored fields do not affect the data load or dimension build.

For example, Analytic Services ignores the equal signs in Figure 91 and loads the other fields normally.

Figure 91: Ignoring Formatting Characters During Loading

East     Actual    "100-10"
         Sales     Marketing
         =====     =========
Jan       10           8
Feb       21          16 
 

Rules Files

Rules are a set of operations that Analytic Services performs on data values or on dimensions and members when it processes a data source. Use rules to map data values to an Analytic Services database or to map dimensions and members to an Analytic Services outline.

Figure 92: Loading Data Sources Through Rules Files

Rules are stored in rules files. A rules file tells Analytic Services which build method to use, specifies whether data values or members are sorted or in random order, and tells Analytic Services how to transform data values or members before loading them. It is best to create a separate rules file for each dimension.

Analytic Services reads the data values or members in the data source, changes them based on the rules in the rules file, and loads the changed data values into the database and the changed members into the outline. Analytic Services does not change the data source. You can re-use a rules file with any data source that requires the same set of rules.

After you create a dimension build rules file, you may want to automate the process of updating dimensions. Using ESSCMD.

Situations That Do and Do Not Need a Rules File

You need a rules file if the data source does not map perfectly to the database or if you are performing any of the following tasks:

You do not need a rules file if you are performing a data load and the data source maps perfectly to the database. For a description of a data source that maps perfectly, see Data Sources That Do Not Need a Rules File.

Note: If you are using a rules file, each record in the rules file must have the same number of fields. See Dealing with Missing Fields in a Data Source.

Data Sources That Do Not Need a Rules File

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

If a data source contains all information required to load the data values in it into the database, you can load the data source directly. This kind of load is called a free-form data load.

To load a data value successfully, Analytic Services must encounter one member from each dimension before encountering the data value. For example, in Figure 86, Analytic Services loads the data value 42 into the database with the members Texas, 100-10, Jan, Sales, and Actual. If Analytic Services encounters a data value before a member of each dimension is specified, it stops loading the data source.

To map perfectly, a data source must contain all of the following and nothing other than the following:

The fields in the data source must be formatted in an order that Analytic Services understands. The simplest way to format a record is to include a member from each dimension and a data field, as illustrated in Figure 93:

Figure 93: Sample Free-Form Data Source

Sales "100-10" Ohio Jan Actual 25
Sales "100-20" Ohio Jan Actual 25
Sales "100-30" Ohio Jan Actual 25

If the data source is not correctly formatted, it will not load. You can edit the data source using a text editor and fix the problem. If you find that you must perform many edits (such as moving several fields and records), it might be easier to use a rules file to load the data source. For a definition and discussion of rules files, see Rules Files.

The following sections describe more complicated ways to format free-form data sources:

Formatting Ranges of Member Fields

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

You can express member names as ranges within a dimension. For example, Sales and COGS form a range in the Measures dimension. Ranges of member names can handle a series of values.

A data source can contain ranges from more than one dimension at a time.

In Figure 94, for example, Jan and Feb form a range in the Year dimension and Sales and COGS form a range in the Measures dimension.

Figure 94: Multiple Ranges of Member Names

Actual Texas      Sales        COGS
                Jan    Feb    Jan   Feb
"100-10"        98     89     26    19
"100-20"        87     78     23    32
 
 

In Figure 94, Sales is defined for the first two columns and COGS for the last two columns.

The following sections describe additional types of ranges:

Setting Ranges Automatically

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

When Analytic Services encounters two or more members from the same dimension with no intervening data fields, it sets up a range for that dimension. The range stays in effect until Analytic Services encounters another member name from the same dimension, at which point Analytic Services replaces the range with the new member or new member range.

Figure 95, for example, contains a range of Jan to Feb in the Year dimension. It remains in effect until Analytic Services encounters another member name, such as Mar. When Analytic Services encounters Mar, the range changes to Jan, Feb, Mar.

Figure 95: Ranges of Member Names

Texas Sales
                    Jan   Feb   Mar
Actual    "100-10"  98    89    58
          "100-20"  87    78    115 
 

Handling Out of Range Data Values

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

When Analytic Services encounters a member range, it assumes that there is a corresponding range of data values. If the data values are not in the member range, the data load stops. Analytic Services loads any data fields read before the invalid field into the database, resulting in a partial load of the data.

Figure 96, for example, contains more data fields than member fields in the defined range of members. The data load stops when it reaches the 10 data field. Analytic Services loads the 100 and 120 data fields into the database.

Figure 96: Extra Data Values

Cola    Actual   East
          Jan    Feb
Sales     100    120    10
COGS      30     34     32 
 

For information on restarting the load, see Loading Dimension Build and Data Load Error Logs.

Interpreting Duplicate Members in a Range

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

Be sure to structure ranges in the source data so that Analytic Services interprets them correctly. If the a member appears more than once in a range, Analytic Services ignores the duplicates.

The file in Figure 97 contains two ranges: Actual to Budget and Sales to COGS. It also contains duplicate members.

Figure 97: Duplicate Members in a Range

Cola East
        Actual    Budget    Actual    Budget
        Sales     Sales     COGS      COGS
Jan     108       110       49        50
Feb     102       120       57        60 
 

Analytic Services ignores the duplicate members. The members that Analytic Services ignores have a line through them in the following example:

Figure 98: Ignored Duplicate Members

Cola East
        Actual    Budget    Actual    Budget
        Sales     Sales     COGS      COGS
Jan     108       110       49        50
Feb     102       120       57        60 
 

For Actual, the first member of the first range, Analytic Services maps data values to each member of the second range (Sales and COGS). Analytic Services then proceeds to the next value of the first range, Budget, similarly mapping values to each member of the second range. As a result, Analytic Services interprets the file as shown in Figure 99.

Figure 99: How Analytic Services Interprets the File in Figure 97

Cola East
           Actual            Budget
           Sales     COGS    Sales    COGS
Jan        108       110     49       50
Feb        102       120     57       60 
 

Reading Multiple Ranges

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

As Analytic Services scans a file, it processes the most recently encountered range first when identifying a range of data values. In Figure 99, for example, there are two ranges: Actual and Budget and Sales and COGS. While reading the file from left to right and top to bottom, Analytic Services encounters the Actual and Budget range first and the Sales and COGS range second. Because the Sales and COGS range is encountered second, Analytic Services puts data fields in the Sales and COGS part of the database first.

Formatting Columns

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

Files can contain columns of fields. Columns can be symmetric or asymmetric. Symmetric columns have the same number of members under them. Asymmetric columns have different numbers of members under them. Analytic Services supports loading data from both types of columns.

Symmetric Columns

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

Symmetric columns have the same number of members under them. In Figure 100, for example, each dimension column has one column of members under it. For example, Product has one column under it (100-10 and 100-10) and Market has one column under it (Texas and Ohio).

Figure 100: Symmetric Columns

Product    Measures    Market    Year    Scenario
"100-10"   Sales       Texas     Jan     Actual      112
"100-10"   Sales       Ohio      Jan     Actual      145 
 

The columns in the following file are also symmetric, because Jan and Feb have the same number of members under them:

Figure 101: Groups of Symmetric Columns

                              Jan           Feb
                        Actual  Budget  Actual  Budget
"100-10"  Sales  Texas  112     110     243     215
"100-10"  Sales  Ohio   145     120     81      102 
 

Asymmetric Columns

If you are performing a dimension build, skip this section. You cannot perform a dimension build without a rules file.

Columns can also be asymmetric. In Figure 102, the Jan and Feb columns are asymmetric because Jan has two columns under it (Actual and Budget) and Feb has only one column under it (Budget):

Figure 102: Valid Groups of Asymmetric Columns

                         Jan     Jan     Feb
                         Actual  Budget  Budget
"100-10"  Sales  Texas   112     110      243
"100-10"  Sales  Ohio    145     120      81 
 

If a file contains asymmetric columns, you must label each column with the appropriate member name.

The file in Figure 103, for example, is not valid because the column labels are incomplete. The Jan label must appear over both the Actual and Budget columns.

Figure 103: Invalid Asymmetric Columns

                         Jan             Feb
                         Actual  Budget  Budget
"100-10"  Sales  Texas   112     110     243
"100-10"  Sales  Ohio    145     120     81 
 

This file in Figure 104 is valid because the Jan label is now over both Actual and Budget. It is clear to Analytic Services that both of those columns map to Jan.

Figure 104: Valid Asymmetric Columns

                         Jan     Jan     Feb
                         Actual  Budget  Budget
"100-10"  Sales  Texas   112     110      243
"100-10"  Sales  Ohio    145     120      81 
 

Security and Multiple-User Considerations

Analytic Services supports concurrent multiple users reading and updating the database. Thus, users can use the database while you are dynamically building dimensions, loading data, or calculating the database. In a multi-user environment, Analytic Services protects data by using the security system described in Managing Security for Users and Applications.



Hyperion Solutions Corporation link