Business Intelliigence is the integral part of your data platform, by keeping up the design and development practices you can reduce the excess baggage that can reduce the performance of your data flow.
Just on bascis, on OLAP world the database dimension is treated as colleciton of related objects (referred as attributes) that can provide required information about factual data in cubes. By default these objects are bound to one or more columns in one or more tables in a data source view. The attributes can be arranged as hierarchies that will help to navigate the user path when browsing data in a cube. By default the cube will consist all the dimensions that will base upon the analysis of fact data. For this reason a dimension is used multiple times in a cube and required attributes will need to built before defining the dimensions.
Usin the Cube wizard from BI Development studio (BIDS) it is simple and easy in defining database and cube dimensions including attributes and hierarchies. The wizard will help to create dimensions baed on the dimensions tables in a data source view that will identify the uses of cube. Once the data-source and attributes are defined, the creation of dimensions will add the relevant information to the cube by creating a cube dimension.
In the recent times I have offered consulting services surrounding BI solutions by assisting database designers for a retail sales shop who are very active in the world of gadgets, selling the products over web and popular across the Europe. The environment and design is as follows:
- SQL Server 2008 R2 on Windows 2008 R2 X64
- 82 GB RAM, 32 CPU and 5 TB storage (SAN)
- CUBE with 5 dimensions, 2 of them holds around 3.5 million rows
- FACT table has around 2.5 million rows
- TIME based hierarchy is used
- STAR schema
- MOLAP base
All the dimensons and conditions values are left to default (measure for the cube) and removed reference on All-Element and set IsAggregatable to false. In the multi-dimensional world the nature of cubes must have the ability to create/modify views, which in this case data-source views are useful. Also the star schema usage has avoid the usage of foreign key relationship.
Over the time the CUBE processing has been functioning very well without any errors or warnings, the system performance has been continuting until 4 week period where a new dimension is introduced to provide drill-through mechanism on web sales information. THis new dimension has caused high CPU spikes and at times timeout for reports too.
The design of new dimensions has been tested in the development platform and it satisfies the data information that is required within the CUBE. As the new dimension has many hierarchies that will start with static attributes such as Sales –> Debtor –> Creditor –> Revenue. Among these attributes the debtor and creditor as slowly changing dimensions. To avoid further complexity on the development platform I have created a sample attribute for each debtor & creditor which appearsin a hierarchy that is a combination of attributes above it. Somehow this has complicated the design but enabled the performance benefit on the queries (reporting), still the problem of high CPU, memory usage and timeout happening.
The problem in production was the priority than spending time on new dimension design, without further assesment I had a look at recent cumulative update package releases for SQL Server 2008 R2. Somehow the Cumulative Update Package 6 for SQL Server 2008 R2 refers about a fix “A memory error occurs when you perform a full process of a dimension in SSAS 2008 R2” and a thorough testing of this hotfix on development environment has solved the problem. So it proved that always best practice is to refer on cumulative update package releases that are specific on a particular problem and which will equipped in service pack release when they are scheduled to release.
Further see the references about dimensions best pratcies (that I refer) as follows:
- Adding Account Intelligence to a Dimension
- Adding Dimension Intelligence to a Dimension
- Adding a Custom Aggregation to a Dimension
- Setting Custom Member Formulas for Attributes in a Dimension
- Specifying Ordering for a Dimension
- Setting Dimension Writeback Capability
On a real-world practices I would recommend to go through the Project_REAL– ETL design practices that is developed within Microsoft by using SQL Server technologies, also a refrence on OLAP_DataManagement articles.
Satya Shyam K Jayanty