About the Series ...
This
article is a member of the series Introduction
to MSSQL Server Analysis Services. The series is designed to provide hands-on
application of the fundamentals of MS SQL Server Analysis Services (Analysis
Services), with each installment progressively presenting features and
techniques designed to meet specific real-world needs. For more information on
the series, please see my initial article, Creating Our First Cube. For the
software components,
samples and tools needed to complete the hands-on portions of this article see Usage-Based
Optimization in Analysis Services 2005, another article within this
series.
Introduction
In Dimensional
Model Components: Dimensions Parts I and II, we introduced the dimensional
model in general, noting its wide acceptance as the preferred structure for
presenting quantitative and other organizational data to information consumers.
We then began our examination of dimensions, the analytical perspectives
upon which the dimensional model relies in meeting the primary
objectives of business intelligence, including its capacity to support:
-
the presentation
of relevant and accurate information representing business operations and
events;
-
the rapid and
accurate return of query results;
-
slice and
dice query creation and modification;
-
an environment
wherein information consumers can pose quick and easy questions, and achieve
rapid results datasets.
In this, the
third of several articles focusing upon dimensional model component
structures as they are implemented within Analysis Services 2005, we
will introduce attributes, another key component. Our examination will
include:
-
An
introduction to dimension attributes from a conceptual perspective;
-
Coverage of
the general characteristics of attributes;
-
An examination
of the Advanced properties (including what they define and support, and
how we can manage them) underpinning attributes;
-
A look ahead
to Part II of this article, where we explore the Basic, Misc, Parent-Child and
Source groups of attribute properties.
Dimensions in Analysis Services: Attributes
We
learned, in Dimensional
Model Components: Dimensions Parts I and II that dimensions form the
foundation of the dimensional model. They represent the perspectives
of a business or other operation, and reflect the intuitive ways that
information consumers need to query and view data. We noted that we might consider
dimensions as nouns that take part in, or are otherwise
associated with, the verbs (or actions / transactions undertaken by the
business) that are represented by the facts or measures contained
within our business intelligence systems.
We
discovered in the earlier two articles that, within the Analysis Services
model, database dimensions underlie all other dimensions, whose added
properties distinguish them from the database dimensions they reference,
within the model. Each dimension within our model contains one or more hierarchies.
As we will learn in later articles of this subseries, two types of hierarchies
exist within Analysis Services: attribute hierarchies and user
(sometimes called multi-level) hierarchies. For purposes of
our article, the term attribute means the same thing as attribute
hierarchy. (We will examine user hierarchies, to which we will simply
refer as hierarchies, in a subsequent article.)
To extend
the metaphor we used earlier in describing dimensions as nouns and
measures as verbs, we might consider attributes as
somewhat similar to adjectives. That is, attributes help us to
define with specificity what dimensions cannot define by themselves. Dimensions
alone are like lines in geometry: they don't define area within multidimensional
space, nor do they themselves even define the hierarchies that they
contain. A database dimension is a collection of related objects called
attributes, which we use to specify the coordinates required to define
cube space.
Within the
table underlying a given dimension (assuming a more-or-less typical star
schema database) are individual rows supporting each of the members of
the associated dimension. Each row contains the set of attributes that
identify, describe, and otherwise define and classify the member upon
whose row they reside. For instance, a member of the Patient dimension,
within the Analysis Services implementation for a healthcare provider,
might contain information such as patient name, patient ID, gender, age group, race,
and other attributes. Some of these attributes might relate to
each other hierarchically, and, as we shall see in subsequent articles of this
subseries (as well as within other of my articles), multiple hierarchies of
this sort are common in real-world dimensions.
Dimensions and dimension attributes
should support the way that management and information consumers of a given
organization describe the events and results of its business operations.
Because we maintain dimension and related attribute information
within the database underlying our Analysis Services implementation, we
can support business intelligence for our clients and employers even when these
details are not captured within the system where transaction processing takes
place. Within the analysis and reporting capabilities we supply in this
manner, dimensions and attributes are useful for aggregation,
filtering, labeling, and other purposes.
In
addition to a few key values, several properties (each of which
has, in its own right, multiple possible values) are associated with
each attribute residing in a given model. We will get some hands-on
exposure to these key values and properties in the practice
session below. Before
we get started working within a sample cube clone, we will need to prepare the
local environment for the practice session. We will take steps to accomplish
this within the section that follows.
Preparation: Locate and Open the Sample Basic UDM Created Earlier
In
Dimensional Model Components: Dimensions Part I, we created a sample basic UDM within which to perform the steps of
the practice sessions we set out to undertake in the various articles of this
subseries. Once we had ascertained that the new practice database appeared to
be in place, and once we had renamed it to ANSYS065_Basic AS DB, we
began our examination of dimension properties. We will perform our
examination of attributes within the same practice environment, which we
will access using the following steps within the SQL Server Business
Intelligence Development Studio, as we did within Dimensional
Model Components: Dimensions Part I.
NOTE: Please access the UDM which we
prepared in Dimensional
Model Components: Dimensions Part I before proceeding with this
article. If you have not completed the preparation to which I refer in the
previous article, or if you cannot locate / access the Analysis Services
database with which we worked there, please consider taking the preparation
steps provided in Dimensional
Model Components: Dimensions Part I before continuing, and prospectively
saving the objects with which you work, so as to avoid the need to repeat the
preparation process we have already undertaken for subsequent related articles
within this subseries.
1.
Click Start.
2.
Navigate to,
and click, the SQL
Server Business Intelligence Development Studio, as appropriate.
We
briefly see a splash page that lists the components installed on the PC, and
then Visual Studio .NET 2005 opens at the Start page.
3.
Close the Start
page, if desired.
4.
Select File
-> Open from the main menu.
5.
Click Analysis
Services Database ... from the cascading menu, as depicted in Illustration
1.
Illustration 1: Opening
the Analysis Services Database ...
The Connect
to Database dialog appears.
6.
Ensuring that
the Connect to existing database radio button is selected, type the Analysis
Server name into the Server input box atop the dialog.
7.
Using the
selector just beneath, labeled Database, select ANSYS065_Basic AS DB,
as shown in Illustration
2.
Illustration 2:
Selecting the New Basic Analysis Services Database ...
8.
Leaving other
settings on the dialog at default, click OK.
SQL
Server Business Intelligence Development Studio briefly reads the database from
the Analysis Server, and then we see the Solution Explorer
populated with the database objects. Having overviewed dimension attributes,
we will get some hands-on exposure to properties for an example
attribute, from within our sample UDM.
Procedure: Examine Attribute Properties in Analysis Services 2005
In the practice procedures that
follow, we will examine the properties that define and support a representative
attribute. We will perform our practice sessions within the SQL
Server Business Intelligence Development Studio, from which we will perform
our examination of attribute properties within our Analysis Services
database, ANSYS065_Basic AS DB.
In Dimensional Model Components: Dimensions Parts I and II, we overviewed the properties
underpinning Database and Cube dimensions, once we had explored
both types in general to gain an understanding of the characteristics of each,
together with the differences between the two. In this article, we will
overview the properties supporting dimension attributes. To
access these settings for attributes within a representative dimension,
we will need to open that dimension within the Dimension Designer
first. (Because database dimensions, and not cube dimensions,
contain attributes, we access properties supporting dimension
attributes via the Dimension Designer, and not the Cube Designer.)
1.
Within the Solution
Explorer, right-click the Geography dimension (expand the Dimensions
folder as necessary).
2.
Click Open
on the context menu that appears, as depicted in Illustration 3.
Illustration
3: Opening the Dimension via the Dimension Designer ...
The
tabs of the Dimension Designer open.
3.
Click the Dimension
Structure tab, if we have not already arrived there by default.
4.
Examine the
member attributes that appear within the Attributes pane of the Dimension Structure tab.
The attributes belonging to the Geography dimension appear as shown in Illustration
4.
Illustration
4: The Member Attributes, Geography Dimension
We note
that five attributes appear within the Attributes pane. Let's
get some exposure to the properties associated with attributes by
examining a representative member among the attributes we see here.
Review Attribute Properties
Analysis Services exposes many properties that determine how
dimensions and dimension attributes function. We will examine the properties
for a select attribute within our sample
UDM by taking the following steps.
1.
Within the Attributes
pane of the Dimension Structure tab, right-click the Geography
Key attribute.
2.
Click Properties
on the context menu that appears, as depicted in Illustration 5.
Illustration
5: Select Properties from the Context Menu ...
The Properties
pane appears for the Geography Key attribute. (The Properties
pane likely appeared when we selected the Product dimension within
the Dimensions pane, by default, below the Solution Explorer.
The design environment can, of course, be customized in many ways to
accommodate your local environment and development needs.)
We can,
at this stage, see the thirty DimensionAttribute properties for
the Geography Key attribute within the Properties pane. The
first eleven properties, members of the Advanced properties
group, include the following:
-
AttributeHierarchyDisplayFolder
-
AttributeHierarchyEnabled
-
AttributeHierarchyOptimizedState
-
AttributeHierarchyVisible
-
DefaultMember
-
DiscretizationBucketCount
-
DiscretizationMethod
-
EstimatedCount
-
IsAggregatable
-
OrderBy
-
OrderByAttribute
(We will examine the members of the Advanced properties group in
this, the first half of a two-part article. We will examine the remaining
properties in Part II.)
The five
Basic properties, appearing underneath the Advanced properties
group, include the following:
-
Description
-
ID
-
Name
-
Type
-
Usage
The Misc
group comes next, and includes the following four properties:
-
AttributeHierarchyOrdered
-
GroupingBehavior
-
InstanceSelection
-
MemberNamesUnique
Beneath
the Misc group in the Properties pane lies the Parent-Child
group, which includes the following five properties:
-
MembersWithData
-
MembersWithDataCaption
-
NamingTemplate
-
RootMemberIf
-
UnaryOperatorColumn
Finally,
the five Source properties, appearing underneath the Parent-Child
properties group, include the following:
-
CustomRollupColumn
-
CustomRollupPropertiesColumn
-
KeyColumns
-
NameColumn
-
ValueColumn
The Properties
pane for the Geography Key attribute, with Advanced, Basic,
Misc, Parent-Child, and Source properties groups expanded,
appears as shown in Illustration
6.
Illustration
6: The Properties Pane for the Geography Key Attribute
Let's
take a look at each of the individual properties, starting with the Advanced
properties group, discussing the purpose of the property, and
examining possible settings with which we can come into contact.
Advanced
Property: AttributeHierarchyDisplayFolder
The AttributeHierarchyDisplayFolder
property serves as a point of identification for the folder in which we
intend to display the associated attribute hierarchy to information
consumers. (This property is for use by client applications).
Advanced
Property: AttributeHierarchyEnabled
The AttributeHierarchyEnabled property, with True or False
as setting options, determines whether an attribute hierarchy is enabled
for this attribute. If the attribute hierarchy is not enabled,
then the attribute cannot be employed within a user-defined hierarchy,
nor can the attribute hierarchy be referenced within MDX statements.
Advanced
Property: AttributeHierarchyOptimizedState
The AttributeHierarchyOptimizedState property allows us to specify the level of optimization
applied to the attribute hierarchy. Two options exist for this setting,
as depicted in Illustration 7.
Illustration
7: Optional Settings for the AttributeHierarchyOptimizedState Property
The attribute
hierarchy is fully optimized (the top setting within the selector)
by default. When our selection is FullyOptimized, Analysis Services is
directed to build indexes for the hierarchy to improve query performance. The
other selection, NotOptimized, means that these indexes are not built -
an appropriate setting in cases where the attribute hierarchy is not
used for querying per se, but is used for another purpose, such as the ordering
of another attribute hierarchy.
Advanced Property: AttributeHierarchyVisible
AttributeHierarchyVisible controls whether the attribute hierarchy is visible to client applications. The default value is True.
If we do not anticipate the need for querying the attribute hierarchy, however, we can change the value of this property to False,
and still retain the capability to use the attribute
hierarchy within
a user-defined hierarchy, or to reference it within MDX statements.
Advanced
Property: DefaultMember
The DefaultMember property allows us to
specify an
MDX expression that defines the default member for the attribute.
3.
Click the ellipses
(...) button to the right of the setting box for the DefaultMember property, shown circled in Illustration 8.
Illustration 8: Click
the Ellipses Button to the Right of the DefaultMember Property
The Set
Default Member dialog for the Geography Key appears.
Here
we can select, via the associated radio buttons atop the dialog, a default
member for the attribute, to be used for queries where members from the attribute
are not explicitly specified. If we do not to specify a default member
for the attribute, then the default member becomes the (All)
member (assuming the presence of (All) member), or an arbitrary member
if the (All) member is not present.
4.
Click the radio
button to the immediate left of the Choose a member to be the default
label in the upper portion of the Set Default Member dialog.
The
pane underneath the radio button we have selected becomes enabled, and the Geography
attribute members appear underneath the expanded All Geographies level,
as depicted in Illustration 9.
Illustration 9: Expanded
Geography Tree within the Choose a Member to Be the Default Pane ...
We
note the (grayed) appearance of the associated MDX expression for the selected default
member in the box at the bottom of the dialog, underneath the radio button
selection labeled Enter an MDX expression that specifies the default member.
(We might have selected this button and entered an MDX expression manually,
instead of building one through our selection within the tree above).
5.
Click the Cancel
button in the bottom right corner of the Set Default Member dialog, to dismiss the dialog without making
changes.
Advanced
Property: DiscretizationBucketCount
The DiscretizationBucketCount property allows us to specify the number of buckets in which to discretize
(group) members of the attribute. Discretization is the process
of putting values of a continuous set of data into buckets, so that there are
a discrete number of possible states. (Say we have numeric attribute
with many continuous values. Presenting the member values grouped into buckets
might be more useful to information consumers than displaying the values
themselves.) The buckets are, in turn, treated as ordered and discrete values.
We can discretize both string and numeric columns.
The method
used to discretize the members is set by the next property we will examine, DiscretizationMethod.
Advanced
Property: DiscretizationMethod
The DiscretizationMethod
property allows us to define the method to be used for discretization. Analysis
Services UDM supports multiple discretization algorithms which we can select via this property.
6.
Click the
selector button (the downward pointing arrow shown circled in Illustration 10) to the right of the setting box for the DiscretizationMethod property.
Illustration 10: Click
the Selector Button to the Right of the DiscretizationMethod Property
As we can see within the list that appears next, selection options for Discretization Method includes:
- None - the default;
- Automatic Analysis Services determines which discretization method to use;
- EqualAreas The algorithm divides the data into groups that contain an equal number of values. (This method is best used for normal distribution curves, but does not work well if the distribution includes a large number of values that occur in a narrow group in the continuous data);
- Clusters The algorithm divides the data into groups by sampling the training data, initializing to a number of random points, and then running several iterations of the Microsoft Clustering algorithm using the Expectation Maximization (EM) clustering method. (The Clusters method can only be used with numeric columns.) The Clusters method is useful because it works on any distribution curve, although it requires more processing time than the other discretization methods.
Advanced Property: EstimatedCount
We use EstimatedCount to specify the estimated number of members within the attribute. This value defaults to zero, until we first run the Aggregation Design Wizard. Thereafter, the number is either the amount last counted by Analysis Services, or a user-provided estimate of the member count. That is, we can either manually enter an estimated value, or we can allow the wizard to count the number of records for us. (We can enter a value manually in those cases where we know the number, and want to save the time it takes to query the database for the count.)
Advanced Property: IsAggregatable
IsAggregatable allows us to specify whether the values of the attribute members can be aggregated within a hierarchy. The default value is True, which means that the attribute hierarchy contains an (All) level. If the value for this property is False, the attribute hierarchy does not contain an (All) level.
Advanced Property: OrderBy
The OrderBy property affords us a means of describing how the members contained in the attribute hierarchy should be ordered. The default value is Name, which specifies that ordering of the attribute members is based on the value of the NameColumn property, if any. If the NameColumn property is empty, members are ordered by the value of the key column.
7. Click the selector button (the downward pointing arrow shown circled in Illustration 11) to the right of the setting box for the OrderBy property.
Illustration 11: Click the Selector Button to the Right of the OrderBy Property
As we can see within the list that appears next, the entire set of options that we can specify as a basis for ordering includes the following:
- Key - Order by the value of the key column of the attribute member;
- Name - Order by the value of the NameColumn property (the default, as we have noted);
- AttributeKey - Order by the value of the member key of a specified attribute, which must have an attribute relationship to the attribute;
- AttributeName - Order by the value of the member name of a specified attribute, which must have an attribute relationship to the attribute.
Advanced Property: OrderByAttribute
The OrderByAttribute property allows us to identify the attribute by which we wish to order the members of the attribute hierarchy (assuming we have selected AttributeKey or AttributeName within the OrderBy property discussed above).
8. Click the selector button (the downward pointing arrow) to the right of the setting box for the OrderByAttribute property, shown circled in Illustration 12.
Illustration 12: Click the Selector Button to the Right of the OrderByAttribute Property
As we can see from the list of options that appears, we can order by attributes which have an attribute relationship with the attribute we are ordering (in this case, Geography Key).
We will extend our introductory examination of dimension attributes, specifically continuing our discussion with the Basic, Misc, Parent-Child and Source groups of properties, within Part II of this article.
NOTE: Please consider saving the project we have created to this point for use in subsequent related articles of this subseries, so as to avoid the need to repeat the preparation process we have undertaken initially, to provide a practice environment.
9. Select File -> Save All to save our work, up to this point, within the originally chosen location, where it can be easily accessed for our activities within subsequent articles of this subseries.
10. Select File -> Exit to leave the design environment, when ready, and to close the Business Intelligence Development Studio.
Conclusion
In this, the first half of a two-part article introducing dimension attributes, we continued our current subseries focusing upon dimensional model components, with an objective of discussing the associated concepts, and of providing hands-on exposure to the properties supporting each. We reviewed our initial introduction to the dimensional model and summarized its role in meeting the primary objectives of business intelligence. Next, we provided a brief overview of dimension attributes in general.
Having covered the general characteristics and purposes of attributes, we began our focus upon the properties underlying them, based upon the examination of a representative attribute within our sample cube. In this article, we discussed the Advanced group of properties. Finally, we looked forward to Part II of this article, where we explore the remaining attribute properties, those belonging to the Basic, Misc, Parent-Child and Source groups.
» See All Articles by Columnist William E. Pearson, III
Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.