Attribute Discretization: Customize Grouping Names

Friday Jul 17th 2009 by William Pearson
Share:

Join BI Architect Bill Pearson as he continues to explore attribute discretization, as a part of an extended examination of the dimensional model lying at the heart of the integrated Microsoft Business Intelligence solution.

This article continues my exploration of attribute discretization, a capability in Analysis Services that allows us to group members of an attribute into a number of member groups. Our concentration here will be to go beyond the individual examinations of the methods themselves that we undertook in other articles, and to get some exposure to custom naming of the groups created by the discretization process we choose. The practice we undertake to this end will, once again, surround a representative dimension attribute within our sample UDM.

Introduction

This article continues the overview of Attribute Discretization in Analysis Services we began in Introduction to Attribute Discretization, and continued in Attribute Discretization: Using the Automatic Method, Attribute Discretization: Using the “Equal Areas” Method and Attribute Discretization: Using the Clusters Method. Both this article and its Discretization-related predecessors extend the examination of the dimensional model begun in Dimensional Model Components: Dimensions Parts I and II, and continued through Dimensional Attributes: Introduction and Overview Parts I through V, as well as our focus upon the properties underlying attributes, extending our overview into attribute member Keys, Names, Values and Relationships within several subsequent articles.

Note: For more information about my Introduction to MSSQL Server Analysis Services column in general, see the section entitled “About the MSSQL Server Analysis Services Series” that follows the conclusion of this article.

In Introduction to Attribute Discretization, Attribute Discretization: Using the Automatic Method, Attribute Discretization: Using the “Equal Areas” Method and Attribute Discretization: Using the Clusters Method, I summarized preceding articles within the current subseries, consisting of a general introduction to the dimensional model. I noted the wide acceptance of the dimensional model as the preferred structure for presenting quantitative and other organizational data to information consumers. The articles of the series then undertook an examination of dimensions, and, subsequently, dimension attributes, the analytical “perspectives” and structures upon which the dimensional model relies in meeting the primary objectives of business intelligence, including its capacity to support:

  • the presentation of relevant and accurate information representing business operations and events;
  • the rapid and accurate return of query results;
  • “slice and dice” query creation and modification;
  • an environment wherein information consumers can pose questions quickly and easily, and obtain rapid results datasets.

With Introduction to Attribute Discretization, we introduced a capability in Analysis Services – to which we refer as attribute discretization - that allows us to group members of an attribute into a number of member groups. We discussed design and other considerations involved in the discretization of attributes, and touched upon best practices surrounding the use of this capability. In Attribute Discretization: Using the Automatic Method, we introduced the first of multiple pre-defined discretization methods supported within the Analysis Services UDM. We discussed the options that are available, focusing upon the employment of the Automatic discretization method within the sample cube, to meet the business requirements of a hypothetical client. We then began our practice session with an inspection of the contiguous members of a select attribute hierarchy, noting the absence of grouping and discussing shortcomings of this default arrangement. Next, we enabled the Automatic discretization method within the dimension attribute Properties pane, and then reprocessed the sample cube with which we were working to enact the new Automatic discretization of the select attribute members. Finally, we performed further inspections of the members of the attribute hierarchy involved in the request for assistance by our hypothetical client, noting the new, more intuitive grouping established by the newly enacted Automatic discretization method.

Next, in Attribute Discretization: Using the Equal Areas Method, we introduced the second of the pre-defined discretization methods supported within the Analysis Services UDM. We discussed the options that are available with this particular approach, as we did in the article previous for the Automatic method, focusing upon the employment of the Equal Areas discretization method, again within the sample cube, to meet the business requirements of a hypothetical client. We then began our practice session with an inspection, via the browser in the Dimension Designer, of the contiguous members of another select attribute hierarchy, noting the absence of grouping and discussing shortcomings of this default arrangement. Next, we enabled the Equal Areas discretization method within the dimension attribute Properties pane, and again reprocessed the sample cube with which we were working to enact the new Equal Areas discretization of the select attribute members. Finally, we performed additional inspections, via the Dimension Designer and Cube Designer browsers, of the members of the attribute hierarchy involved in the request for assistance by our hypothetical client, noting the new, more intuitive grouping established by the newly enacted Equal Areas discretization method.

In last month’s article, Attribute Discretization: Using the Clusters Method, we introduced the third of the pre-defined discretization methods supported within the Analysis Services UDM. We again discussed the options that are available with this particular approach, as we did in the previous articles for the Automatic and Equal Areas methods, focusing this time upon the employment of the Clusters discretization method, to meet the business requirements of a hypothetical client within the sample cube. Our practice session again followed with an inspection of the “pre-discretized” contiguous members of another select attribute hierarchy, where we again noted the absence of grouping and discussed shortcomings of this default arrangement. We then enabled the Clusters discretization method within the dimension attribute Properties pane, and after reprocessing the sample cube with which we were working, we performed another inspection, again via the Dimension Designer and Cube Designer browsers, of the members of the affected attribute hierarchy, noting the more intuitive grouping established by the newly enacted Clusters discretization method.

In this article, we will gain some hands-on exposure to going beyond the First Group Member – Last Group Member default that Analysis Services uses in creating group labels within the various discretization methods, and enacting the generation of custom labels for our groups. Our examination will include:

  • A brief review (for those joining our “discretization subseries” for the first time) of attribute discretization in Analysis Services, potential benefits that accrue from discretization in our UDMs, and how the process can help us to meet the primary objectives of business intelligence.
  • Examination, via the browser in the Dimension Designer, of the pre-existing members of a select attribute hierarchy, noting the absence of grouping and discussing shortcomings of this default arrangement.
  • A discussion of customized discretization group labeling within Analysis Services.
  • Enablement of the Automatic discretization method within the dimension attribute Properties pane.
  • Reprocessing the cube to enact the new Automatic discretization of the select attribute members.
  • Another examination, via the browsers in both the Dimension Designer and the Cube Designer, of the members of our attribute hierarchy selection, noting the new, more intuitive grouping established by the newly enacted Automatic discretization method, together with the default labels provided by Analysis Services.
  • Modification of a copy of the naming template supplied with Analysis Services
  • Reprocessing the cube to enact the modified naming template we have supplied, within the context of Automatic discretization of the selected attribute members.
  • A final examination, via the browsers in both the Dimension Designer and the Cube Designer, of the members of a select attribute hierarchy, noting the new, more intuitive grouping, together with more user-friendly labels, established by the newly enacted Automatic discretization method.

Attribute Discretization with Custom Group Names

In this article, we will perform attribute discretization once again, but in this practice session, we will extend hands-on exposure with discretization to include the addition of custom group labels.

For those first joining my subset of articles surrounding discretization, let’s do a brief overview of Analysis Services discretization in general. (Those who have “been along for the ride” can, of course, skip directly to the sections below.) As we learned in Introduction to Attribute Discretization, whenever we work with attributes, we can expect to encounter two general types of values, discrete and contiguous. Discrete values stand apart distinctly, and have clearly defined logical “boundaries” between themselves. Citing the Gender attribute, within the Customer dimension of the Adventure Works sample UDM, wherein the attribute is considered to have only one of two discrete values, female or male, we noted that possible values are naturally discrete for the lion’s share of attributes occurring in the business world.

In contrast to discrete values, we noted that contiguous values do not stand apart distinctly, but flow along, as if in a continuous line. Moreover, we discussed the fact that contiguous values, especially within large populations, can have very large numbers of possible values, and that information consumers can find it difficult to work effectively and efficiently within such wide ranges of values. As an example, we cited the Vacation Hours attribute, within the Employee dimension of the Adventure Works sample UDM, an attribute which could have a wide range of possible values, depending upon how many employees are involved, whether there are limits on how many vacation days they can accumulate, and considerations of this nature. (The member values are based directly upon the values contained within the VacationHours column of the DimEmployee table - with many of the values shared among multiple employees.) The sheer number of values might make working with them cumbersome for information consumers, if they are simply made available in their existing state.

As we discussed in Introduction to Attribute Discretization, discretization can help us to make it easier for information consumers to work with large numbers of possible attribute member values. As discretization creates a manageable number of groups of attribute values that are clearly separated by boundaries, we can thereby group contiguous values into sets of discrete values, via a system-generated collection of consecutive dimension members known as member groups.

We also discussed, in Introduction to Attribute Discretization, that once the discretization process groups the attribute members into the member groups, the member groups are then housed within a level within the dimensional hierarchy. (A given level within a dimensional hierarchy can contain either members or member groups, but not both.) When information consumers browse a level that contains member groups, they see the names and cell values of the member groups. The members generated by Analysis Services to support member groups are called grouping members, and they look like ordinary members.

Analysis Services affords us flexibility in methods of attribute discretization, based upon algorithms of varying complexity. The different methods of discretization all have the same function – to group contiguous values into sets of discrete values. They simply manage grouping via different approaches. Beyond the Dimension Designer, Analysis Services also supports user-defined discretization, via data definition language (DDL), should the “out – of – the – box” approaches not meet the business needs of our local environments. Moreover, we can alternatively implement custom discretization via the underlying data warehouse, using views at the relational level; named calculations in the data source view; calculated members in Analysis Services; or via other approaches.

Analysis Services supports four DiscretizationMethod property settings, which include three pre-defined discretization methods:

  • None (default): Analysis Services performs no grouping, and simply displays the attribute members.
  • Automatic: Analysis Services selects the method that best represents the data: either the EqualAreas method or the Clusters method.
  • EqualAreas: Analysis Services attempts to divide the attribute members into groups that contain an equal number of members.
  • Clusters: Analysis Services attempts to divide the members of the attribute into groups that contain an equal number of members. (Per the Books Online, this method is useful because it works on any distribution curve, but is more expensive in terms of processing time.)

The DiscretizationMethod property determines whether Analysis Services is to create groupings, and then determines the type of grouping that is performed. As we learned in Introduction to Attribute Discretization, Analysis Services does not perform any groupings by default (the default setting for the DiscretizationMethod property is “None”). When we enable Automatic grouping, as we shall see in the practice session that follows, we direct Analysis Services to automatically determine the best grouping method based upon the structure of the attribute for which we are performing discretization. We then process the affected dimension / cube and Analysis Services creates group ranges, and then distributes the total population of attribute members appropriately across those groups. As we shall see, once we specify a grouping method, we next specify the number of groups, by using the DiscretizationBucketCount property (its default value is zero).

As we have noted throughout my MSSQL Server Analysis Services column, as well as throughout my other Database Journal series’, one of the most important objectives in building a high performance Analysis Services solution is an efficient and effective dimension design. The identification of opportunities where we can effectively use attribute discretization, and the effective design, placement and labeling of the member groups that we generate thereby, can mean the provision of a much more consumer-friendly interface for our clients and employers. We will gain hands - on exposure to the Automatic method of attribute discretization in the practice session that follows, primarily as a means of supporting the focus of our article, the custom naming of groups, versus the simple acceptance of the default names provided by Analysis Services, anytime we discretize attributes. We will first enable Automatic discretization, and then process the cube, which we will examine for default naming of the attribute groups created. We will then focus upon the setup for customized group labeling, reprocess the cube, and then examine the changes in the naming of our attribute groups.

Before we get started working within a sample cube clone, we will need to prepare the local environment for the practice session. We will take steps to accomplish this within the section that follows.

Preparation: Locate and Open the Sample Basic UDM Created Earlier

In Dimensional Model Components: Dimensions Part I, we created a sample basic Analysis Services database within which to perform the steps of the practice sessions we set out to undertake in the various articles of this subseries. Once we had ascertained that the new practice database appeared to be in place, and once we had renamed it to ANSYS065_Basic AS DB, we began our examination of dimension properties. We continued with our examination of attributes within the same practice environment, which we will now access (as we did within the earlier articles of this subseries)) by taking the following steps within the SQL Server Business Intelligence Development Studio.

NOTE: Please access the Analysis Services database which we prepared in Dimensional Model Components: Dimensions Part I (and have used in subsequent articles) before proceeding with this article. If you have not completed the preparation to which I refer, or if you cannot locate / access the Analysis Services database with which we worked in the referenced previous articles, please consider taking the preparation steps provided in Dimensional Model Components: Dimensions Part I before continuing, and prospectively saving the objects with which you work, so as to avoid the need to repeat the preparation process we have already undertaken for subsequent related articles within this subseries.

1.  Click Start.

2.  Navigate to, and click, the SQL Server Business Intelligence Development Studio, as appropriate.

We briefly see a splash page that lists the components installed on the PC, and then Visual Studio .NET 2005 opens at the Start page.

3.  Close the Start page, if desired.

4.  Select File -> Open from the main menu.

5.  Click Analysis Services Database ... from the cascading menu, as shown in Illustration 1.

Opening the Analysis Services Database
Illustration 1: Opening the Analysis Services Database ...

The Connect to Database dialog appears.

6.  Ensuring that the Connect to existing database radio button atop the dialog is selected, type the Analysis Server name into the Server input box (also near the top of the dialog).

7.  Using the selector just beneath, labeled Database, select ANSYS065_Basic AS DB, as depicted in Illustration 2.

Selecting the Basic Analysis Services Database
Illustration 2: Selecting the Basic Analysis Services Database ...

8.  Leaving other settings on the dialog at default, click OK.

SQL Server Business Intelligence Development Studio briefly reads the database from the Analysis Server, and then we see the Solution Explorer populated with the constituent database objects. Having overviewed attribute discretization in this and the previous article, we will now get some hands-on exposure to the use of Automatic attribute discretization for the members of a representative dimension attribute within our practice UDM. Our objective will be to then examine the default labeling of the newly created attribute groups, before enabling the custom labeling of the same attribute groups through the use of the naming template supplied by Analysis Services, reprocessing the cube, and re-examining the attribute groups for the custom naming we have enacted.

Procedure: Employ the Automatic Discretization Method for Attribute Members in Analysis Services

When we browse a cube, we typically dimension the members of one attribute hierarchy by the members of another attribute hierarchy. For example, we might group customer sales by product purchased, by customer geography (state, province or country), or by customer gender. However, with certain types of attributes, it is useful to have Analysis Services automatically create groupings of attribute members, particularly when large numbers of contiguous values are involved, based upon the distribution of the members within an attribute hierarchy.

As an example, let’s assume that we have been approached by representatives of our hypothetical client, the Adventure Works organization. These representatives, members of the Human Resources department of the business, tell us that one of the existing attributes of the Employee dimension, Vacation Hours, does not serve them well in browses and reports, as the many different values are simply listed: the members are simply derived from the unique Vacation Hours values in the VacationHours column of the DimEmployee table that underlies the Analysis Services layer of the business intelligence solution. (The developer of the cube, having returned to his home country, is no longer available to assist the HR department in modifying the Vacation Hours presentation into a more useful format.)

We listen to the description of the problem, and then suggest grouping the Vacation Hours information via the pre-defined Automatic discretization method offered within Analysis Services. We suggest this as a starter approach, so that our client colleagues can see how grouping the values in this manner might make the data more meaningful in browsing / reporting. We tell them that, once we do this, information consumers who browse the Vacation Hours attribute hierarchy will see the names / values of the groups instead of the members themselves. This limits the number of levels that are presented to users, which can be less confusing, and more useful for analysis. (We inform the client representatives, too, that other methods of discretization can be leveraged if the Automatic method proves less than ideal – we are simply suggesting the Automatic method as a good starting point to illustrate an approach to meeting the end objective – an approach that we can easily “tweak” once our colleagues understand the general concept of discretization, and can then make intelligent choices with regard to specific nuances that they might deem desirable.

Browse the Existing, Ungrouped Members within the Attribute Hierarchy

We will begin our practice with the Automatic discretization method within the Vacation Hours attribute hierarchy of the Employee dimension.

1.  Within the Solution Explorer, right-click the Employee dimension (expand the Dimensions folder as necessary).

2.  Click Open on the context menu that appears, as shown in Illustration 3.

Opening the Dimension via the Dimension Designer
Illustration 3: Opening the Dimension via the Dimension Designer ...

The tabs of the Dimension Designer open. Let’s look at the current state of the attribute under consideration, Vacation Hours.

3.  Click the Browser tab.

4.  Select Vacation Hours within the dropdown Hierarchy selector atop the Browse tab, as depicted in Illustration 4.

Select the Vacation Hours Hierarchy to Browse
Illustration 4: Select the Vacation Hours Hierarchy to Browse ...

The attributes belonging to the Vacation Hours attribute hierarchy appear as partially shown in Illustration 5.

The Member Attributes, Vacation Hours Attribute Hierarchy of the Employee Dimension
Illustration 5: The Member Attributes, Vacation Hours Attribute Hierarchy of the Employee Dimension

We note that many contiguous values appear. It is easy to see how grouping these values into logical “buckets” might make the data easier for information consumers to analyze.

Add Automatic Discretization within the Attribute Hierarchy

We will enact the Automatic discretization method from the Dimension Structure tab of the Employee dimension.

1. Click the Dimension Structure tab.

2. Within the Attributes pane, select Vacation Hours (the bottom attribute hierarchy in the pane).

The Properties window for the attribute appears (by default in the bottom right corner of the design environment), as depicted in Illustration 6.

The Properties Window for the Vacation Hours Attribute
Illustration 6: The Properties Window for the Vacation Hours Attribute

3. In the Properties window, click the setting box to the right of the DiscretizationMethod property (which currently contains the word “None”), to enable the selector on the right edge of the box.

4. Select Automatic within the selector, as shown in Illustration 7.

Changing the DiscretizationMethod Property Setting to “Automatic” ...
Illustration 7: Changing the DiscretizationMethod Property Setting to “Automatic” ...

The DiscretizationMethod property specifies the method used to group the members of the owner attribute. Once we tell Analysis Services the method of discretization, we can tell it how many “buckets” to create for purposes of grouping the attribute members.

5. In the setting box to the right of the DiscretizationBucketCount property (which currently contains the “0”), replace the value for the DiscretizationBucketCount property with the number 10.

The DiscretizationBucketCount property specifies the number of buckets in which to discretize members of the attribute to which the property belongs. The default setting is “0”. When we leave the DiscretizationBucketCount property at default, Analysis Services generates the number of groups independently, after sampling the underlying data involved.

The affected portion of the Properties window for the Vacation Hours attribute hierarchy appears as depicted (modifications surrounded by the red box) in Illustration 8.

Properties Window for the Vacation Hours Attribute Hierarchy, with Modifications
Illustration 8: Properties Window for the Vacation Hours Attribute Hierarchy, with Modifications

We are now ready to process the Analysis Services database, and then to examine the results of our handiwork with the browsers, as before.

Process the Analysis Services Database to Update Structure for Our Discretization Settings

We will process the Analysis Services database within which we have been working, and then browse the Vacation Hours attribute members to confirm that we have provided preliminary answers to the client’s stated requirements.

1. Right-click the Analysis Services project atop the tree in the Solution Explorer.

2. Select Process ... from the context menu that appears, as shown in Illustration 9.

Select Process ... to Process the Analysis Services Project
Illustration 9: Select Process ... to Process the Analysis Services Project

3. Click Yes on the dialog asking if you would like to save changes, which appears as depicted in Illustration 10.

Click Yes to Save All Changes before Processing
Illustration 10: Click Yes to Save All Changes before Processing

Information updates on the server, and then the Process Database dialog appears, as shown in Illustration 11.

The Process Database Dialog Appears ...
Illustration 11: The Process Database Dialog Appears ...

4. Click the Run button on the dialog.

The Process Progress viewer appears, and generates periodic status updates for various processing events. When processing is complete, we see a Process succeeded message appear in the Status bar in the lower part of the viewer, as depicted in Illustration 12.

Process Succeeded Message Appears ...
Illustration 12: Process Succeeded Message Appears ...

5. Click the Close button on the Process Progress viewer to dismiss the viewer.

6. Click the Close button on the Process Database dialog to dismiss the dialog.

We are now ready to return to the Dimension Designer browser to examine the results of our handiwork.

Browse the Newly Discretized Attribute with the Dimension Browser

1. Click the Browser tab in the Dimension Designer.

2. Ensure that Vacation Hours remains selected in the Hierarchy selector atop the Browser tab.

3. Click the Reconnect button atop the tab, as shown in Illustration 13.

Partial Browser View - before Reconnecting
Illustration 13: Partial Browser View - before Reconnecting

The browser details update, and (assuming the All Employees level remains expanded in the browser), we see ten groups appear, as depicted in Illustration 14.

The Discretized Attribute Member’s Groups
Illustration 14: The Discretized Attribute Member’s Groups

We see the groups appear, as expected. Ten groups have been created, based upon the algorithm selected by the Automatic discretization method.

4.  Leave the Employee Dimension Designer open for a later section.

Browse the Newly Discretized Attribute with the Cube Browser

Let’s go one step further and examine the results of our selection of the Automatic discretization method from another practical perspective, that of the cube browser. This will give us an appreciation for the improvements seen by the information consumers in querying / analyzing from the affected data.

1.  Within the Solution Explorer, once again, right-click the Basic cube (expand the Cubes folder as necessary).

2.  Click Open on the context menu that appears, as shown in Illustration 15.

Opening the Cube via the Cube Designer ...
Illustration 15: Opening the Cube via the Cube Designer ...

The tabs of the Cube Designer open, and we arrive, by default, at the Cube Structure tab.

3.  Click the Browser tab.

4.  Click the Reconnect button atop the tab.

5.  In the Metadata pane, expand the Employee dimension by clicking the “+” sign to its immediate left.

6.  Expand the newly exposed Organization folder.

7.  Right-click the Vacation Hours attribute within the expanded Organization folder.

8.  Select Add to Row Area from the context menu that appears, as depicted in Illustration 16.

Adding the Vacation Hours Attribute to the Browser Row Area ...
Illustration 16: Adding the Vacation Hours Attribute to the Browser Row Area ...

We see all ten Vacation Hours buckets appear in the rows of the browser pane.

9.  Click and drag the Employee Name attribute to the immediate right of the physical column containing the newly placed Vacation Hours buckets (a line will form at the drop point), juxtaposing the Employee Names on rows to the immediate right of the Vacation Hours, as shown in Illustration 17.

Juxtaposing the Employee Name alongside the Vacation Hours in Rows
Illustration 17: Juxtaposing the Employee Name alongside the Vacation Hours in Rows

All ten Vacation Hours buckets continue to appear in the rows of the browser pane – with “+” sign “expand” buttons appearing to the immediate left of the bucket labels.

10.  Expand the 70-79 and the 90-99 Vacation Hours buckets by clicking the “+” sign to its immediate left of each label.

The two Vacation Hours buckets expand, revealing lists of the employee members whose total Vacation Hours place them within the respective buckets in which they appear, as partially depicted in Illustration 18.

Select Vacation Hours Buckets, Expanded to Show Membership (Partial View)
Illustration 18: Select Vacation Hours Buckets, Expanded to Show Membership (Partial View)

In what is but one example of how we can use the Vacation Hours buckets, we can see lists of employees that have Vacation Hours on the books for each of the ten buckets we have created via Automatic discretization. We tell our client colleagues that they might use this arrangement to do far more than present lists of the members of the various strata. They might also flesh out the browser with other dimension members (such as Calendar Years, etc.), drop in various measures (such as Reseller Sales and the like), and perform analysis (as another simple example) upon employee balances of “unused vacation” when viewed within the perspective of the sales figures attributed to those employees.

Procedure: Employ the Naming Template to Support Custom Attribute Group Naming for Attribute Members in Analysis Services

The client representatives with whom we are working state that they are pleased with the overall results, but then they make one further request: they tell us that they would prefer that each of the “buckets” we have created to group the Vacation Hours be assigned a more explicit label. They tell us that they would rather that the label be “beefed up” to include more explanatory verbiage – in accordance the following example:

Instead of:

“0 - 8”

they would rather see:

“8 & Under”

and, for all members of the “intermediate ranges,” instead of, say,

“9 - 19”

They would rather see:

“Between 9 & 19 hours (incl)”.

Moreover, instead of

“90 - 99”

they would rather see:

“90 & Over”

We inform our client colleagues that Analysis Services generates member group names automatically when it creates the groups, based upon a default naming template, which is used unless we specify another naming template within the Format setting for the NameColumn property of an attribute, as we shall see. A separate naming template can be defined for each language specified in the Translations collection of the respective column underlying the NameColumn property of the attribute.

The Format setting uses the following string expression to define the naming template:

<Naming template> ::= <First definition> [;<Intermediate definition>;<Last definition>]

<First definition> ::= <Name expression>

<Intermediate defintion> ::= <Name expression>

<Last definition> ::= <Name expression>

The <First definition> parameter applies only to the first (or only, if there is only one) member group generated by the discretization method. If the optional parameters <Intermediate definition> and <Last definition> are not provided, the <First definition> parameter is used for all member groups generated for that attribute.

The <Last definition> parameter applies only to the last member group generated by the discretization method. The <Intermediate bucket name> parameter applies to every member group other than the first or last member group generated by the discretization method. (If two or fewer member groups are generated, this parameter is ignored.) The <Bucket name> parameter is a string expression that can incorporate a set of variables to represent member or member group information as part of the name of the member group, as presented in Table 1 below:

Variable

Description

%{First bucket member}

The member name of the first member to be included in the current member group.

%{Last bucket member}

The member name of the last member to be included in the current member group.

%{Previous bucket last member}

The member name of the last member assigned to the previous member group.

%{Next bucket first member}

The member name of the first member to be assigned to the next member group.

%{Bucket Min}

The minimum value of the members to be assigned to the current member group.

%{Bucket Max}

The maximum value of the members to be assigned to the current member group.

%{Previous Bucket Max}

The maximum value of the members to be assigned to the previous member group.

%{Next Bucket Min}

The minimum value of the members to be assigned to the next member group.


Table 1: Set of Variables Incorporated by the <Bucket Name> Parameter

Use the Naming Template to Support Custom Attribute Group Naming

Let’s put what we have learned to work to help our client colleagues reach their objectives in providing more descriptive names for Employee Vacation Hours.

1.  Click the Employee.dim tab to return to the Dimension Designer for the Employee dimension.

2.  Click the Dimension Structure tab, if necessary, within the Dimension Designer to expose, once again, the Attributes pane, as we did earlier.

3.  Click the Vacation Hours attribute, once again.

4.  In the Properties pane, scroll to the Source section, focusing upon the KeyColumns and NameColumn settings, as shown in Illustration 19.

KeyColumns and NameColumn Settings for the Vacation Hours Attribute
Illustration 19: KeyColumns and NameColumn Settings for the Vacation Hours Attribute

We note that the NameColumn setting is empty, leaving us to conclude that the member names are being derived from the KeyColumns setting.

5.  Click the ellipses (“ ... ”) button to the immediate right of the KeyColumns property setting box, as depicted (encircled in red) in Illustration 20.

Click the Ellipses (“ ... “) Button to the Right of the KeyColumns Property
Illustration 20: Click the Ellipses (“ ... “) Button to the Right of the KeyColumns Property

The DataItem Collection Editor appears.

6.  Type (or cut and paste) the following into the Format property.

%{Last Bucket member} & Under; Between %{First Bucket member} & %{Last Bucket member} (incl); %{First Bucket member} & Over

An example of the intended output of each of the three delimited sections in our syntax above is presented in Table 2 below.

Syntax Section

Example Intended Output

%{Last Bucket member} & Under

“8 & Under”

Between %{First Bucket member} & %{Last Bucket member} (incl)

“Between 9 & 19 hours (incl)”

%{First Bucket member} & Over

“90 & Over”


Table 2: The Three Sections of Our Syntax and their Intended Output ...

Our syntax partially appears within the Format property box of the DataItem Collection Editor as shown (modifications surrounded by the red box) in Illustration 21.

Our Syntax with the Format Property of the DataItem Collection Editor (Partial View)
Illustration 21: Our Syntax with the Format Property of the DataItem Collection Editor (Partial View)

7. Click OK on the DataItem Collection Editor to save our input and to dismiss the Editor.

We are now ready to process the Analysis Services database, and then to examine the results of our handiwork with the browsers, as before.

Process the Analysis Services Database to Update for Our Custom Attribute Group Names

We will process the Analysis Services database within which we have been working, as we did earlier, and then browse the Vacation Hours attribute members, once again to confirm that we have adequately provided support for the custom naming requirements that our client colleagues have requested.

1. Right-click the Analysis Services project atop the tree in the Solution Explorer, as we did earlier.

2. Select Process ... from the context menu that appears, once again.

3. Click Yes on the dialog asking if we would like to save changes, as before.

Information updates on the server, and then the Process Database dialog appears.

4. Click the Run button on the dialog, once again.

The Process Progress viewer appears, and generates periodic status updates for various processing events, as we noted earlier. When processing is complete, we see a Process succeeded message appear in the Status bar in the lower part of the viewer, once again.

5. Click the Close button on the Process Progress viewer to dismiss the viewer.

6. Click the Close button on the Process Database dialog to dismiss the dialog.

We are now ready to return to the Dimension Designer browser to examine the results of our handiwork.

Browse the Newly Named Attribute Groups with the Dimension Browser

1. Click the Browser tab in the Dimension Designer.

2. Ensure that Vacation Hours remains selected in the Hierarchy selector atop the Browser tab.

3. Click the Reconnect button atop the tab, as we did in the earlier section.

The browser details update, and (assuming the All Employees level remains expanded in the browser), we see ten groups appear, as depicted in Illustration 22.

The Custom-Named, Discretized Attribute Member Groups
Illustration 22: The Custom-Named, Discretized Attribute Member Groups

We see the groups appear, as expected. Moreover, we note that the ten groups created by Analysis Services via the Automatic Discretization method, display the names in the format requested by the client representatives with which we have been working.

4.  Close the Employee Dimension Designer, as desired.

Browse the Newly Named Attribute Groups with the Cube Browser

Let’s go one step further and examine the results of use of the attributes group naming template from another practical perspective, that of the cube browser. As we noted in the similar browse of the earlier section, this will give us an appreciation for the improvements seen by the information consumers in querying / analyzing from the affected data.

1.  Click the Browser tab within the Cube Designer, once again.

2.  Click the Reconnect button atop the tab, as before.

3.  In the Metadata pane, expand the Employee dimension by clicking the “+” sign to its immediate left.

4.  Expand the newly exposed Organization folder.

5.  Right-click the Vacation Hours attribute within the expanded Organization folder.

6.  Select Add to Row Area from the context menu that appears, as we did in the earlier section.

We see all ten Vacation Hours buckets, with new custom labeling, appear in the rows of the browser pane.

7.  Click and drag the Employee Name attribute to the immediate right of the physical column containing the newly placed Vacation Hours buckets, as we did earlier, juxtaposing the Employee Names on rows to the immediate right of the Vacation Hours, once again.

All ten Vacation Hours buckets continue to appear in the rows of the browser pane – with “+” sign “expand” buttons appearing to the immediate left of the bucket labels.

8.  Expand the Between 70 & 79 (incl) and the 90 & Over Vacation Hours buckets by clicking the “+” sign to its immediate left of each custom label.

The two Vacation Hours buckets expand, revealing lists of the employee members whose total Vacation Hours place them within the respective buckets in which they appear, as partially shown in Illustration 23.

Select Vacation Hours Buckets, Expanded to Show Membership
Illustration 23: Select Vacation Hours Buckets, Expanded to Show Membership

We again emphasize to our client colleagues that they might use this arrangement to do far more than present lists of the members of the various strata. We also note that the naming template can be modified to meet similar, or even somewhat different, custom naming needs for the attribute groups they create via discretization, and encourage them to experiment further with the procedures we have examined together.

Having demonstrated the potential effects that we can achieve using the naming template, in conjunction with Automatic discretization (but certainly available to the other discretization methods, as well), we turn the development environment over to the client representatives with which we have worked. Our colleagues express satisfaction with our efforts, and state that they grasp the concepts adequately to leverage the naming template anytime they apply any of the discretization methods offered within Analysis Services to other attributes within their cubes.

9.  Experiment further within the cube browser, as desired.

10.  Select File -> Exit to leave the design environment, when ready, and to close the Business Intelligence Development Studio.

Conclusion

In this article, we continued our exploration of discretization in Analysis Services, this time with the objective of introducing, and gaining some hands-on exposure to employing, custom naming for the attribute groups created within the various discretization methods. Our focus was to go beyond the First Group Member – Last Group Member default that Analysis Services uses in creating group labels during discretization, and enacting the generation of custom labels for our groups.

We first discussed the discretization options that are available (referring to individual articles covering each within this Database Journal subseries), before choosing to work with Automatic discretization in the sample cube, to serve as a basis for meeting the custom naming requirements of a hypothetical client. Our examination included a brief, general review of attribute discretization in Analysis Services, potential benefits that accrue from discretization in our UDMs, and how the process can help us to meet the primary objectives of business intelligence. We performed an overview of the multiple pre-defined discretization processes supported within the Analysis Services UDM. We then began our practice session with an inspection, via the browser in the Dimension Designer, of the contiguous members of a select attribute hierarchy, noting the absence of grouping and discussing shortcomings of this default arrangement.

Next, we enabled the Automatic discretization method within the dimension attribute Properties pane. We then reprocessed the sample cube with which we were working to enact the new Automatic discretization of the select attribute members. Finally, we performed further inspections, via the Dimension Designer and Cube Designer browsers, of the members of the attribute hierarchy involved in the request for assistance by our hypothetical client, noting the new, more intuitive grouping (and the default naming of the attribute groups) established by the most recent use of the Automatic discretization method.

Once we had a discretized the attribute members and examined the default naming of the resulting groups, we discussed the naming template supplied with Analysis Services, describing its parts and their uses. We next modified the template to meet the naming requirements requested by our hypothetical client, and placed the syntax within the Format property of the affected attribute. We then reprocessed the cube to enact the new naming template we had supplied, within the context of Automatic discretization of the selected attribute members. Finally, we performed a final examination, via the browsers in both the Dimension Designer and the Cube Designer, of the members of a select attribute hierarchy, noting the new, more intuitive grouping, together with more user-friendly labels, established by the newly enacted Automatic discretization method.

About the MSSQL Server Analysis Services Series

This article is a member of the series Introduction to MSSQL Server Analysis Services. The series is designed to provide hands-on application of the fundamentals of MS SQL Server Analysis Services (“Analysis Services”), with each installment progressively presenting features and techniques designed to meet specific real-world needs. For more information on the series, please see my initial article, Creating Our First Cube. For the software components, samples and tools needed to complete the hands-on portions of this article, see Usage-Based Optimization in Analysis Services 2005, another article within this series.

» See All Articles by Columnist William E. Pearson, III

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved