A Fresh Look at Data Striping in DB2 Universal Database Version 8.1

Tuesday Jan 20th 2004 by DatabaseJournal.com Staff
Share:

A key concurrency-enabling feature, introduced in version 8.1 of DB2 UDB was the ability to start a new data stripe on the underlying storage units. In this article, we introduce you to this new feature and the problems that it solves, and take you through a step-by-step example of how it works.

By Paul C. Zikopoulos and Roman B. Melnyk

The Version 8.1 release of IBM DB2 Universal Database (DB2 UDB) late last year came with the ability to start a new data stripe on the underlying storage units. In this article, we introduce you to this new feature and the problems that it solves, and take you through a step-by-step example of how it works.

Mapping Data to Storage 101

Before we discuss this new enhancement, let's review the basics of the DB2 UDB storage model. Although we assume that you have a good handle on this topic already, a quick summary might be helpful.

To be useful, the data in a database must be persistent - that is the whole point of data management. The DB2 UDB object that is used to map the logical schema of the database to disk is called a table space. (If you come from a Microsoft SQL Server background, you would call this object a data file.) A table space is mapped to storage objects using containers. For example, a file, directory or raw device can act as a container for a DB2 UDB table space.

DB2 UDB has two types of table spaces: system-managed space (SMS) table spaces and database-managed space (DMS) table spaces.

  • SMS table spaces are managed by the operating system and, for the most part, allocate and deallocate space as needed. This type of table space uses a directory on the operating system to store database objects.
  • DMS table spaces are managed by the database manager. The storage objects that are attached to a DMS table space can be files or raw devices. DMS table spaces come with more management overhead, because you have to define the size of the containers.

Generally, those who want to keep the database schema and its maintenance as simple as possible favor SMS table spaces because their storage allocations are automatic and are handled by the underlying operating system. Those who are willing to add a little more complexity to their environment favor DMS table spaces, which have the following advantages:

  • They let you add new containers (underlying data storage objects) easily without the need for redirected restore operations.
  • They let you place specific database objects on specific containers such as, for example, placing indexes on faster disks, and rarely accessed large objects on slower disks.
  • They avoid operating system overhead when accessing storage objects; DMS table spaces are generally thought to deliver 15 to 30% better performance than their SMS counterparts.

Pages in a DMS table space are logically numbered from 0 to n-1, where n is the number of usable pages in the table space. Pages are grouped into extents, and an extent is a container space that is allocated to a single database object. If a DMS table space has several containers, and they are all of equal size, the first extent is located in the first container, the second extent is located in the second container, and so on. After the last container, the process repeats in a round-robin fashion, starting with the first container (see Figure 1). A layer of extents across the set of the containers is called a stripe. If all the stripes in a table space contain the same set of containers, the resulting configuration represents a single range.

 

Cont 0

Cont 1

Cont 2

Stripe 0

Ext 0

Ext 1

Ext 2

Stripe 1

Ext 3

Ext 4

Ext 5

Stripe 2

Ext 6

Ext 7

Ext 8

Stripe 3

Ext 9

Ext 10

Ext 11

Stripe 4

Ext 12

Ext 13

Ext 14

Stripe 5

Ext 15

Ext 16

Ext 17

Figure 1. A DMS table space with three containers of equal size. Extents are grouped into logical units called stripes. This is a single range table space.

Adding More Storage to a DMS Table Space

You can add more containers to a DMS table space when you need to increase the storage capacity of your database. When new containers are added to a table space, a rebalancing of the table space data might occur. Rebalancing is the moving of table space extents from one location to another in an attempt to keep the data striped (Figure 2). It is a good idea to spread your data across all the disks that have containers, thereby increasing the parallelism that you can get out of DB2 UDB. Even if you are not running in a symmetric multiprocessing (SMP) environment, DB2 UDB will take advantage of this parallelism.

Figure 2. In DB2 UDB Version 7.2, adding a new container to a table space could result in a rebalancing operation to keep the data evenly distributed among the containers.

The issue in DB2 UDB Version 7.2 was that when you added a new container to the table space (unless this was part of a redirected restore operation), the operation often forced a rebalancing of the data across all the containers (including the new one), such that the data was evenly distributed.

What's so bad about automatically rebalancing data? After all, the rebalance utility operates online. The truth is that rebalancing a large table space can take a very long time, and the process can have a significant negative impact on performance: in highly concurrent environments, the physical movement of data could cause delays that a database administrator may want to avoid. (We should point out that DB2 UDB Version 8.1.2 adds a throttling feature to the rebalance utility. Using this feature, you can scale back a rebalancing operation so that it will not impact business operations during peak times, and allow it to run at "full throttle" during off-hours.)

In DB2 UDB Version 8.1, you can add new containers to a DMS table space without rebalancing. This is done through the creation of stripe sets. Each stripe set is independent of the other stripe sets in the table space. Because all of the table space data can be found in the existing stripe sets, adding a new stripe set does not require rebalancing (see Figure 3).

 

C0

C1

C2

C3

C4

S0

E

E

E


<- Stripe set 0

S1

E

E

E

S2

E

E

E

S3



Stripe set 1 ->

E

E

S4

E

E

S5

E

E

S6

E

E

S7

E

E

Figure 3. Adding containers to a DMS table space without rebalancing.

A table space starts with three containers (C0, C1, and C2) that are each three extents in size (stripe set 0, shaded blue). Two new containers (C3 and C4) are added using the BEGIN NEW STRIPE SET option on the ALTER TABLESPACE statement. These new containers are each five extents in size (stripe set 1, shaded yellow). The stripe sets are independent of each another.

By Paul C. Zikopoulos and Roman B. Melnyk

The BEGIN NEW STRIPE SET Option

DB2 UDB Version 8.1 introduces the BEGIN NEW STRIPE SET option on the ALTER TABLESPACE statement (Figure 4).

Click for larger image

Figure 4. The BEGIN NEW STRIPE SET option on the ALTER TABLESPACE statement allows you to add new containers to a DMS table space without the need for rebalancing the containers.

The easiest way to do this is through the Control Center, which gives you access to database objects, including table spaces. Figure 5 shows a DMS table space called TS1 that we created in the SAMPLE database. All of the table spaces in the SAMPLE database appear in the contents pane with Table Spaces selected in the object tree.

Click for larger image

Figure 5. The table spaces that are associated with a particular database appear in the contents pane of the Control Center with Table Spaces selected in the object tree.

Clicking Alter in the pop-up menu launches the Alter Table Space window (Figure 6).

Figure 6. The Containers tab of the Alter Table Space window. Selecting the Manage stripe sets check box adds a Stripe Set column to the window.

Clicking the Add button launches the Define Container window (Figure 7). If the Manage stripe sets check box in the Alter Table Space window was selected, you will see a Stripe set drop-down list in the Define Container window, from which you can select a new (or existing) stripe set for the new container.

Figure 7. The Define Container window lets you specify a new container for a DMS table space. You can associate this new container with an existing stripe set, or with a new one, which is equivalent to specifying the BEGIN NEW STRIPE SET clause on the ALTER TABLESPACE statement.

The following code shows the underlying SQL statements that defined the new container TSC2 and stripe set 1:

db2 connect to sample
db2 alter tablespace ts1 begin new stripe set (file 'D:\WorkDir\tsc2' 5120)
db2 connect reset

If you were to add another container (TSC3, for example) to the table space without specifying the BEGIN NEW STRIPE SET option, TSC3 would become part of the new stripe set (1). In addition, if you subsequently added container TSC4, this time specifying the BEGIN NEW STRIPE SET option, a new stripe set (2) would be created, and TSC4 would be part of stripe set 2.

You can add a container to an existing stripe set by specifying the ADD option, along with the TO STRIPE SET clause, as follows:

db2 connect to sample
db2 alter tablespace ts1 add to stripe set 0 (file 'D:\WorkDir\tsc5' 512)
db2 connect reset

If you do not specify the TO STRIPE SET clause, the current (that is, the most recently created) stripe set is assumed.

Conclusion

DB2 UDB Version 8.1 introduces the BEGIN NEW STRIPE SET option on the ALTER TABLESPACE statement. This option allows you to add more storage to DMS table spaces without the need for rebalancing containers.

About the Authors

Paul C. Zikopoulos, BA, MBA, is with IBM Canada Ltd.  Paul has written numerous magazine
articles and books about DB2. Paul has co-authored the books: DB2 Version 8:
The Official Guide, DB2: The Complete Reference, DB2 Fundamentals
Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases
on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and
Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence
and Database Administration). You can reach him at: paulz_ibm at msn.com.

Roman B. Melnyk, PhD, is with IBM Canada Ltd., specializing in database administration, DB2
utilities, and SQL. Roman has written numerous DB2 books, articles, and
other related materials. Roman co-authored DB2 Version 8: The Official
Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for
Dummies, and DB2 for Dummies. You can reach him at roman_b_melnyk at
hotmail.com.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved