Is Your Database Healthy?

Despite the sophistication of the latest DB2 software versions and the power of current IBM z/server technology, it is still possible for performance and data availability to deteriorate due to a variety of things, including increased dataset extents, loss of clustering, index page splits, and other factors.

This article presents simple SQL statements*  that the database administrator (DBA) can execute against the DB2 catalog to determine if one or more application databases suffer from common maladies, and what the DBA can do to fix or mitigate potential problems.

Application Table and Tablespace Performance

In DB2 for z/OS, tablespaces are collections of physical datasets that contain table rows.  As table rows are inserted, updated, and deleted the internal structure of the tablespace can become disorganized, increasing unused space, lengthening row retrieval elapsed times, and increasing CPU used.  Use the following queries to find specific issues.

  • Tablespace partitions with relocated rows. When an application updates a row, DB2 attempts to write it back to the same location in the tablespace. However, sometimes no space is available; for example, if the row length has increased. In these cases, DB2 inserts the row somewhere else in the tablespace, and replaces the original row location with a pointer to the new row location. As this happens to more and more rows, the pointers take up more and more space. In addition, rows may no longer be stored in clustering sequence.  Address this issue by executing the Reorg utility on the tablespace.
    SELECT  TPT.DBNAME, TPT.TSNAME   
    ,SUBSTR(DIGITS(TPT.PARTITION),4,2)    AS P#       
    ,DECIMAL(TPT.CARDF,11,0)          AS CARD    
    ,'FARINDREF OVER 5%    (FARINDREF, % OF CARD)'   
                                     AS DESCRIPTION        
    ,TPT.FARINDREF  
    ,DECIMAL(TPT.FARINDREF * 100. / TPT.CARDF,11,0)   
             AS  PCT_OF_CARD       
    ,DATE(TPT.STATSTIME)  AS STATS_DATE  
    FROM    SYSIBM.SYSTABLEPART   TPT         
    WHERE  TPT.DBNAME NOT LIKE 'SYS%'        
       AND TPT.DBNAME NOT LIKE 'DSN%'    
    AND  (TPT.FARINDREF * 20 > TPT.CARD)  -- OVER 5%  
    ORDER BY  7 DESC, 1, 2  ;    
  • Tablespaces that are poorly compressed.  The Compress option specifies that DB2 is to use a common data compression algorithm to store table rows. In general, textual data can benefit greatly from this, with compression rates as high as 75% or greater.  However, there is a cost: as rows are stored or retrieved, DB2 must compress or decompress the rows, and this costs CPU cycles. The following query determines if a tablespace is poorly compressed (the criteria used is below 10% compression).  If you have tablespaces like this, consider removing the Compress option and running the Reorg utility, since table access is costing CPU cycles without significant space gains.
    SELECT  TPT.DBNAME, TPT.TSNAME, TPT.PARTITION    AS P#   
      ,DECIMAL(TPT.CARDF,11,0)  
      ,TPT.PAGESAVE     AS SAVED      
      FROM    SYSIBM.SYSTABLEPART  TPT  
      WHERE  TPT.DBNAME NOT LIKE 'SYS%'      
        AND TPT.DBNAME NOT LIKE 'DSN%'  
        AND TPT.COMPRESS = 'Y'    
        AND TPT.PAGESAVE < 10    
    ORDER BY  5, 1, 2, 3 ;
  • Tablespaces without recent backups. Database administrators (DBAs) back up tablespaces using the Image Copy utility. This is done for several reasons: to provide point-in-time recovery data if a rogue application overlays or changes data; to serve as a restore point if an application fails and must be re-run with the tablespace in its original condition; and as a contingency in case of a disaster. Any tablespaces without backups, or without recent backups, create potential recovery issues. The following query lists all tablespaces without backups within the last 30 days.  Analyze this list, and update regular backup procedures to include any tablespaces that may require recovery.
    SELECT  TPT.DBNAME, TPT.TSNAME, TPT.PARTITION  
    FROM  SYSIBM.SYSTABLEPART   TPT   
    WHERE  TPT.DBNAME NOT LIKE 'DSN%'    
     AND NOT EXISTS  
    (SELECT  1  FROM  SYSIBM.SYSCOPY   CPY   
      WHERE  CPY.DBNAME = TPT.DBNAME      
       AND CPY.TSNAME = TPT.TSNAME      
       AND  (    CPY.DSNUM = TPT.PARTITION       
             OR  CPY.DSNUM = 0)   
       AND  CPY.ICTYPE IN ('F', 'I')    
       AND  CPY.TIMESTAMP >  (CURRENT TIMESTAMP - 30 DAYS)   )      
    ORDER BY  1, 2, 3     

Application Index Performance

In DB2, table indexes are used for multiple reasons:

  • To support Primary Keys;
  • To support uniqueness constraints;
  • To provide a mechanism for clustering table rows in order by key value; and
  • To serve as a high-performance access path to keyed data.

It is the third and fourth reasons that we will concentrate on.  Many applications issue queries for single rows or small numbers of rows that contain the same key value.  Examples include: an on-line order entry program that accesses a product table row by its product number; a payroll application that updates an employee table row based on the employee id; and a health care app that accesses a patient’s current treatments based on a medical id number.

Indexes can lose their performance edge by frequent use, lack of good design, or lack of frequent reorganization. Here are a few queries to help you find possible low-performance indexes.

  • Clustering indexes with poor clustering. A clustering index is used to assist DB2 in maintaining table rows in physical sequence by a column value. For example, a customer table may benefit from having rows stored physically ascending by customer number. DB2 will not attempt to maintain this clustering without a clustering index. In addition, there may be times when DB2 cannot keep this sequence. For example, DB2 may attempt to store a row in a preferred physical location but no free space exists for that row. As time goes on, tables become less and less clustered, leading to increased query elapsed times.  Use the following query to identify tables currently having a poor clustering percentage.  Execute the Reorg utility on the table to return it to full clustering.
    SELECT  IDX.DBNAME, IDX.NAME, IDX.TBNAME     
    ,DECIMAL(IDX.FULLKEYCARDF,11,0)  AS  FULLKEYCARD    
    ,'CL. IDX, POOR CL% (CL-RATIO)'  AS DESCRIPTION  
    ,IDX.CLUSTERRATIO     
    ,DATE(IDX.STATSTIME) AS STATS_DATE      
    FROM   SYSIBM.SYSINDEXES     IDX   
    WHERE  IDX.DBNAME NOT LIKE 'SYS%'  
      AND IDX.DBNAME NOT LIKE 'DSN%'    
      AND IDX.CLUSTERING = 'Y'    
      AND IDX.CLUSTERRATIO  < 66  
    ORDER BY 6, 1, 2      
  • Indexes with poor cardinality.  Indexes work best when they contain entries for unique rows. However, sometimes indexes are not designed in this way, and instead contain duplicate entries. Consider a customer address table with a column called State. An index on this column only would not generally be useful, as there are a relatively small number of states compared to the large number of customers. This can be generalized as follows: any index where each key entry consists of an average of 100 duplicates is probably not very useful.  Use the following query to determine these indexes. Address the issue by re-defining the indexes to contain a larger cardinality of values.
    SELECT TBL.DBNAME, TBL.TSNAME, TBL.NAME, IDX.NAME       
    ,DECIMAL(IDX.FULLKEYCARDF,11,0)  AS IDX_FKYCRD      
    ,DECIMAL(TBL.CARDF       ,11,0)  AS TBL_CARD  
    FROM  SYSIBM.SYSTABLES      TBL   
         ,SYSIBM.SYSINDEXES     IDX  
    WHERE  TBL.DBNAME  = IDX.DBNAME    
       AND TBL.NAME    = IDX.TBNAME      
       AND TBL.CREATOR = IDX.TBCREATOR   
       AND TBL.DBNAME NOT LIKE 'SYS%'    
       AND TBL.DBNAME NOT LIKE 'DSN%'    
       AND TBL.TYPE = 'T'    
    AND  IDX.FULLKEYCARDF * 100  <  TBL.CARDF     
    ORDER BY 5      
  • Proliferation of Indexes.  While indexes are frequently used for performance enhancements, having a large number of indexes on a single table may indicate one or more of the following:  poor index design, especially if two or more indexes are identical; poor index design if two indexes share the same initial columns (Index #1 contains columns A and B, while Index #2 contains columns A, B and C); potential table maintenance issues, as a reorganization of the table usually forces reorganization of all indexes, thus extending Reorg utility elapsed time; and potentially elongated recovery time, since recovering a table after a failure or a disaster also requires recovering or rebuilding all the indexes. Use the following query to list tables having more than seven indexes. Use this list to analyze the noted tables and their indexes for possible issues.
    SELECT  TBL.CREATOR , TBL.NAME , COUNT(*)   AS #_IDX    
    FROM  SYSIBM.SYSTABLES  TBL       
         ,SYSIBM.SYSINDEXES IDX      
    WHERE  TBL.TYPE = 'T'  
     AND  TBL.NAME    = IDX.TBNAME     
     AND  TBL.CREATOR = IDX.TBCREATOR  
  
     AND TBL.DBNAME NOT LIKE 'DSN%'    
     AND TBL.DBNAME NOT LIKE 'SYS%'    
    GROUP BY TBL.CREATOR, TBL.NAME       
    HAVING  COUNT(*) > 7    
    ORDER BY  3 DESC      

Summary

As application tables and indexes age, their internal structures may change due to frequent insertion, update, and deletion of rows.  Use the queries shown above to review the health of your application databases and review the results.  Once you have determined and taken the appropriate actions, consider running these queries on a regular basis. As problems reoccur, the queries will pinpoint tables and indexes that may require maintenance more frequently, as well as the health of newly-created objects. Finally, consider giving these queries to developers to use in the non-production environments. This will allow them to analyze their databases during development and testing, and will give the DBAs prior warning of some potential issues.

* Disclaimer. This information is based on research, techniques, and documentation available at the time of creation. It is therefore subject to change at any time without notice or warning. The following queries are presented “as-is”, without any warranty either expressed on implied.  Please also note that these queries access DB2 catalog tables containing performance and data distribution statistics. It is assumed that the DB2 support staff has procedures in place (such as the RunStats utility) to keep these statistics up-to-date.

See all articles by Lockwood Lyon

Lockwood Lyon
Lockwood Lyon
Lockwood Lyon is a systems and database performance specialist. He has more than 20 years of experience in IT as a database administrator, systems analyst, manager, and consultant. Most recently, he has spent time on DB2 subsystem installation and performance tuning. He is also the author of The MIS Manager's Guide to Performance Appraisal (McGraw-Hill, 1993).

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles