Saved by the APAR or How to Use IBM's Self Help Web Support

Friday Jul 11th 2003 by Marin Komadina

IBM's Self Help Web Support is a great starting point for finding answers about DB2 database software. Marin Komadina shares tips on getting the most out of this resource.

So, what exactly does a DBA need to know to be able to solve all his problems?

The complexity of a modern relational database goes beyond what a typical DBA can keep in his head. After looking through dozens of manuals, books and web pages, you may still wonder, "Where else can I search?" Answers to questions about products, installation procedures, configuration, functionality, technical issues, trends and directions can be found in many places. The following resources are available to help you:

  • Frequently asked questions (FAQs) - answers for the most frequently asked questions
  • Hints & Tips - installation and support information
  • Technotes - documented problems and solutions
  • Product information - various books, manuals and the IBM product documentation
  • Redbooks - technical manuals with detailed instructions
  • Whitepapers and up-to-date bulletins - in-depth documentation about products, with coding tips and techniques
  • Forums/Newsgroups - experience exchange between DB2 users
  • Software downloads - patches, fixes and other product downloads

In this article, I'll show how to use IBM Self Help Web Support, a great starting point for getting the right answer about DB2 database software.

This article covers:

  • FixPack, HotFix, APAR, Hiper APAR
  • IBM Self Help Web Support
  • Call from the Customer
  • Using the Self Help Web Support
  • Conclusion

FixPack, HotFix, APAR, Hiper APAR

IBM DB2 database software is just like any other software: it has bugs, including a certain number of defects in the code that are not yet known. While using the database software, customers and IBM support personnel often discover some of those defects. Many different resources with DB2 UDB database information exist on the Internet. Some of them are out of date and contain no useful information, while many others relate to host technology (zSeries, formerly S/390, IBM's mainframe line). IBM's official support Web site provides a very good, well-organized source of information. If the solution to a problem still cannot be found there, we can contact IBM customer support.

For a known problem with DB2 software code, IBM support can direct the customer to apply a new FixPack or HotFix, install a new database release, change database settings or change the application code. All communication with the customer will be logged in the Incident/Support Case Log.

A PMR (Problem Management Record) is opened on a case-by-case basis by IBM support for new, unidentified problems in the database software code. Every PMR gets a unique identification number (for example PMR 31045,025,722); all communication and support activity is handled under this number. A PMR is first investigated by the Support Team and then handed over to the Development Team. IBM specialists then try to isolate the error and identify the failing database software component. When the problem is reproducible and isolated, an APAR is opened.

An APAR (Authorized Program Analysis Report) is a named issue with an IBM program, and is opened after the customer or IBM support personnel discover a problem with the database software code. Every APAR has a unique identification number. As an example, I have summary information for APAR IY39922 below:

IY39922: CRASH WHEN LOADING INTO TABLE WITH GENERATED COLUMN(S) USING GENERATEDMISSING MODIFIER. 
Fixes are available 
DB2 Universal Database Version 8 FixPak 1
DB2 Universal Database Version 8 FixPak 2
APAR number IY39922 
Reported component name DB2 UDB ESE SOL 
Reported component ID 5765F4102 
Reported release 810 
Status CLOSED PER 
HIPER NoHIPER 
Submitted date 2003-01-29 
Closed date 2003-01-29 
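
A summary like the one above is easy to turn into a small record, which helps when tracking several APARs at once. The following is only a sketch: the list of field labels is copied from the sample above and is an assumption about which labels matter, not an official IBM schema, and parse_apar_summary is a hypothetical helper name.

```python
# Field labels as they appear in the IY39922 summary; an illustrative
# assumption, not an official IBM record layout.
FIELDS = [
    "APAR number",
    "Reported component name",
    "Reported component ID",
    "Reported release",
    "Status",
    "HIPER",
    "Submitted date",
    "Closed date",
]


def parse_apar_summary(text):
    """Map each known label to the value that follows it on the same line."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        # Try the longest labels first so a short label never shadows a long one.
        for label in sorted(FIELDS, key=len, reverse=True):
            if line.startswith(label + " "):
                record[label] = line[len(label):].strip()
                break
    return record


SAMPLE = """\
APAR number IY39922
Reported component name DB2 UDB ESE SOL
Reported component ID 5765F4102
Reported release 810
Status CLOSED PER
HIPER NoHIPER
Submitted date 2003-01-29
Closed date 2003-01-29
"""

summary = parse_apar_summary(SAMPLE)
# In the sample, the HIPER field reads "NoHIPER" for a non-HIPER APAR.
is_hiper = summary["HIPER"] != "NoHIPER"
print(summary["APAR number"], summary["Status"], is_hiper)
```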

After the IBM Development Team solves the problem defined in the APAR, the database software fix is delivered through a DB2 FixPack or HotFix.

HIPER APARs (HIgh Impact PERvasive APARs) are critical DB2 bugs of which all customers should be aware. Fixes for HIPER APARs are also delivered through DB2 FixPacks and HotFixes.

A FixPack is a software update for the main database release. IBM's "Acronyms & other terms" documentation defines it as "the means by which some products deliver service".

FixPacks are delivered separately for each database version and can be cumulative or separate. The number of problems solved in each FixPack is listed below:

DB2 FixPack Version     Number of APARs fixed     Number of HIPER APARs fixed
DB2 V7 FixPack 1        82                        20
DB2 V7 FixPack 2        289
DB2 V7 FixPack 3        371
DB2 V7 FixPack 4        319
DB2 V7 FixPack 5        224
DB2 V7 FixPack 6        171
DB2 V7 FixPack 7        220
DB2 V7 FixPack 8        273
DB2 V7 FixPack 9        283
DB2 V7 FixPack 10       269

DB2 V8 FixPack 1        154
DB2 V8 FixPack 2        286

A HotFix is a separate database fix, intended to solve a problem for a particular customer or a particular problem not yet covered by any FixPack or regular database release.

A short description of one HotFix:

APAR number IC32650 
Local fix  A hotfix has been sent to the customer.
Problem conclusion  Fix required
Temporary fix  Temp fix sent to customer and approved
Reported component name IM SCORING AIX 
Reported component ID 5724A6000 
Reported release 710 
Status CLOSED PER 
PE NoPE 
HIPER NoHIPER 
Fix information 
Fixed component name IM SCORING AIX 
Fixed component ID 5724A6000 
Applicable component levels 
R710 PSY IP22398 UP02/02/22 I 1000

IBM Self Help Web Support

Self Help Web Support is IBM's Web knowledge database with search capability over official IBM support web sites and knowledge databases. You can find the start page for the Self Help Support at the Web address: www.ibm.com/software/support.

You do not need to be a registered user to use this technical knowledge database.

Some of the articles are protected, and only registered users (users with a valid support key) can access them. On this Web site, we can search among closed APARs and software fixes. In my opinion, this should be the most important source of technical information for the everyday troubleshooting routines of any DB2 DBA. Although the IBM support model offers many options, this is the fastest way to get self-help.

Call from the Customer

We had a call from a customer stating that he had a problem with a production database. The database was a DB2 UDB V7.1 EEE on Sun Solaris, with two database partitions. The customer was in the middle of work when the application suddenly crashed, with trace information written to the database dump directory. Unix administrators reported that the machine had restarted due to a system error.

A check in the machine system log revealed more information about the crash:

$ cat /var/adm/messages
Jan 01 18:43:34 ARTIST0 SUNW,UltraSPARC-III+: [ID 266074 kern.warning] WARNING: [AFT1] 
Uncorrectable system bus (UE) Event detected by CPU2 
User Instruction Access at TL=0, errID 0x001c63a6.bf3009e0
Jan 01 18:43:34 ARTIST0 AFSR 0x00000004.00000131 AFAR 0x00000021.f081cb00
Jan 01 18:43:34 ARTIST0 Fault_PC 0x1cb00 Esynd 0x0131  /N0/SB1/P0/B1
Jan 01 18:43:34 ARTIST0 SUNW,UltraSPARC-III+: [ID 357985 kern.notice] [AFT1] errID 
0x001c63a6.bf3009e0 More than four Bits were in error and is fatal: will reboot

The machine had a hardware problem, and was rebooted. We expected that the database would need to be recovered due to the abnormal end of the db2 processes. In the database log, however, we could not find any information indicating that the database had restarted or crashed.

The hardware problem with the machine was soon solved, and the machine was up and running. A system check showed that there was no main DB2 system process, indicating that the DB2 database was not running. Diagnostic information from the database message log (db2diag.log) indicated a problem with the automatic database recovery procedure. Here are the extracted messages from the db2diag.log:

Crash Recovery is needed.      
-> Crash recovery was started
Crash recovery has been initiated.  Lowtran LSN is "0008DB961070", Minbuff LSN is 
"0008E02AC2CC".
Using parallel recovery with 3 agents 7 QSets 96 queues and 8 chunks
Forward phase of crash recovery has completed.  Next LSN is "0008E257133C". 
-> Rollforward finished 
2003-01-01-18.43.35.640690   Instance:db2inst1   Node:000
PID:10061(db2loggr 0)   Appid:none
data_protection  sqlpgarl   Probe:120 
Bp 13212000, blkOffSet 9911, ReadCount 81000 0000 0000 0034 ffff ffff 0000 0000       
.......4........                
0000 0000 0054 0050 0053 0005 0000 2001       .....T.P.S.... .                
4942 4d4c 4f47 0008 deff 0000 0000 0196       IBMLOG..........                
0000 2710 0000 26bc 3db5 4b2e 3d4a 7cec       ..'...&.=.K.=J|.                
3db5 4b2f 0000 0000 0000 0000 0000 0000       =.K/............                
0000 0000 0000 0000 0000 0000 0000 0000       ................                
0000 0000 0000 0000 0053 0000                 .........S..                
2003-01-01-18.43.35.783967   Instance:db2inst1   Node:000
PID:10061(db2loggr 0)   Appid:none
data_protection  sqlpgarl   Probe:120 
DIA3806C Unexpected end of file was reached.
ZRC=0xFFFFF609
-> Recovery canceled while reading log files
Error -2551 when reading LSN 0008 E16A B2C7 from log file S0000406.LOG
LSN being undone: 0008 e16a b2c7                                ...j..                
In-doubt transaction(s) exists at the end of crash recovery.
-> Transactions were not cleared  
Crash recovery completed. Return Code = "-2551"
-> Recovery finished unsuccessfully 
Recovery started on log file: 5330 3030 3034 3034 2e4c 4f47                 S0000404.LOG                
-> Last touched log file S0000404.LOG 
Restart failed with sqlcode: ffff fbee                                     ....                
Dirty BDS CB at agentActivationTerm! Correcting. 
BDS CB before cleanup = 1244 4620 0000 0002 
Marking the database bad.
-> Database marked as bad 

The recovery finished without success. The database was rolled forward and then marked bad, due to the "Unexpected end of file was reached" error.
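
The hexadecimal codes in the extract are the same numbers as the decimal ones, printed as 32-bit two's complement values. A short sketch of the conversion follows; to_signed32 is a hypothetical helper name, and the assumption is simply that both values are signed 32-bit integers.

```python
def to_signed32(hex_text):
    """Interpret a 32-bit hex value, as printed in db2diag.log, as a signed integer."""
    value = int(hex_text.replace(" ", ""), 16)
    # Values with the top bit set represent negative numbers in two's complement.
    if value >= 0x80000000:
        value -= 0x100000000
    return value


zrc = to_signed32("0xFFFFF609")     # from the line "ZRC=0xFFFFF609"
sqlcode = to_signed32("ffff fbee")  # from "Restart failed with sqlcode: ffff fbee"
print(zrc, sqlcode)                 # prints: -2551 -1042
```

The first value matches the "Error -2551" and "Return Code = -2551" lines in the log, so all three messages point to the same failure.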

Using the Self Help Web Support

After reading all of the available documentation and searching over the Internet for a similar problem, we could not find a solution. Every attempt to recover the database finished with a corrupted database. Finally, we decided to go to the IBM support Web site and search in the APAR database for a similar problem.

We copied an error message from the db2diag.log and pasted it into the search page of the IBM Self Help support site.
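
Picking the search terms by hand works, but the distinctive tokens can also be pulled out of a db2diag.log entry automatically. This is a minimal sketch under the assumption that the message ID, the function name in front of "Probe:" and the ZRC value are the most useful terms; search_terms is a hypothetical helper, and the patterns are derived only from the entry shown earlier.

```python
import re


def search_terms(diag_text):
    """Collect message IDs, function names and ZRC codes worth searching for."""
    found = []
    # Diagnostic message IDs such as DIA3806C.
    found += re.findall(r"\bDIA\d{4}[A-Z]\b", diag_text)
    # The word directly in front of "Probe:" is the reporting function.
    found += re.findall(r"\b(\w+)\s+Probe:\d+", diag_text)
    # Internal return codes such as ZRC=0xFFFFF609.
    found += re.findall(r"\bZRC=(0x[0-9A-Fa-f]{8})\b", diag_text)
    # Remove duplicates while keeping the order of first appearance.
    return list(dict.fromkeys(found))


ENTRY = """\
2003-01-01-18.43.35.783967   Instance:db2inst1   Node:000
PID:10061(db2loggr 0)   Appid:none
data_protection  sqlpgarl   Probe:120
DIA3806C Unexpected end of file was reached.
ZRC=0xFFFFF609
"""

query = " ".join(search_terms(ENTRY))
print(query)
```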

The first result from our search was document IY30334, whose header contained the same message that we had in our database log. Here is an extract:

IY30334: DB2 CRASHES AFTER 'SQLPGARL' GETS "DIA3806C UNEXPECTED END OF FILE WAS REACHED.", 
	ZRC=0XFFFFF609
A fix is available. DB2 Universal Database Version 7 fixpack 8.
APAR Status: Closed as program error. 
Error Description: 
At the time 'BACKUP ... ONLINE' or 'ARCHIVE ... LOG' DB2 may crash if the truncation is near 
the end of a log file.  This problem was introduced in v7.2 fixpak 6 - s020313 (by APAR 
IY26397) and will be resolved in the next fixpak.  This problem can be completely 
circumvented, and recovered from if it does occur.  This APAR does *not* result in corruption.
Without adjusting the LOGBUFSZ crash recovery will fail again with the same messages in the 
'db2diag.log'
Local Fix: 
CIRCUMVENTION: 'db2set DB2_DISABLE_FLUSH_LOG=ON' and do not use 'ARCHIVE LOG FOR DATABASE ...'
WORKAROUND: if the problem is encountered, set the database configuration parameter LOGBUFSZ 
<= 16 then DB2 will not take the problematic code path:
DB2 may potentially crash if the truncation is near the end of the log file.
Problem Details:  
The db2diag.log may report 0xfffff609,unexpected EOF reached, in sqlpgarl.
In the case of this, set LOGBUFSZ to 16, and db2 will not take the problematic codepath.
After the database has successful recovered, 'db2set DB2_DISABLE_FLUSH_LOG=ON' and then set 
the LOGBUFSZ back to the original value.
APAR information
APAR number  IY30334
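
The APAR text gives two version boundaries: the problem was introduced in fixpak 6 and a fix is available in fixpack 8. A quick way to reason about exposure is sketched below. It assumes that V7 fix pack levels are cumulative, so every level from 6 up to (but not including) 8 carries the defect; is_exposed is a hypothetical helper name.

```python
# Boundaries taken from the APAR text; cumulative fix packs are an assumption.
INTRODUCED_IN_FIXPACK = 6
FIXED_IN_FIXPACK = 8


def is_exposed(fixpack_level):
    """Return True if a V7 installation at this fix pack level can hit IY30334."""
    return INTRODUCED_IN_FIXPACK <= fixpack_level < FIXED_IN_FIXPACK


for level in (5, 6, 7, 8):
    print(level, is_exposed(level))
```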

Following the instructions in the IBM APAR note, we had to set the instance-level registry variable DB2_DISABLE_FLUSH_LOG and change the database parameter LOGBUFSZ to the recommended value. The current instance registry settings are:

$db2set -i
[i] DB2_STRIPED_CONTAINERS=ON
[i] DB2_HASH_JOIN=YES
[i] DB2_BINSORT=YES
[i] DB2COMM=TCPIP
[i] DB2_PARALLEL_IO=*

Adding the new instance-level registry variable DB2_DISABLE_FLUSH_LOG:

$db2set DB2_DISABLE_FLUSH_LOG=ON -i ARTIST0
$db2set -i
[i] DB2_STRIPED_CONTAINERS=ON
[i] DB2_HASH_JOIN=YES
[i] DB2_BINSORT=YES
[i] DB2COMM=TCPIP
[i] DB2_PARALLEL_IO=*
[i] DB2_DISABLE_FLUSH_LOG=ON
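
When the same change has to be verified on several machines, the db2set listing can be checked by a script instead of by eye. The sketch below assumes the "[i] NAME=VALUE" layout shown in the listing above; parse_db2set is a hypothetical helper, not part of DB2.

```python
def parse_db2set(output):
    """Turn lines such as "[i] DB2COMM=TCPIP" into a {name: value} dictionary."""
    settings = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue  # skip prompts, blank lines and anything else
        _scope, _, assignment = line.partition("] ")
        name, _, value = assignment.partition("=")
        settings[name] = value
    return settings


LISTING = """\
[i] DB2_STRIPED_CONTAINERS=ON
[i] DB2_HASH_JOIN=YES
[i] DB2_BINSORT=YES
[i] DB2COMM=TCPIP
[i] DB2_PARALLEL_IO=*
[i] DB2_DISABLE_FLUSH_LOG=ON
"""

registry = parse_db2set(LISTING)
flush_log_disabled = registry.get("DB2_DISABLE_FLUSH_LOG") == "ON"
print(flush_log_disabled)
```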

Changing the LOGBUFSZ settings to the recommended value:

$ db2 "get db cfg for ARTIST0" | grep 'LOGBUFSZ'
Log buffer size (4KB)                        (LOGBUFSZ) = 512

$db2 "update db cfg for ARTIST0 using LOGBUFSZ 16"
DB20000I  The UPDATE DATABASE CONFIGURATION command completed successfully.
DB21026I  For most configuration parameters, all applications must disconnect 
from this database before the changes become effective.
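
It is worth recording the original LOGBUFSZ value before changing it, because it has to be restored later. The sketch below reads the value from a configuration line like the one above and shows how small the temporary buffer is. The 4 KB page size comes from the "(4KB)" label in the output; read_parameter is a hypothetical helper name.

```python
import re

PAGE_KB = 4  # LOGBUFSZ is counted in 4 KB pages, per the "(4KB)" label


def read_parameter(cfg_output, name):
    """Find "(NAME) = value" in 'get db cfg' output and return the integer value."""
    match = re.search(r"\(" + re.escape(name) + r"\)\s*=\s*(\d+)", cfg_output)
    if match is None:
        raise KeyError(name)
    return int(match.group(1))


CFG_LINE = "Log buffer size (4KB)                        (LOGBUFSZ) = 512"

original_pages = read_parameter(CFG_LINE, "LOGBUFSZ")
original_kb = original_pages * PAGE_KB  # size of the buffer before the change
workaround_kb = 16 * PAGE_KB            # size while the APAR workaround is active
print(original_pages, original_kb, workaround_kb)
```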

$ db2stop 
01-01-2003 19:51:13     0   0   SQL1064N  DB2STOP processing was successful.
01-01-2003 19:51:14     1   0   SQL1064N  DB2STOP processing was successful.
SQL1064N  DB2STOP processing was successful.

$ db2_kill
ARTIST01: ipclean: Removing DB2 engine and client's IPC resources for db2inst1.
ARTIST01: db2nkill [] completed ok

The database is down, and all of its resources have been cleaned up. We start the database again:

$ db2start
01-01-2003 19:53:56     1   0   SQL1063N  DB2START processing was successful.
01-01-2003 19:53:57     0   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.
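
On a partitioned system, each partition reports its own result, so a successful start means one success line per partition. The following sketch checks that. It assumes the columns after the timestamp are the partition number and a return code, as the output above suggests; partitions_reporting is a hypothetical helper name.

```python
import re

# date, time, partition number, return code, message ID
LINE = re.compile(
    r"^\d{2}-\d{2}-\d{4}\s+\d{2}:\d{2}:\d{2}\s+(\d+)\s+(\d+)\s+(SQL\d{4}N)\b"
)


def partitions_reporting(output, message_id):
    """Return the set of partition numbers that logged the given message ID."""
    partitions = set()
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if match and match.group(3) == message_id:
            partitions.add(int(match.group(1)))
    return partitions


START_OUTPUT = """\
01-01-2003 19:53:56     1   0   SQL1063N  DB2START processing was successful.
01-01-2003 19:53:57     0   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.
"""

started = partitions_reporting(START_OUTPUT, "SQL1063N")
all_started = started == {0, 1}  # the system in this article has two partitions
print(sorted(started), all_started)
```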

The database started normally. A check of the active databases revealed that the database had finished automatic recovery and had opened.

$ db2 "list active databases"
                           Active Databases

Database name                              = ARTIST0
Applications connected currently           = 3
Database path                              = C:\DB2\NODE0000\SQL00001\

The db2diag.log file now contains the log of the last, successful recovery. The database problem is solved, and LOGBUFSZ has to be changed back to its original value (512 in our case), as the APAR instructs.
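
For a runbook, the whole procedure can be written down as an ordered list of commands. The sketch below only builds the command strings and does not run anything. It uses the command forms given in the APAR text and in the transcripts above, leaves out the db2_kill cleanup step, and recovery_commands is a hypothetical helper name.

```python
def recovery_commands(database, original_logbufsz):
    """Return the commands of the IY30334 workaround in the order they are run."""
    return [
        # Circumvention from the APAR: disable the problematic log flush.
        "db2set DB2_DISABLE_FLUSH_LOG=ON",
        # Workaround from the APAR: a small log buffer avoids the failing code path.
        f'db2 "update db cfg for {database} using LOGBUFSZ 16"',
        # Restart the instance so that crash recovery runs again.
        "db2stop",
        "db2start",
        # Once recovery has succeeded, restore the original log buffer size.
        f'db2 "update db cfg for {database} using LOGBUFSZ {original_logbufsz}"',
    ]


for command in recovery_commands("ARTIST0", 512):
    print(command)
```

Keeping the original LOGBUFSZ as an explicit argument makes it harder to forget the final step.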

Conclusion

Clever use of the Self Help Web Support database will help DBAs avoid making too many mistakes in a production environment. A DBA must never stop learning how to use all available resources effectively, and must never start to believe that the database vendor's code is absolutely error free. Even when it comes from IBM.
