Oracle Database 11g: Database Replay, Part 3

Monday Jun 30th 2008 by Jim Czuprynski

The final article in this series illustrates how to use Oracle Database 11gR1's new Database Replay (DBR) features to capture and prepare a workload from a current Oracle 10gR2 single-instance database environment and then replay that identical workload in an Oracle 11gR1 Real Application Clusters (RAC) testing environment.

Synopsis. Oracle Database 11gR1’s new Database Replay (DBR) features give an Oracle DBA the capability to capture a workload from an Oracle 10gR2 database environment and then replay that same workload against an Oracle 11gR1 database environment to analyze how the transition to the new Oracle database release will affect overall system performance. The final article in this series illustrates how to use these features to capture and prepare a workload from a current Oracle 10gR2 single-instance database environment and then replay that identical workload in an Oracle 11gR1 Real Application Clusters (RAC) testing environment. This offers an Oracle DBA the unprecedented opportunity to identify potential performance bottlenecks well in advance of the transition to a RAC environment.

The prior article in this series demonstrated a relatively straightforward scenario: how to capture a simulated application workload within the current Oracle 11g production environment (P+0) and then replay that same workload in the next iteration of that same environment (P+1). This article tackles a somewhat more ambitious task because it will illustrate how to:

  • Capture and record the simulated application workload from a single-instance Oracle 10gR2 database, including the corresponding Automatic Workload Repository (AWR) data for that recorded workload
  • Transfer the simulated workload to an Oracle 11gR1 Real Application Clusters (RAC) testing environment
  • Preprocess the workload for replay, including remapping the connections to different RAC load-balanced services
  • Replay the workload in the Oracle 11gR1 RAC test environment
  • Identify application performance issues, data divergence, or error divergence

Since the prior article demonstrated how to perform these tasks almost exclusively within the Oracle 11gR1 Enterprise Manager GUI, this time I’ll concentrate on how to use Oracle 11gR1’s supplied PL/SQL packages, DBMS_WORKLOAD_CAPTURE and DBMS_WORKLOAD_REPLAY, to accomplish the same tasks via SQL*Plus and scripting.

Phase 1: Record a Workload in Single-Instance Environment

To keep this capture and replay scenario simple – and because I hate wasting anything useful – I’ll utilize the same PL/SQL objects that I constructed in the prior article to generate a workload for capture against a single-instance Oracle 10gR2 database. Since the minimum release level required to capture a workload in Oracle 10gR2 is 10.2.0.4, I first used the Database Upgrade Assistant (DBUA) to patch an existing Oracle 10gR2 database named DB10G up to Oracle Release 10.2.0.4 – a relatively painless process that took about 30 minutes in my simulated Oracle 10gR2 production environment.

Feature Upgrade Alert
Note that as of July 18, 2008, it's now also possible to use an Oracle 9iR2 database as a potential source for Real Application Testing workload capture. The minimum release level required to execute DBMS_WORKLOAD_CAPTURE in an Oracle 9i or 10g database environment, the patch levels to apply, and instructions for applying the patches are detailed in this document from Oracle Technical Support.

I then executed the same scripts and PL/SQL code to create and initialize a suitable environment just prior to executing a simulated application workload for capture. (See the first section of the prior article for a summary of the scripts and code to accomplish this.)

Preparing for Workload Capture. Now that my source database’s environment is initialized, I’ll initiate the capture of an actual workload. Listing 3.1 shows how I used procedure DBMS_WORKLOAD_CAPTURE.ADD_FILTER to first apply some appropriate filters to eliminate capture of user sessions that are producing “uninteresting” activity or need to be ignored during the capture period. In this case, I want to make sure that any activity related to either Enterprise Manager Database Control or Grid Control won’t be captured.
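The filter calls themselves look roughly like the following sketch of Listing 3.1; the filter names and the PROGRAM and USER values are illustrative assumptions rather than the exact values from my environment:

```sql
-- Exclude Enterprise Manager agent and console activity from capture.
-- The fattribute parameter accepts values such as PROGRAM, MODULE,
-- ACTION, SERVICE, USER, and INSTANCE_NUMBER.
BEGIN
  DBMS_WORKLOAD_CAPTURE.ADD_FILTER(
      fname      => 'FILTER_EM_AGENT'      -- hypothetical filter name
     ,fattribute => 'PROGRAM'
     ,fvalue     => 'emagent%'
  );
  DBMS_WORKLOAD_CAPTURE.ADD_FILTER(
      fname      => 'FILTER_EM_MONITOR'    -- hypothetical filter name
     ,fattribute => 'USER'
     ,fvalue     => 'DBSNMP'
  );
END;
/
```

Each filter added this way is recorded in the DBA_WORKLOAD_FILTERS view, so it’s easy to double-check what will be excluded before starting the capture.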

Starting Workload Capture. The code shown in Listing 3.2 illustrates how I used procedure DBMS_WORKLOAD_CAPTURE.START_CAPTURE to initiate workload capture. This procedure first checks the target directory (DBRCONTROL) for files from any prior workload capture executions; if any are found, it returns an error and won’t allow the current workload capture attempt to continue. If the capture startup is successful, however, the DB10G database’s alert log will record that a DBR capture operation is underway:

. . .
Mon Jun 23 19:40:41 2008
ALTER SYSTEM SET pre_11g_enable_capture=TRUE SCOPE=BOTH;
Mon Jun 23 19:40:44 2008
DBMS_WORKLOAD_CAPTURE.START_CAPTURE(): Starting database capture at 06/23/2008 19:40:44
. . .

Since I’m capturing the workload against an Oracle 10gR2 database, note that I also had to set the dynamic initialization parameter PRE_11G_ENABLE_CAPTURE to TRUE before starting the capture.
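Putting those two steps together, the capture startup from Listing 3.2 reduces to a sketch like the one below; the capture name is a placeholder of my own choosing:

```sql
-- Required only when capturing on a pre-11g source database
-- (here, Oracle 10.2.0.4)
ALTER SYSTEM SET pre_11g_enable_capture = TRUE SCOPE=BOTH;

-- Begin recording the workload into the DBRCONTROL directory object
BEGIN
  DBMS_WORKLOAD_CAPTURE.START_CAPTURE(
      name => 'DBR_CAPTURE_100'   -- placeholder capture name
     ,dir  => 'DBRCONTROL'
  );
END;
/
```

START_CAPTURE also accepts an optional duration argument (in seconds); omitting it, as here, means the capture runs until FINISH_CAPTURE is called explicitly.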

Generating a Workload. To simulate multiple executions of similar code by different users, I prepared and executed a simple shell script named 10gSI_RandomLoadGenerator.sh. Like its predecessors in the prior article, it starts up approximately 80 user sessions that perform a random set of tasks: executing simple queries that do CPU-intensive computations, generating complex queries against the AP schema, or performing intense bursts of DML that add several thousand rows to the AP schema’s tables.

Note that I’ve also configured the DB10G database to use just one service name (DB10G) regardless of the type of operation being performed. I’ve added this service name as a potential connection alias in my database’s TNSNAMES.ORA configuration file. (In later steps, I’ll illustrate how to remap this connection to a different service name during workload replay against my Oracle 11gR1 RAC environment with DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION.)

Halting Workload Capture. To halt the workload’s capture, I executed procedure DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE to stop the workload capture operation (see Listing 3.3). Note that the successful conclusion of the DBR capture operation is also recorded in database DB10G’s alert log:

. . .
Mon Jun 23 19:42:21 2008
Thread 1 advanced to log sequence 43 (LGWR switch)
  Current log# 3 seq# 43 mem# 0: /u01/app/oracle/oradata/db10g/redo03.log
Mon Jun 23 19:44:35 2008
DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE(): Stopped database capture successfully at 06/23/2008 19:44:32
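The call itself, sketched from Listing 3.3, is trivial:

```sql
-- Stop the active workload capture; Oracle closes out the shadow
-- capture files in the DBRCONTROL directory
BEGIN
  DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE;
END;
/
```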

Viewing Workload Capture Results. To review the results of what the workload capture operation actually captured, I executed procedure DBMS_WORKLOAD_CAPTURE.REPORT to generate a summary report (see Listing 3.4). The output from this report can be viewed in text-only format in Report 3.1 or in HTML format here.
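A minimal sketch of Listing 3.4: look up the capture ID for the DBRCONTROL directory, then generate the report (printing only the first 4,000 characters here, since the full report easily exceeds a single DBMS_OUTPUT line):

```sql
-- Generate a text-format capture report for the capture recorded
-- in the DBRCONTROL directory object
SET SERVEROUTPUT ON
DECLARE
  capture_id  NUMBER;
  rpt         CLOB;
BEGIN
  capture_id := DBMS_WORKLOAD_CAPTURE.GET_CAPTURE_INFO(dir => 'DBRCONTROL');
  rpt := DBMS_WORKLOAD_CAPTURE.REPORT(
             capture_id => capture_id
            ,format     => 'TEXT');     -- or 'HTML'
  DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(rpt, 4000, 1));
END;
/
```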

Phase 2: Preparing for Replay

After I successfully captured an appropriate workload against my Oracle 10gR2 single-instance database, I tackled the preparation of my target environment – an Oracle 11gR1 Real Application Cluster (RAC) clustered database:

  • I configured a two-node Real Application Clusters environment (RACNODE1 and RACNODE2), using Oracle Clusterware 11.1.0.6.0 to configure and manage the cluster.
  • I deployed an Oracle 11.1.0.6.0 ASM instance on each node of the cluster and created two ASM disk groups, +DATA and +FRA, on shared disk storage for my RAC database’s files.
  • I created a new RAC database named RACDB using the standard Oracle 11gR1 “seed” database template. This deployed two RAC instances, RACDB1 and RACDB2, one on each node of the cluster, each serviced by a single Listener.
  • I deployed a new RAC service named TESTLBA as a preferred service on both nodes and tuned it for maximum connection management performance in a simulated OLTP environment that utilizes the Load Balancing Advisor to distribute connections across both database instances. Listing 3.5 shows the SRVCTL commands, the invocation of DBMS_SERVICE.MODIFY_SERVICE, and the TNSNAMES.ORA network configuration entries I deployed to configure the TESTLBA service appropriately.
  • Finally, I created precisely the same database tables, indexes, and PL/SQL packages in the RACDB clustered database that I had previously created in the DB10G Oracle 10.2.0.4.0 database. I then restored the initial state of tables AP.VENDORS, AP.INVOICES and AP.INVOICE_DETAILS by executing script APInitialization.sql.
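The TESTLBA service setup from Listing 3.5 can be sketched roughly as follows; the specific goal and notification settings shown are assumptions on my part, chosen to suit a Load Balancing Advisor–driven OLTP workload:

```sql
REM From the OS prompt: create TESTLBA as a preferred service
REM on both RACDB instances, then start it:
REM   $ srvctl add service -d RACDB -s TESTLBA -r "RACDB1,RACDB2"
REM   $ srvctl start service -d RACDB -s TESTLBA

-- Tune the service for runtime connection load balancing
BEGIN
  DBMS_SERVICE.MODIFY_SERVICE(
      service_name        => 'TESTLBA'
     ,goal                => DBMS_SERVICE.GOAL_SERVICE_TIME  -- LBA advice by response time
     ,clb_goal            => DBMS_SERVICE.CLB_GOAL_SHORT     -- short-lived OLTP connections
     ,aq_ha_notifications => TRUE                            -- FAN event notifications
  );
END;
/
```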

Preparing the Workload. Now that my Database Replay target environment is in place, I’m ready to prepare the workload that I captured in single-instance mode for its eventual replay in a RAC database environment:

  • I added a new physical directory, /home/oracle/DBRControl, on nodes RACNODE1 and RACNODE2.
  • I created a new directory object named DBRCONTROL in the RACDB database, aimed it at the physical directory I just created on each node, and granted the appropriate access privileges to this directory object.
  • I copied the files that I generated against the DB10G database during DBR Workload Capture to the same physical directory on both nodes. Note that I could have copied these workload replay files to just one of the nodes – say, the one that normally handles the least work – because any node in the RAC cluster can be used to manage the replay of the captured DBR workload.
  • Finally, I executed procedure DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE to preprocess the captured workload for its eventual replay. This procedure parses the information in the files that recorded a shadow copy of each session’s operations against the DB10G database and prepares the workload for replay against the RACDB database.

Listing 3.6 shows the commands I issued to create the physical directories on each node, the corresponding directory object in the RACDB database, and (after all workload capture recorded files had been copied to the appropriate physical directories) how I executed procedure DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE to preprocess the captured workload.
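In outline, those commands amount to the following sketch, run after the capture files have been copied into /home/oracle/DBRControl on each node; the GRANT shown is deliberately broad for a test environment and would normally be restricted to the replaying user:

```sql
-- Create the directory object over the physical path on each node
CREATE OR REPLACE DIRECTORY dbrcontrol AS '/home/oracle/DBRControl';
GRANT READ, WRITE ON DIRECTORY dbrcontrol TO PUBLIC;

-- Preprocess the captured workload files for replay against RACDB
BEGIN
  DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE(capture_dir => 'DBRCONTROL');
END;
/
```

Preprocessing only needs to be done once per target database version; the resulting metadata files live alongside the original capture files.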



Phase 3: Replaying the Workload

My RAC database is now finally ready to accept execution of the previously captured workload. As the previous article illustrated, several steps must be followed in precise order when using Oracle 11g Enterprise Manager to start a replay, and those same steps must be followed when calling the DBMS_WORKLOAD_REPLAY procedures: initiate the captured workload’s replay, remap any connections, adjust any custom replay settings, and then start the replay along with its performance and regression statistics collection.

Initiating Database Workload Replay. To initiate the replay of the captured workload, I’ll invoke procedure DBMS_WORKLOAD_REPLAY.INITIATE_REPLAY. This places the RACDB database into INIT FOR REPLAY state – a prerequisite to moving the database into the PREPARE state via procedure DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY.

Remapping Connection Strings. To ensure that all sessions that participated in workload generation against the DB10G connection on my Oracle 10gR2 single-instance database are remapped to the corresponding TESTLBA load-balanced connection on the RACDB database, I’ll use procedure DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION.

See Listing 3.7 for examples of how to invoke these two procedures and how to view the resulting remapped connection strings.
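Sketched from Listing 3.7 (the replay name is a placeholder of mine), the two calls and a verification query look roughly like this:

```sql
-- Place RACDB into the INIT FOR REPLAY state
BEGIN
  DBMS_WORKLOAD_REPLAY.INITIATE_REPLAY(
      replay_name => 'DBR_REPLAY_100'   -- placeholder replay name
     ,replay_dir  => 'DBRCONTROL'
  );
END;
/

-- Remap every captured connection to the TESTLBA load-balanced service
BEGIN
  FOR conn IN (SELECT conn_id FROM dba_workload_connection_map) LOOP
    DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION(
        connection_id     => conn.conn_id
       ,replay_connection => 'TESTLBA'
    );
  END LOOP;
END;
/

-- Verify the remapped connection strings
SELECT conn_id, capture_conn, replay_conn
  FROM dba_workload_connection_map;
```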

Customizing Workload Replay Options. As I described in the Database Replay primer article, Oracle 11g permits the DBA to alter the frequencies at which a workload may be played back with extreme granularity. The workload replay client’s behavior can be tightly controlled by setting several additional parameters via procedure DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY:

Table 3.1. Replay Client Options

SYNCHRONIZATION

Defines whether synchronization will be used during workload replay:

  • TRUE: This is the default. The order of COMMITs in the captured workload is preserved during replay; each replay action executes only after all dependent COMMIT actions have completed.

  • FALSE: The original order of COMMITs will not be honored. This most likely will result in a large data divergence, but it is useful for load or stress testing.

THINK_TIME_SCALE

Scales the elapsed time between two successive user calls within the same session, and thus drives the replay speed:

  • The default value is 100, i.e. 100% of the originally captured think time.

  • If set to zero (0), calls are sent to the replay database in succession as rapidly as possible.

  • If set to a value greater than 100, the replay speed will decrease proportionally.

THINK_TIME_AUTO_CORRECT

Automatically adjusts the think time between user calls. Setting this parameter to TRUE forces the Replay Client to shorten the “think time” between calls whenever replay is running slower than capture, so that the total elapsed time of the replay more closely matches the elapsed time originally captured.

CONNECT_TIME_SCALE

Scales the elapsed time from the start of workload capture to the moment each session connects. The value is interpreted as a percentage; for example, a value of 50 causes sessions to connect in half the originally captured time.

Note that Database Replay does differentiate between workload capture time and workload replay time:

  • During the capture of the workload, elapsed time is measured as user time (the total elapsed time of a user call to the database) plus user think time (the time the user waited before issuing the next call).
  • During the replay of the workload, however, elapsed time is measured as user time, user think time, and synchronization time.

As in the previous article, I’ve simply accepted the default options shown for each of these parameters by invoking procedure DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY as shown in Listing 3.8.
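Accepting the defaults, the call from Listing 3.8 reduces to a sketch like the following; the explicit values shown simply restate the documented 11gR1 defaults:

```sql
-- Move RACDB from INIT FOR REPLAY into the PREPARE state
BEGIN
  DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY(
      synchronization         => TRUE   -- preserve captured COMMIT ordering
     ,connect_time_scale      => 100    -- connect at originally captured pace
     ,think_time_scale        => 100    -- replay at originally captured speed
     ,think_time_auto_correct => TRUE   -- shrink think time if replay lags
  );
END;
/
```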

Initiating a Workload Replay Client. It’s time to start the Workload Replay Client (WRC) to play back the previously captured workload. Just as in the prior article, I’ll start up a replay session by opening a terminal session and invoking the WRC client executable on just one node - RACNODE2 – of my clustered database:

$>  wrc replaydir=/home/oracle/DBRControl
Workload Replay Client: Release 11.1.0.6.0 - Production on Mon Jun 23 21:27:47 2008
Wait for the replay to start (21:27:48)

Once the WRC is started on RACNODE2, I need to tell Oracle 11g that it should take control of any active database replay operations. As shown in Listing 3.9, I’ll do this by invoking procedure DBMS_WORKLOAD_REPLAY.START_REPLAY. The successful execution of this procedure will be reflected within the WRC terminal session’s output:

Wait for the replay to start (21:27:48)
Replay started (21:28:16)
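The call that hands control to the waiting WRC client takes no arguments:

```sql
-- Start the prepared replay; all connected WRC clients begin
-- submitting the captured workload
BEGIN
  DBMS_WORKLOAD_REPLAY.START_REPLAY;
END;
/
```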

Monitoring Active Replay Operations. Listing 3.10 shows a query against the DBA_WORKLOAD_REPLAYS view that produces a simple report of the current DBR workload replay state; Report 3.2 shows the result of executing this query during several different phases of the DBR workload replay until the replay operation is completed (which will also be reflected in the WRC’s session output):

Wait for the replay to start (21:27:48)
Replay started (21:28:16)
Replay finished (21:48:40)
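A monitoring query along the lines of Listing 3.10 might look like this sketch; I’ve limited the column list to ones I’m confident exist in DBA_WORKLOAD_REPLAYS:

```sql
-- Track the progress of replay operations on this database
SELECT id
      ,name
      ,status        -- e.g. INITIALIZED, PREPARE, IN PROGRESS, COMPLETED
      ,start_time
      ,end_time
  FROM dba_workload_replays
 ORDER BY id;
```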

The successful startup and completion of the DBR workload replay sessions will be recorded in the alert logs of both instances, as shown below:

>>> From RACDB1's alert log:
...
Tue Jun 24 21:28:01 2008
DBMS_WORKLOAD_REPLAY.START_REPLAY(): Starting database replay at 06/24/2008 21:28
Tue Jun 24 21:31:04 2008
Thread 1 advanced to log sequence 92
  Current log# 2 seq# 92 mem# 0: +DATA/racdb/onlinelog/group_2.262.649041349
  Current log# 2 seq# 92 mem# 1: +FRA/racdb/onlinelog/group_2.259.649041351
Tue Jun 24 21:48:39 2008
DBMS_WORKLOAD_REPLAY: Database replay ran to completion at 06/24/2008 21:48:40
...
>>> From RACDB2's alert log:
...
Tue Jun 24 21:28:01 2008
DBMS_WORKLOAD_REPLAY.START_REPLAY(): Starting database replay at 06/24/2008 21:28
Tue Jun 24 21:48:39 2008
DBMS_WORKLOAD_REPLAY: Database replay ran to completion at 06/24/2008 21:48:40
...

Phase 4: Regression Analysis

The playback of the captured workload against my RAC test environment is complete, so it’s time to turn my attention to what performance issues may arise if I decide to migrate this application code to a Real Application Clusters environment.

The Database Replay Report is probably most useful for a “macroscopic” view because it compares execution statistics between the captured and replayed workloads. I used the code in Listing 3.11 to create the report output in simple text format as shown in Report 3.3.
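The report generation in Listing 3.11 can be sketched as follows, again truncating the CLOB output for display purposes:

```sql
-- Generate a text-format Database Replay report for the latest replay
SET SERVEROUTPUT ON
DECLARE
  replay_id  NUMBER;
  rpt        CLOB;
BEGIN
  SELECT MAX(id) INTO replay_id
    FROM dba_workload_replays;
  rpt := DBMS_WORKLOAD_REPLAY.REPORT(
             replay_id => replay_id
            ,format    => 'TEXT');     -- or 'HTML'
  DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(rpt, 4000, 1));
END;
/
```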

In addition, the Automatic Workload Repository (AWR) report summarizes and analyzes the database’s overall performance between the database replay operation’s starting and ending time periods. I’ve generated two versions of this report (one for each of the two RAC instances) using the code in Listing 3.12; the corresponding report output is shown in Report 3.4.
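One way to script the per-instance AWR reports, sketched along the lines of Listing 3.12, is via the pipelined function DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT; this sketch assumes the begin and end snapshot IDs bracketing the replay window, and the database ID, have already been determined and supplied as bind variables:

```sql
-- Text-format AWR report for instance 1 (repeat with l_inst_num => 2
-- for RACDB2); :dbid, :bid, and :eid must be bound beforehand
SELECT output
  FROM TABLE(
         DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(
             l_dbid     => :dbid
            ,l_inst_num => 1
            ,l_bid      => :bid
            ,l_eid      => :eid));
```

Alternatively, the interactive $ORACLE_HOME/rdbms/admin/awrrpt.sql script produces the same report without any scripting.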

As these two reports demonstrate, I’ve got some analysis to perform before I implement this application in an 11gR1 Real Application Clusters environment, because they indicate that there’s significant contention for a few data and index segments – most likely due to serialization on these resources. I’ll spend time in a later article deconstructing these performance issues with the new enhancements to the Automatic Database Diagnostic Monitor (ADDM) tool set.

Data Dictionary Views. Oracle 11gR1 also provides several data dictionary views that describe the results of database workload capture and replay activities, configuration details, and active DBR sessions, as shown in Table 3.2 below:

Table 3.2. Database Replay Performance and Metadata Views

DBA_WORKLOAD_CAPTURES

Describes statistics from DBR Workload Capture operations

DBA_WORKLOAD_FILTERS

Shows which filters were applied during DBR Workload Capture operations

DBA_WORKLOAD_REPLAYS

Describes results from DBR Workload Replay operations

DBA_WORKLOAD_CONNECTION_MAP

Shows how captured connections will be remapped during any pending DBR Workload Replay operations

DBA_WORKLOAD_REPLAY_DIVERGENCE

Summarizes data and error divergence encountered during DBR Workload Replay operations

V$WORKLOAD_REPLAY_THREAD

Shows which database sessions are actually executing DBR Workload Replay operations

I’ve provided SQL*Plus formatted queries against the remainder of these views in Listing 3.13.
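For example, a formatted query against DBA_WORKLOAD_CAPTURES in the style of Listing 3.13 might look like this sketch (column list again limited to ones I’m confident exist):

```sql
-- Summarize all workload captures recorded in this database
COL name   FORMAT A24
COL status FORMAT A12
SELECT id
      ,name
      ,status
      ,start_time
      ,end_time
  FROM dba_workload_captures
 ORDER BY id;
```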

Conclusion

Oracle 11g’s new Database Replay capabilities provide extremely granular analysis of potential performance problems, data divergence, and error divergence during regression testing. These features have the potential to limit, if not eliminate, one of the biggest headaches for an Oracle DBA: the inability to tell with a sufficient degree of certainty how proposed changes to a database environment – regardless of the nature of those changes – will affect that environment’s performance.

References and Additional Reading

While I’m hopeful that I’ve given you a thorough grounding in the technical aspects of the features discussed in this article, newer documentation may well have become available since it was published. I therefore strongly suggest that you take a close look at the corresponding Oracle documentation on these features to obtain a crystal-clear understanding before attempting to implement them in a production environment. Please note that I’ve drawn upon the following Oracle Database 11gR1 documentation for the deeper technical details of this article:

B28274-01 Oracle Database 11gR1 Performance Tuning Guide

B28279-02 Oracle Database 11gR1 New Features Guide

B28419-02 Oracle Database 11gR1 PL/SQL Packages and Types Reference

B28254-04 Oracle Database 11gR1 Real Application Clusters Configuration and Administration Guide

» See All Articles by Columnist Jim Czuprynski
