The final article in this series illustrates how to use Oracle Database 11gR1's new Database Replay (DBR) features to capture and prepare a workload from a current Oracle 10gR2 single-instance database environment and then replay that identical workload in
an Oracle 11gR1 Real Application Clusters (RAC) testing environment.
Synopsis. Oracle Database 11gR1s new Database Replay (DBR) features allow an Oracle DBA the capability to capture a workload from an Oracle 10gR2 database environment and then replay that same workload against an Oracle 11gR1 database environment to analyze how the transition to the new Oracle database release will impact overall systems performance. The final article in this series illustrates how to use these features to capture and prepare a workload from a current Oracle 10gR2 single-instance database environment and then replay that identical workload in an Oracle 11gR1 Real Application Clusters (RAC) testing environment. This offers an Oracle DBA the unprecedented opportunity to identify any potential performance bottlenecks well in advance of the transition to a RAC environment.
The prior article in this series demonstrated a relatively straightforward scenario: how to capture a simulated application workload within the current Oracle 11g production environment (P+0) and then replay that same workload in the next iteration of that same environment (P+1). This article tackles a somewhat more ambitious task because it will illustrate how to:
- Capture and record the simulated application workload from a single-instance Oracle 10gR2 database, including the corresponding Automatic Workload Repository (AWR) data for that recorded workload
- Transfer the simulated workload to an Oracle 11gR1 Real Application Clusters (RAC) testing environment
- Preprocess the workload for replay, including remapping the connections to different RAC load-balanced services
- Replay the workload in the Oracle 11gR1 RAC test environment
- Identify application performance issues, data divergence, or error divergence
Since I concentrated on how to perform these tasks almost exclusively within the Oracle 11gR1 Enterprise Manager GUI, Ill concentrate on how to use Oracle 11gR1s supplied PL/SQL packages, DBMS_WORKLOAD_CAPTURE and DBMS_WORKLOAD_REPLAY, to accomplish the same tasks via SQL*Plus and scripting.
Phase 1: Record a Workload in Single-Instance Environment
To keep this capture and replay scenario simple and because I hate wasting anything useful - Ill utilize the same PL/SQL objects that I constructed in the prior article to generate a workload for capture against a single-instance Oracle 10gR2 database. Since the minimum required release level to capture a workload in Oracle 10gR2 is 10.2.0.4. I first used the Database Upgrade Assistant (DBUA) to patch an existing Oracle 10gR2 database named DB10G up to Oracle Release 10.2.0.4 a relatively painless process that takes about 30 minutes in my simulated Oracle 10gR2 production environment.
Feature Upgrade Alert
Note that as of July 18, 2008, it's now also possible to use an Oracle 9iR2 database as a potential target for Real Application Testing workload capture. The minimum release level required for execution of DBMS_WORKLOAD_CAPTURE in an Oracle 9i or 10g database environment, which patch levels to apply, and instructions for applying the patches is detailed in this document from Oracle Technical Support.
I then executed the same scripts and PL/SQL code to create and initialize a suitable environment just prior to executing a simulated application workload for capture. (See the first section of the prior article for a summary of the scripts and code to accomplish this.)
Preparing for Workload Capture. Now that my source databases environment is initialized, Ill initiate the capture of an actual workload. Listing 3.1 shows how I used procedure DBMS_WORKLOAD_CAPTURE.ADD_FILTER to first apply some appropriate filters to eliminate capture of user sessions that are producing uninteresting activity or need to be ignored during the capture period. In this case, I want to make sure that any activity related to either Enterprise Manager Database Control or Grid Control wont be captured.
Starting Workload Capture. The code shown in Listing 3.2 illustrates how I used procedure DBMS_WORKLOAD_CAPTURE.START_CAPTURE to initiate workload capture. This procedure first checks the target directory (DBRCONTROL) for any prior executions of workload capture files; if any are found, it returns an error and wont allow the current workload capture attempt to continue. If the capture startup is successful, however, the DB10G databases alert log will recognize that a DBR capture operation is underway:
. . .
Mon Jun 23 19:40:41 2008
ALTER SYSTEM SET pre_11g_enable_capture=TRUE SCOPE=BOTH;
Mon Jun 23 19:40:44 2008
DBMS_WORKLOAD_CAPTURE.START_CAPTURE(): Starting database capture at 06/23/2008 19:40:44
. . .
Since Im capturing the workload against an Oracle 10gR2 database, note that I also had to set the dynamic initialization parameter PRE_11G_ENABLE_CAPTURE to TRUE before starting the capture.
Generating a Workload. To simulate multiple executions of similar code by different users, I prepared and executed a simple shell script named 10gSI_RandomLoadGenerator.sh. Like its similar predecessors in the prior article, it starts up approximately 80 user sessions that perform a random set of tasks like executing simple queries that do CPU-intensive computations, generating complex queries against the AP schema, or performing intense bursts of DML that add several thousand rows into the AP schemas tables.
Note that Ive also configured the DB10G database to use just one service name (DB10G) regardless of the type of operation being performed. Ive added this service name as a potential connection alias in my databases TNSNAMES.ORA configuration file. (In later steps, Ill illustrate how to remap this connection to a different service name during workload replay against my Oracle 11gR1 RAC environment with DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION.)
Halting Workload Capture. To halt the workloads capture, I executed procedure DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE to stop the workload capture operation (see Listing 3.3). Note that the successful conclusion of the DBR capture operation is also recorded in database DB10Gs alert log:
. . .
Mon Jun 23 19:42:21 2008
Thread 1 advanced to log sequence 43 (LGWR switch)
Current log# 3 seq# 43 mem# 0: /u01/app/oracle/oradata/db10g/redo03.log
Mon Jun 23 19:44:35 2008
DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE(): Stopped database capture successfully at 06/23/2008 19:44:32
Viewing Workload Capture Results. To review the results of what the workload capture operation actually captured, I executed procedure DBMS_WORKLOAD_CAPTURE.REPORT to generate a summary report (see Listing 3.4). The output from this report can be viewed in text-only format in Report 3.1 or in HTML format here.
Phase 2: Preparing for Replay
After I successfully captured an appropriate workload against my Oracle 10gR2 single-instance database, I tackled the preparation of my target environment an Oracle 11gR1 Real Application Cluster (RAC) clustered database:
- I configured a two-node Real Applications Cluster (RACNODE1 and RACNODE2) using Oracle Clusterware 11.1.0.6.0 to configure and manage the clustered environment.
- I deployed an Oracle 11.1.0.6.0 ASM instance on each node of the cluster and created two ASM disk groups, +DATA and +FRA, on shared disk storage for my RAC databases files.
- I created a new RAC database named RACDB using the standard Oracle 11gR1 seed database template. This deployed two RAC instances, RACDB1 and RACDB2, one on each node of the cluster, each serviced by a single Listener.
- I deployed a new RAC service named TESTLBA as a preferred service on both nodes and tuned it for maximum connection management performance in a simulated OLTP environment that utilizes the Load Balancing Advisor to distribute connections across both database instances. Listing 3.5 shows the SRVCTL commands, the invocation of DBMS_SERVICE.MODIFY_SERVICE, and the TNSNAMES.ORA network configuration entries I deployed to configure the TESTLBA service appropriately.
- Finally, I created precisely the same database tables, indexes, and PL/SQL packages in the RACDB clustered database that I had previously created in the DB10G Oracle 10.2.0.4.0 database. I then restored the initial state of tables AP.VENDORS, AP.INVOICES and AP_INVOICE_DETAILS by executing script APInitialization.sql.
Preparing the Workload. Now that my Database Replay target environment is in place, Im ready to prepare the workload that I captured in single-instance mode for its eventual replay in a RAC database environment:
- I added a new physical directory, /home/oracle/DBRControl, on nodes RACNODE1 and RACNODE2.
- I created a new directory object named DBRCONTROL in the RACDB database, aimed it at the physical directory I just created on each node, and granted the appropriate access privileges to this directory object.
- I copied the files that I generated against the DB10G database during DBR Workload Capture to the same physical directory on both nodes. Note that I could have copied these workload replay files to just one of the nodes say, the one that normally handles the least work because any node in the RAC cluster can be used to manage the replay of the captured DBR workload.
- Finally, I executed procedure DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE to preprocess the captured workload for its eventual replay. This procedure parses the information in the files that recorded a shadow copy of each sessions operations against the DB10G database and prepares the workload for replay against the RACDB database.
Listing 3.6 shows the commands I issued to create the physical directories on each node, the corresponding directory object in the RACDB database, and (after all workload capture recorded files have been copied to the appropriate physical directories) how I executed procedure DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE to preprocess the captured workload
Phase 3: Replaying the Workload
My RAC
database is now finally ready to accept execution of the previously captured
workload. As the previous article illustrated, there are several steps that
must be followed in precise order when using Oracle 11g Enterprise Manager to
start the replay, and those same steps must be followed when calling DBMS_WORKLOAD_REPLAY
procedures to initiate the captured workloads replay, remap any connections,
adjust any custom replay frequency settings, and start gathering replay
performance and regression statistics collection.
Initiating Database Workload Replay. To
initiate the replay of the captured workload, Ill invoke procedure DBMS_WORKLOAD_REPLAY.INITIATE_REPLAY.
This places the RACDB
database into INIT
FOR REPLAY state a prerequisite to moving the database into
the PREPARE
state via procedure DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY.
Remapping Connection Strings. To insure
that all sessions that participated in workload generation against the DB10G
connection on my Oracle 10gR2 single-instance database are remapped to the
corresponding TESTLBA
load-balanced connection on the RACDB database, Ill use procedure DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION.
See Listing 3.7
for examples of how to invoke these two procedures and how to view the
resulting remapped connection strings.
Customizing Workload Replay Options. As I
described in the Database Replay primer article, Oracle 11g permits the DBA to alter
the frequencies at which a workload may be played back with extreme
granularity. The workload replay clients behavior can be tightly controlled by
setting several additional parameters via procedure DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY:
Table 3.1. Replay Client Options
|
Replay Option
|
Description
|
SYNCHRONIZATION
|
Defines
whether synchronization will be
used during workload generation:
TRUE:
This is the default. The order of COMMITs in the captured workload will be preserved during replay; all
replay actions are executed only after all dependent COMMIT actions are
completed.
FALSE:
The original order of COMMITs will not be
honored. This most likely will result in a large data divergence,
but it is useful for load or stress
testing.
|
THINK_TIME_SCALE
|
Determines
the elapsed time between two successive
user calls within the same session, so it drives the replay speed:
The
default value is 100, or 100% of the original workload generation speed.
If
set to zero (0), calls are sent to the replay database in succession as
rapidly as possible.
If
set to > 100%, then the replay speed will decrease proportionally.
|
THINK_TIME_AUTO_CORRECT
|
Corrects
the THINK_TIME_SCALE between user calls,
based on the specified percentage value. Setting this parameter to TRUE
forces the Replay Client to shorten the think time between calls so that the
total elapsed time of Database Replay more accurately matches that which was
initially gathered.
|
CONNECT_TIME_SCALE
|
Scales
the elapsed time from when the workload capture started to when the session
connects with the specified value. This is interpreted as the percentage of
time that a user session should
remain connected.
|
Note that Database Replay does
differentiate between workload capture time
and workload replay time:
-
During the capture of the workload, the elapsed time is measured by user time (the total elapsed time of a
user call to the database) and user think
time (the time the user waited between issuing another call).
-
During the replay of the workload, however, the
elapsed time is measured by user time,
user think time, and synchronization time.
As in
the previous article, Ive simply accepted the default options shown for each
of these parameters by invoking procedure DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY
as shown in Listing 3.8.
Initiating a Workload Replay Client. Its
time to start the Workload Replay Client
(WRC) to play back the previously captured workload. Just as in the prior
article, Ill start up a replay session by opening a terminal session and
invoking the WRC client executable on just one
node - RACNODE2
of my clustered database:
$> wrc replaydir=/home/oracle/DBRControl
Workload Replay Client: Release 11.1.0.6.0 - Production on Mon Jun 23 21:27:47 2008
Wait for the replay to start (21:27:48)
Once the
WRC is started on RACNODE2, I need to tell Oracle
11g that it should take control of any active database replay operations. As shown
in Listing 3.9,
Ill do this by invoking procedure DBMS_WORKLOAD_REPLAY.START_REPLAY.
The successful execution of this procedure will be reflected within the WRC terminal
sessions output:
Wait for the replay to start (21:27:48)
Replay started (21:28:16)
Monitoring Active Replay Operations. Listing 3.10
shows a query against the DBA_WORKLOAD_REPLAY view that produces a simple
report of the current DBR workload replay state; Report
3.2 shows the result of executing this query during several
different phases of the DBR workload replay until the replay operation is
completed (which will also be reflected in the WRCs session output):
Wait for the replay to start (21:27:48)
Replay started (21:28:16)
Replay finished (21:48:40)
The
successful startup and completion of the DBR workload replay sessions will be
recorded in the alert logs of both instances, as shown below:
>>> From RACDB1's alert log:
...
Tue Jun 24 21:28:01 2008
DBMS_WORKLOAD_REPLAY.START_REPLAY(): Starting database replay at 06/24/2008 21:28
Tue Jun 24 21:31:04 2008
Thread 1 advanced to log sequence 92
Current log# 2 seq# 92 mem# 0: +DATA/racdb/onlinelog/group_2.262.649041349
Current log# 2 seq# 92 mem# 1: +FRA/racdb/onlinelog/group_2.259.649041351
Tue Jun 24 21:48:39 2008
DBMS_WORKLOAD_REPLAY: Database replay ran to completion at 06/24/2008 21:48:40
...
>>> From RACDB2's alert log:
...
Tue Jun 24 21:28:01 2008
DBMS_WORKLOAD_REPLAY.START_REPLAY(): Starting database replay at 06/24/2008 21:28
Tue Jun 24 21:48:39 2008
DBMS_WORKLOAD_REPLAY: Database replay ran to completion at 06/24/2008 21:48:40
...
Phase 4: Regression Analysis
The
playback of the captured workload against my RAC test environment is complete,
so its time to turn my attention to what performance issues may arise if I decide
to migrate this application code to a Real Application Clusters environment.
The Database Replay Report is probably most
useful for a macroscopic view because it compares execution statistics between
the captured and replayed workloads. I used the code in Listing 3.11
to create the report output in simple text format as shown in Report 3.3.
In
addition, the Automatic Workload Repository
(AWR) report summarizes and analyzes the databases overall performance between
the database replay operations starting and ending time periods. Ive
generated two versions of this report (one for each of the two RAC instances)
using the code in Listing 3.12;
the corresponding report output is shown in Report 3.4.
As
these two reports demonstrate, Ive got some analysis to perform before I
implement this application in an 11gR1 Real Application environment because
they indicate that theres a lot of contention for a few data and index
segments most likely due to serialization of these resources and others. Ill
spend time in a later article deconstructing these performance issues with the
new enhancements to the Automatic Database Diagnostic Monitor (ADDM) tool set.
Data Dictionary Views. Oracle 11gR1 also
provides several data dictionary views that describe the results of database
workload capture and replay activities, configuration details, and active DBR
sessions, as shown in Table 3.2
below:
Table 3.2. Database Replay Performance and Metadata Views
|
Data Dicttonary View
|
Description
|
DBA_WORKLOAD_CAPTURES
|
Describes
statistics from DBR Workload Capture
operations
|
DBA_WORKLOAD_FILTERS
|
Tells
which types of filters were applied during DBR Workload Capture operations
|
DBA_WORKLOAD_REPLAYS
|
Describes
results from DBR Workload Replay
operations
|
DBA_WORKLOAD_CONNECTION_MAP
|
Explains
how connection mapping will affect any pending DBR Workload Replay operations
|
DBA_WORKLOAD_REPLAY_DIVERGENCE
|
Summarizes
data and error divergence encountered during DBR Workload Replay operations
|
V$WORKLOAD_REPLAY_THREAD
|
Shows
which database sessions are actually executing DBR Workload Replay operations
|
Ive
provided SQL*Plus formatted queries against the remainder of these views in Listing 3.13.
Conclusion
Oracle
11gs new Database Replay capabilities provide extremely granular analysis of
potential performance problems, data divergence, and error divergence during
regression testing. These features have the potential to limit if not eliminate
one of the biggest headaches for an Oracle DBA: the inability to tell with a
sufficient degree of certainty how proposed changes
to a database environment regardless of the vector of the changes will
precisely impact the current
database environment.
References and
Additional Reading
While
Im hopeful that Ive given you a thorough grounding in the technical aspects
of the features Ive discussed in this article, Im also sure that there may be
better documentation available since its been published. I therefore strongly
suggest that you take a close look at the corresponding Oracle documentation on
these features to obtain crystal-clear understanding before attempting to
implement them in a production environment. Please note that Ive drawn upon
the following Oracle Database 11gR1 documentation for the deeper technical
details of this article:
B28274-01 Oracle Database 11gR1 Performance Tuning Guide
B28279-02 Oracle Database 11gR1 New Features Guide
B28419-02 Oracle Database 11gR1 PL/SQL Packages and Types
Reference
B28254-04 Oracle Database 11gR1 Real Application Clusters
Configuration and Administration Guide
»
See All Articles by Columnist Jim Czuprynski