Part 2 of this series - demonstrates how Oracle 11g Database Replay can be used to capture and
prepare a workload from a current Oracle 11g production database environment
(P+0) and then replay that identical workload in an Oracle 11g testing environment
that represents the next iteration (P+1) of the database system.
Synopsis. Oracle Database 11gR1 offers the ability to capture applications workloads in a production environment and then replay that workload against a test environment to determine the impact of proposed system, database or application modifications on database performance. This article the second in this series demonstrates how Oracle 11g Database Replay can be used to capture and prepare a workload from a current Oracle 11g production database environment (P+0) and then replay that identical workload in an Oracle 11g testing environment that represents the next iteration (P+1) of the database system. This offers an Oracle DBA the chance to analyze and isolate the vectors that might cause detrimental performance or divergence between current and future environments well before any actual changes are rolled out.
A friend of mine who teaches future cardiologists describes the time-honored tradition under which surgeons learn their trade watch one, assist one, do one. The prior article in this series provided a primer into the theory and best practices behind Oracle Database 11gs new Database Replay features, and while understanding the theory behind a complex set of features is certainly a noble goal, its time to move on to the assist one / do one phases. This article will focus on how to:
- Capture an actual workload from an Oracle 11g database
- Capture the corresponding Automatic Workload Repository (AWR) data for that recorded workload
- Prepare the test database environment for eventual workload replay
- Transfer the production workload configuration to the testing environment
- Preprocess the production workload for replay
- Replay the workload on the test system
- Analyze the replayed workload for any performance issues or divergence
Simulated Application Environment
A word about my testing environment for this article: For simplicitys sake, Im going to use the same database for both capture and playback operations. Im using the basic Oracle 11g seed database with all sample schemas installed. The database will be run in ARCHIVELOG mode with Flashback Logging activated so that I can use the FLASHBACK DATABASE command to quickly rewind the database to an appropriate starting point for replay.
Phase 1: Recording a Workload
To set up a valid capture and reply scenario, Ive constructed several new users, tables, indexes, and related PL/SQL objects:
- A new user, ADMIN, will be used as a repository for global administrative objects within my Oracle 11g database. I also created a new table for storage of primary key values. The code to reproduce these objects is shown in ADMINSetup.sql.
- PL/SQL package ADMIN.PKG_SEQUENCING controls assignment of new primary key values. The source code for its package specification is shown in pkg_sequencing.spc, and the initial version of its package body is shown in pkg_sequencing_v1.bdy.
- Another user, AP, will encapsulate the schema for a simulated Accounts Payable system, including new tables AP.VENDORS, AP.INVOICES, and AP.INVOICE_ITEMS. The code to create this new schema and its related objects is found in APSetup.sql.
- To populate AP schema objects, Ive set up package AP.PKG_LOAD_GENERATOR. The source code for its specification and body can be found in pkg_load_generator.spc and pkg_load_generator.bdy, respectively.
- Finally, the code in APInitialization.sql loads table AP.VENDORS with a few hundred rows of simulated data, creates 25 new Invoices in table AP.INVOICES and corresponding Invoice line item detail in table AP.INVOICE_ITEMS, and gathers initial optimizer statistics for all objects in the ADMIN and AP schemas. It also creates a directory object, DBRControl, that Database Replay will use to store the resulting scripts and XML files generated during workload capture.
Setting Up for Workload Capture
Now that my source databases environment is initialized, Ill initiate the capture of an actual workload. Figure 2.1.1 below shows the initial screen for Database Replay, which is accessed from the Database Replay link under the Real Application Testing section on the Software and Support tab of the EM Database Control facility:
Figure 2.1.1: Workload Capture Setup: Initial Screen
Once Ive selected the first task, Im presented with a checklist that lists all prerequisite steps that Ill need to acknowledge before my capture session is allowed to proceed.
Figure 2.1.2: Workload Capture Setup: Plan Environment Checklist
Oracle 11g next presents me the ability to either restart or not restart the database before officially capturing the workload, and it lets me filter out unnecessary session activity (e.g. from EM itself!). Note that Ill follow Oracle 11gs recommended best practice for clean captures: Ill accept EMs suggestion to stop and restart the test database to establish an effective start time for the capture.
Figure 2.1.3: Workload Capture Setup: Options
Next, Im presented with the chance to name the capture session and specify the directory object into which all replay scripts will be captured.
Figure 2.1.4: Workload Capture Setup: Setting Parameters
Oracle 11g next needs me to specify the name of a scheduled EM Database Control task that will handle the workloads capture. Figures 2.1.5 and 2.1.6 show the final confirmation screens for that tasks setup.
Figure 2.1.5: Workload Capture Setup: Assigning EM Task Name
Figure 2.1.6: Workload Capture Setup: Final Task Review
Finally, Oracle 11g requests one last confirmation ...
Figure 2.1.7: Workload Capture Setup: Task Submission
... and the capture is initiated. Once Oracle 11g displays this screen, its essentially waiting for me to start the representative workload against the source database.
Capturing an Actual Workload
To simulate multiple executions of similar code by different users, Ive prepared a simple shell script named RandomLoadGenerator.sh that executes simple queries that do CPU intensive computations, generates complex queries against the AP schema, and also grinds through the insertion of several thousand rows into the new AP schemas tables. Ive also configured my source database environment to use multiple service names, one for each class of users, as shown in SI_Services_tnsnames.ora. Ive added those service names as potential connection aliases in my databases TNSNAMES.ORA configuration file.
I started this workload against my P+0 database environment. Once it was complete, I logged back into EM Database Control and viewed the results of its execution as shown in Figure 2.2.1 below, and then clicked on the Stop Capture button to terminate the workload capture.
Oracle 11g then asks for confirmation of the capture tasks termination, and displays a stopwatch until the capture has been completed.
Figure 2.2.2: Workload Capture: Confirming Capture Termination
Figure 2.2.3: Workload Capture: Confirming Capture Termination
Just as capture termination completes, Oracle 11g asks if Id like to capture the corresponding Automatic Workload Repository (AWR) data for the recorded workload. As shown in Figure 2.2.4 below, Ive selected to capture any related AWR snapshot(s) for later comparison reporting.
Figure 2.2.4: Workload Capture: Requesting AWR Snapshot Generation
Once the capture is complete, I can review the Oracle 11gs capture results to determine if the capture was successful and if it contains sufficient data. If I find that its insufficient, the FLASHBACK DATABASE command set allows me to rewind the database to just before capture started and then rerun the capture. I also selected the View Workload Capture Report button and generated a Database Capture Report that summarizes the completed capture.
Phase 2: Preparing for Replay
Even though Ive successfully completed capturing a sufficient workload within my P+0 database environment, I still have some work to do before I can play it back against the P+1 environment.
Resetting the P+0 Environment
Since Im using the same database for both the source and target environments, Ill first need to reset my current P+0 database system to the state immediately before I started workload capture. My database is in flashback logging mode, so Ill simply use Recovery Managers FLASHBACK DATABASE command to rewind the database to its initial state:
$> rman target /
RMAN> shutdown immediate;
RMAN> startup mount;
RMAN> reset database to incarnation 6;
RMAN> flashback database to scn= 4162947;
Migrating to the P+1 Environment
My next step is to apply the changes necessary to bring my database environment to the P+1 state. For this rather simple demonstration, Ill make two changes that are almost certain to impact the performance of my database in P+1 mode:
- Ive prepared an improved version of procedure ADMIN.PKG_SEQUENCING.NEXT_ID that uses sequences instead of table ADMIN.NEXT_IDS to determine the next primary key values for tables in the AP schema. This should dramatically increase the performance of procedure AP.PKG_LOAD_GENERATOR.RANDOMDML, which creates a large number of random entries in the AP.INVOICES and AP.INVOICE_ITEMS tables, during workload replay.
- Ill also drop the index on AP.INVOICES.CUSTOMER_ID and recalculate statistics on the AP schema. Since procedure AP.PKG_LOAD_GENERATOR.RANDOMQUERY often uses this index to select rows efficiently when it generates random queries against view AP.RV_INVOICE_DETAILS, I should expect to see a noticeable regression in its performance during replay of the workload.
Massaging the Workload
Now that my P+1 environment is in place, Im ready to preprocess the workload for playback. Once again, Ill use EM Database Control to initiate the preprocessing sequence. Figure 2.3.1 shows the results of selecting the Preprocess Workload option from the main Database Replay panel once Ive selected the DBRControl directory object as the target for preprocessing:
Figure 2.3.1: Preprocess Captured Workload: Selecting a Captured Workload
Once Ive chosen the desired workload, Oracle 11g reminds me that Ive got to use the same database version to eventually replay the selected workload ...
Figure 2.3.2: Preprocess Captured Workload: Database Version Warning
... and then sets up a new EM scheduled task to complete the preprocessing.
Figure 2.3.3: Preprocess Captured Workload: Scheduling the Preprocessing Task
Oracle 11g prompts for a final confirmation before submitting the preprocessing task, and then it fires it off immediately.
Figure 2.3.4: Preprocess Captured Workload: Final Confirmation
Phase 3: Replaying the Workload
Since
my target database has now been reset to its prior state just before workload
capture was initiated and all my P+1 changes are now in place, at last Im
ready to initiate the replay of the workload. Ill use EM Database Control to
initiate the replay, as shown in Figure 2.4.1
below.
Once
Ive selected a directory that already contains a captured workload, Oracle 11g
makes sure that Ive reviewed all required prerequisites for a successful
replay, including what external references (e.g. external files and
directories) need to be resolved before proceeding, as shown in Figures 2.4.2 and 2.4.3 below.
Figure 2.4.2: Workload Replay: Confirming Prerequisites
Figure 2.4.3: Workload Replay: Confirming References to
External Systems
Once
all prerequisites have been confirmed, its time to set up the actual EM task
that will perform the replay. Figure 2.4.4
shows how I set up the basics for this task ...
Figure 2.4.4: Workload Replay: Setting Up Replay Task
... and
Figure 2.4.5 shows how I remapped
the current connection strings generated during the initial workload capture so
that the identical connection strings are used during this database replay
operation.
Figure 2.4.5: Workload Replay: Setting Up Connection Strings
As I
described in the Database Replay primer article, Oracle 11g allows me to alter
the frequency at which a workload is played back with extreme granularity. To
keep this demonstration simple, Ive accepted the default options shown in Figure 2.4.6.
Figure 2.4.6: Workload Replay: Customizing Replay Options
Its
now almost time to start the Workload Replay
Client (WRC) to play back the previously captured workload. As Figure 2.4.7 illustrates, my next step is
to start the WRC with the appropriate parameters and then return to this screen
and select the Next button to initiate the workloads automatic replay:
Figure 2.4.7: Workload Replay: Initiating Workload Replay
Clients
Ill
start the replay session by opening a terminal session and invoking the WRC
client executable. Note that I first changed my present working directory to /home/oracle/DBRControl before starting up
the WRC:
$> wrc system/oracle@orcl mode=replay replaydir=.
Workload Replay Client: Release 11.1.0.6.0 - Production on Thu May 22 19:28:59 2008
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Wait for the replay to start (19:28:59)
...
Returning
to EM Database Control, I simply click on the Next button, and Oracle 11g
acknowledges that it has taken control of any active database replay operations
...
Figure 2.4.8: Workload Replay: Controlling Replay Operations
... and
thats also reflected from within the terminal sessions themselves.
Wait for the replay to start (19:28:59)
...
Replay started (19:31:17)
Back
at EM Database Control, the status of the database replay operation is
constantly being monitored ...
Figure 2.4.9: Workload Replay: Monitoring Active Replay
Operations
..;
until the replay operation is completed.
Wait for the replay to start (19:28:59)
...
Replay started (19:31:17)
Replay finished (19:37:02)
Figure 2.4.10: Workload Replay: Completed Replay Operation
Phase 4: Regression Analysis
Oracle
11g gives me several tools for comparing the captured vs. replayed workloads.
As shown in Figures 2.5.1 and 2.5.2 below, the most succinct comparison
is the one that Database Replay provides immediately after a successful replay
operation has completed:
As
these summaries illustrate, its obvious that the changes Ive made to the
databases objects have had an overall positive effect because the time it took
to execute the captured workload
is significantly longer than the
time it took to replay the
identical workload, and that immediately indicates that the system
modifications have positively increased database throughput. Its just as
important to note, however, that there were no deleterious effects introduced.
This is easiest to see in the second half of the regression analysis screen
because there is absolutely no divergence between the generated data, and there
were no unexpected errors generated.
Reporting Analyses
Finally,
Database Replay offers several reports in HTML format that analyze the results
of the completed database replay operation:
-
DB Replay Report.
This report compares the execution of the captured workload to that of the
replayed workload, and it searches for the source of any possible data and/or
error regressions.
-
AWR Report.
This report provides an Automatic Workload
Repository (AWR) report that summarizes and analyzes the databases
overall performance between the database replay operations starting and ending
time periods.
-
ASH Report. At an even
lower level of detail, this report shows exactly which SQL statements and wait
events caused the largest impact on database performance by looking at the
contents of the databases Active Session History
(ASH) buffer during the execution of the replay operation.
Next Steps
In
the final article in this series, Ill expand on the scenario that played out
in this article by tackling one of the thornier concerns most Oracle DBAs have
for systems that are about to migrate to Oracle 11g: How will my current
database perform in a brand-new environment?
And to make it even more relevant and interesting, Ill target a two-node
Oracle 11g Real Application Cluster (RAC) clustered database as the testing
environment.
References and Additional Reading
While
Im hopeful that Ive given you a thorough grounding in the technical aspects
of the features Ive discussed in this article, Im also sure that there may be
better documentation available since its been published. I therefore strongly
suggest that you take a close look at the corresponding Oracle documentation on
these features to obtain crystal-clear understanding before attempting to
implement them in a production environment. Please note that Ive drawn upon
the following Oracle Database 11gR1 documentation for the deeper technical
details of this article:
B28274-01 Oracle Database 11gR1 Performance Tuning Guide
B28279-02 Oracle Database 11gR1 New Features Guide
B28419-02 Oracle Database 11gR1 PL/SQL Packages and Types
Reference
»
See All Articles by Columnist Jim Czuprynski