Introduction
Do you know how your disk subsystem is actually performing? This article looks at extracting various I/O statistics so that you can monitor and determine just how well your disks are doing.
How can I separate Oracle I/O to maximize performance?
Should I separate data files from index files?
Should I separate redo logs
These question(s), AND many more, seem to flood our minds as database administrators. They are easy to answer with generalities but in practice can be very difficult to come to a conclusion on unless we look at how our disk subsystem is actually performing.
Many of us might stop reading this article right now, saying to ourselves that these questions and level of detail is only available to our system administrators or those that control the disk farm. All too often, I have seen two different methodologies when configuring Oracle on storage. The first is to use Oracles Flexible Architecture (OFA) approach where architects will separate Oracle object types (data, index, redo, archive, etc.) across a storage array. The second approach is to architect a JBOD (Just a Bunch Of Disks) configuration and throw everything on it. Both of these approaches lack the planning and configuration that ultimately produces a well-tuned database system. They are just taking a shot in the dark, hoping everything is going to work well because they followed a predefined methodology. Well, methodologies may not work in your case. Dont look to your system administrators as they might not even know how to extract information themselves, and when they do it is usually at a higher level since they too can not relate the information to the Oracle stack.
So the DBA must begin to understand the application from a purely I/O perspective, relay that information to the Storage Administrator and then, together, develop a plan for configuring or altering a storage subsystem that will be able to service the application mix. For the storage, a key performance indicator of an OLTP environment is based on I/Os per second (IOPS) and latencies (I/O turn-around time). OLAP databases are your data warehouses or reporting systems and are categorized by moving large amounts of data that is mostly read only. For the storage array, the performance of an OLAP environment is based on Mega-bytes per second (MBPS). A database workload is often descriptive of its application mix. Understanding and translating an application mix into a database workload is critical for optimizing a storage system. The workload of an OLTP database is categorized by small random I/O while OLAP is categorized by large sequential or random I/O.
For the database administrator it is now time to get dirty and look at the internals of your database. Somewhere and somehow, you must extract some form of statistics that allow you to categorize the type of SQL and I/O requests at the database level. Oracle for instance has quite a few internal tables that allow for the interrogation of this information. For instance we can query the gv$sysstat view for (physical read total IO requests physical read total multi block requests) to get the number of small reads in the system. Do this over a period of time, subtract the beginning value from the ending value and you quickly get IOPS for small reads over that period. This MUST be done for each statistic available to get a view of total IOPS and MBPS being requested by your particular database. These are the pertinent statistics you will need to extract. Basically, large reads and writes are used to calculate MBPS and small reads and writes are used for IOPS calculations.
gv$sysstat (name, value) Total Reads :'physical read total IO requests' Total Writes:'physical write total IO requests' Large Reads :'physical read total multi block requests' Large Writes:'physical write total multi block requests' Total Bytes Read :'physical read total bytes' Total Bytes Written :'physical write total bytes' To calculate small reads: Small Reads = Total Reads - Large Reads Small Writes = Total Writes - Large Writes
Coming up with these numbers allows us to make intelligent decisions in regards to our current database performance and storage requirements. Granted, the database may be experiencing contention and actual I/O requests may be lower then optimal so tuning may be in order. Regardless, it is the DBAs responsibility to take these IOPS and MBPS to the Storage Administrator, evaluate if the storage solution is being taxed, and if a reconfiguration is required. These reconfigurations may require moving data files around to use more disks, adding more disks for higher throughput, or reducing IOPS and MBPS through the application. Well, we have come full circle to the application again. I know you know what to do.
Below are two scripts that will get you started. I have spent quite a bit of time getting them easy to use and giving you the information quickly to understand how your disk subsystem is performing over time.
set echo off set feedback off set heading off set linesize 40 set pagesize 55 set verify off set termout off column rpt new_value rpt select instance_name||'_'||to_char(sysdate,'YYYYMMDDHH24MISS')||'_vsysstat_ioworkload.LST' rpt from v$instance; set termout on prompt prompt prompt ^^^^^^^^^^^^^ prompt Report Name : ../LST/&&rpt prompt ^^^^^^^^^^^^^ spool ../LST/&&rpt column sr1 new_value sr1 column sw1 new_value sw1 column lr1 new_value lr1 column lw1 new_value lw1 column tbr1 new_value tbr1 column tbw1 new_value tbw1 set termout off SELECT sum(decode(name,'physical read total IO requests',value,0)- decode(name,'physical read total multi block requests',value,0)) sr1, sum(decode(name,'physical write total IO requests',value,0)- decode(name,'physical write total multi block requests',value,0)) sw1, sum(decode(name,'physical read total multi block requests',value,0)) lr1, sum(decode(name,'physical write total multi block requests',value,0)) lw1, sum(decode(name,'physical read total bytes',value,0)) tbr1, sum(decode(name,'physical write total bytes',value,0)) tbw1 FROM v$sysstat; set termout on prompt prompt prompt ^^^^^^^^^^^^ prompt First Sample prompt ^^^^^^^^^^^^ prompt Number of Small Reads : &&sr1 prompt Number of Small Writes: &&sw1 prompt Number of Large Reads : &&lr1 prompt Number of Large Writes: &&lw1 prompt Total Bytes Read : &&tbr1 prompt Total Bytes Written : &&tbw1 prompt prompt prompt Enter the amount of time (in seconds) you would like this process to sleep for sampling data prompt ^^^^^^^^^^^^^^^^^^ prompt Sleep Time (secs): &&sleeptime prompt ^^^^^^^^^^^^^^^^^^ exec DBMS_LOCK.SLEEP (&&sleeptime); column sr2 new_value sr2 column sw2 new_value sw2 column lr2 new_value lr2 column lw2 new_value lw2 column tbr2 new_value tbr2 column tbw2 new_value tbw2 set termout off SELECT sum(decode(name,'physical read total IO requests',value,0)- decode(name,'physical read total multi block requests',value,0)) sr2, sum(decode(name,'physical write total IO requests',value,0)- decode(name,'physical write total multi block requests',value,0)) sw2, sum(decode(name,'physical read total multi block requests',value,0)) lr2, sum(decode(name,'physical write total multi block requests',value,0)) lw2, sum(decode(name,'physical read total bytes',value,0)) tbr2, sum(decode(name,'physical write total bytes',value,0)) tbw2 FROM v$sysstat; set termout on prompt prompt prompt ^^^^^^^^^^^^^ prompt Second Sample prompt ^^^^^^^^^^^^^ prompt Number of Small Reads : &&sr2 prompt Number of Small Writes: &&sw2 prompt Number of Large Reads : &&lr2 prompt Number of Large Writes: &&lw2 prompt Total Bytes Read : &&tbr2 prompt Total Bytes Written : &&tbw2 prompt prompt prompt ^^^^^^^^^ prompt Results : prompt ^^^^^^^^^ column sri new_value sri column swi new_value swi column tsi new_value tsi column srp new_value srp column swp new_value swp column lri new_value lri column lwi new_value lwi column tli new_value tli column lrp new_value lrp column lwp new_value lwp column tr new_value tr column tw new_value tw column tm new_value tm SELECT ROUND((&&sr2-&&sr1)/&&sleeptime,3) sri, ROUND((&&sw2-&&sw1)/&&sleeptime,3) swi, ROUND(((&&sr2-&&sr1)+(&&sw2-&&sw1))/&&sleeptime,3) tsi, ROUND((&&sr2-&&sr1)/DECODE(((&&sr2-&&sr1)+(&&sw2-&&sw1)),0,1,((&&sr2-&&sr1)+(&&sw2-&&sw1)))*100,3) srp, ROUND((&&sw2-&&sw1)/DECODE(((&&sr2-&&sr1)+(&&sw2-&&sw1)),0,1,((&&sr2-&&sr1)+(&&sw2-&&sw1)))*100,3) swp, ROUND((&&lr2-&&lr1)/&&sleeptime,3) lri, ROUND((&&lw2-&&lw1)/&&sleeptime,3) lwi, ROUND(((&&lr2-&&lr1)+(&&lw2-&&lw1))/&&sleeptime,3) tli, ROUND((&&lr2-&&lr1)/DECODE(((&&lr2-&&lr1)+(&&lw2-&&lw1)),0,1,((&&lr2-&&lr1)+(&&lw2-&&lw1)))*100,3) lrp, ROUND((&&lw2-&&lw1)/DECODE(((&&lr2-&&lr1)+(&&lw2-&&lw1)),0,1,((&&lr2-&&lr1)+(&&lw2-&&lw1)))*100,3) lwp, ROUND(((&&tbr2-&&tbr1)/&&sleeptime)/1048576,3) tr, ROUND(((&&tbw2-&&tbw1)/&&sleeptime)/1048576,3) tw, ROUND((((&&tbr2-&&tbr1)+(&&tbw2-&&tbw1))/&&sleeptime)/1048576,3) tm FROM dual; SELECT 'Small Read IOPS = '||ROUND((&&sr2-&&sr1)/&&sleeptime,3)||' IOPS', 'Small Write IOPS = '||ROUND((&&sw2-&&sw1)/&&sleeptime,3)||' IOPS', 'Total Small IOPS = '||ROUND(((&&sr2-&&sr1)+(&&sw2-&&sw1))/&&sleeptime,3)||' IOPS', 'Small Read I/O % = '||ROUND((&&sr2-&&sr1)/DECODE(((&&sr2-&&sr1)+(&&sw2-&&sw1)),0,1,((&&sr2-&&sr1)+(&&sw2-&&sw1)))*100,3)||' %', 'Small Write I/O % = '||ROUND((&&sw2-&&sw1)/DECODE(((&&sr2-&&sr1)+(&&sw2-&&sw1)),0,1,((&&sr2-&&sr1)+(&&sw2-&&sw1)))*100,3)||' %', 'Large Read IOPS = '||ROUND((&&lr2-&&lr1)/&&sleeptime,3)||' IOPS', 'Large Write IOPS = '||ROUND((&&lw2-&&lw1)/&&sleeptime,3)||' IOPS', 'Total Large IOPS = '||ROUND(((&&lr2-&&lr1)+(&&lw2-&&lw1))/&&sleeptime,3)||' IOPS', 'Large Read I/O % = '||ROUND((&&lr2-&&lr1)/DECODE(((&&lr2-&&lr1)+(&&lw2-&&lw1)),0,1,((&&lr2-&&lr1)+(&&lw2-&&lw1)))*100,3)||' %', 'Large Write I/O % = '||ROUND((&&lw2-&&lw1)/DECODE(((&&lr2-&&lr1)+(&&lw2-&&lw1)),0,1,((&&lr2-&&lr1)+(&&lw2-&&lw1)))*100,3)||' %', 'Total Read = '||ROUND(((&&tbr2-&&tbr1)/&&sleeptime)/1048576,3)||' MBPS', 'Total Written = '||ROUND(((&&tbw2-&&tbw1)/&&sleeptime)/1048576,3)||' MBPS', 'Total MBPS = '||ROUND((((&&tbr2-&&tbr1)+(&&tbw2-&&tbw1))/&&sleeptime)/1048576,3)||' MBPS' FROM dual ; prompt Small Read IOPS = &&sri IOPS prompt Small Write IOPS = &&swi IOPS prompt Total Small IOPS = &&tsi IOPS prompt Small Read I/O % = &&srp % prompt Small Write I/O % = &&swp % prompt Large Read IOPS = &&lri IOPS prompt Large Write IOPS = &&lwi IOPS prompt Total Large IOPS = &&tli IOPS prompt Large Read I/O % = &&lrp % prompt Large Write I/O % = &&lwp % prompt Total Read = &&tr MBPS prompt Total Written = &&tw MBPS prompt Total MBPS = &&tm MBPS spool off undefine sleeptime
Get a complete history of IOPS & MBPS from workload repository history and graph it for you management. This allows you to see total database disk activity. Compare this against what your disk capacity is. Just remember these numbers are for ALL disks. You can get average IOPS/MBPS by dividing by your total number of disks used in servicing database requests. This is great information and once you graph the results you will really see how your I/O, and application performance, might be suffering during the day or at least determine where peak periods are.
set echo off set feedback off set linesize 300 set pagesize 55 set verify off set termout off column rpt new_value rpt select instance_name||'_wrh_sysstat_ioworkload_'||'.LST' rpt from v$instance; set termout on prompt prompt prompt ^^^^^^^^^^^^^ prompt Report Name : ../LST/&&rpt prompt ^^^^^^^^^^^^^ spool ../LST/&&rpt column sri head "Small|Read|IOPS" column swi head "Small|Write|IOPS" column tsi head "Total|Small|IOPS" column srp head "Small|Read|I/O%" column swp head "Small|Write|I/O%" column lri head "Large|Read|IOPS" column lwi head "Large|Write|IOPS" column tli head "Total|Large|IOPS" column lrp head "Large|Read|I/O%" column lwp head "Large|Write|I/O%" column tr head "Total|Read|MBPS" column tw head "Total|Written|MBPS" column tm head "Total|MBPS" column begin_time for a25 column end_time for a25 SELECT end_time, ROUND(sr/inttime,3) sri, ROUND(sw/inttime,3) swi, ROUND((sr+sw)/inttime,3) tsi, ROUND(sr/DECODE((sr+sw),0,1,(sr+sw))*100,3) srp, ROUND(sw/DECODE((sr+sw),0,1,(sr+sw))*100,3) swp, ROUND(lr/inttime,3) lri, ROUND(lw/inttime,3) lwi, ROUND((lr+lw)/inttime,3) tli, ROUND(lr/DECODE((lr+lw),0,1,(lr+lw))*100,3) lrp, ROUND(lw/DECODE((lr+lw),0,1,(lr+lw))*100,3) lwp, ROUND((tbr/inttime)/1048576,3) tr, ROUND((tbw/inttime)/1048576,3) tw, ROUND(((tbr+tbw)/inttime)/1048576,3) tm FROM ( SELECT beg.snap_id beg_id, end.snap_id end_id, beg.begin_interval_time, beg.end_interval_time, end.begin_interval_time begin_time, end.end_interval_time end_time, (extract(day from (end.end_interval_time - end.begin_interval_time))*86400)+ (extract(hour from (end.end_interval_time - end.begin_interval_time))*3600)+ (extract(minute from (end.end_interval_time - end.begin_interval_time))*60)+ (extract(second from (end.end_interval_time - end.begin_interval_time))*01) inttime, decode(end.startup_time,end.begin_interval_time,end.sr,(end.sr-beg.sr)) sr, decode(end.startup_time,end.begin_interval_time,end.sw,(end.sw-beg.sw)) sw, decode(end.startup_time,end.begin_interval_time,end.lr,(end.lr-beg.lr)) lr, decode(end.startup_time,end.begin_interval_time,end.lw,(end.lw-beg.lw)) lw, decode(end.startup_time,end.begin_interval_time,end.tbr,(end.tbr-beg.tbr)) tbr, decode(end.startup_time,end.begin_interval_time,end.tbw,(end.tbw-beg.tbw)) tbw FROM (SELECT dba_hist_snapshot.snap_id, startup_time, begin_interval_time, end_interval_time, sum(decode(stat_name,'physical read total IO requests',value,0)-decode(stat_name,'physical read total multi block requests',value,0)) sr, sum(decode(stat_name,'physical write total IO requests',value,0)-
decode(stat_name,'physical write total multi block requests',value,0)) sw, sum(decode(stat_name,'physical read total multi block requests',value,0)) lr, sum(decode(stat_name,'physical write total multi block requests',value,0)) lw, sum(decode(stat_name,'physical read total bytes',value,0)) tbr, sum(decode(stat_name,'physical write total bytes',value,0)) tbw FROM wrh$_sysstat, wrh$_stat_name, dba_hist_snapshot WHERE wrh$_sysstat.stat_id = wrh$_stat_name.stat_id AND wrh$_sysstat.snap_id = dba_hist_snapshot.snap_id group by dba_hist_snapshot.snap_id, startup_time, begin_interval_time, end_interval_time) beg, (SELECT dba_hist_snapshot.snap_id, startup_time, begin_interval_time, end_interval_time, sum(decode(stat_name,'physical read total IO requests',value,0)-
decode(stat_name,'physical read total multi block requests',value,0)) sr, sum(decode(stat_name,'physical write total IO requests',value,0)-
decode(stat_name,'physical write total multi block requests',value,0)) sw, sum(decode(stat_name,'physical read total multi block requests',value,0)) lr, sum(decode(stat_name,'physical write total multi block requests',value,0)) lw, sum(decode(stat_name,'physical read total bytes',value,0)) tbr, sum(decode(stat_name,'physical write total bytes',value,0)) tbw FROM wrh$_sysstat, wrh$_stat_name, dba_hist_snapshot WHERE wrh$_sysstat.stat_id = wrh$_stat_name.stat_id AND wrh$_sysstat.snap_id = dba_hist_snapshot.snap_id group by dba_hist_snapshot.snap_id, startup_time, begin_interval_time, end_interval_time) end WHERE beg.snap_id + 1 = end.snap_id ) order by 1 / spool off
Understanding an application from a purely I/O perspective is a key aspect of configuring storage. Oracle has a variety of I/O types that ultimately need to be mapped, sampled, and related to storage. In Oracles case, there is I/O generated by server processes on behalf of users, multiple database writers, checkpoint activity, logging facilities that not only write as updates are being done but also the reading from online logs and writing to archive logs by the archive process, plus a few more and some internals that determine size and frequency of the I/Os. Oracle is a very complex system of processes that without understanding your disk I/O patterns is nearly impossible to configure properly.