Oracle Optimizer: Moving to and working with CBO - Part 4

Tuesday Oct 28th 2003 by Amar Kumar Padhi
Share:

Part 4 of the Oracle Optimizer series covers the various facilities available for generating statistics in Oracle database.

9. Generating Statistics

In Part 1 of this series, we discussed the basics of statistics. This month we'll explore statistics in-depth, learning how well we can maintain them.

There are two ways of gathering statistics, COMPUTE and ESTIMATE. COMPUTE results in 100% statistics generation and is accurate. The drawback is that it needs ample time to generate the complete statistics. ESTIMATE generates statistics as per the sample number of rows or percent provided. It is faster as only partial data is considered for arriving at the statistics. The drawback is that accuracy of the statistics depends on the sample size provided. Please note that the optimizer is better placed when accurate or near-accurate statistics are provided. The ESTIMATE option can be used for large systems that cannot afford the time required for generating statistics that are more accurate.

Please note that columns like NUM_ROWS, AVG_SPACE and AVG_ROW_LEN may not show exact data if the statistics are being estimated with a specific sample size. For example, accurate statistics may show you the exact number of rows in a table when the statistics were taken, but estimate statistics would show a derived value that might not be equal to the actual number of rows. Read further to find whether your setup will benefit with the ESTIMATE option.

How much to estimate?

If you intend to ESTIMATE statistics because of time constraints, try to arrive at an optimal sample size that will yield excellent results for your database. For example, start with a sample size of 5% and test your application. Increase the percentage to 10-15% and test your application again.

Some setups that I have come across report good throughput and response time for statistics with a sample size of only 5% to 10%. In general, a sample size of 5% to 10% generates adequate statistics! What will matter here is proper testing in your setup.

The accuracy of the statistics will depend on an appropriate sample size. Beyond a certain sample size, the statistics generated will be fairly consistent. For example, if you analyze tables with a sample size of 5% and later analyze the same set of tables with a sample size of 10%, you may find significant changes in the statistics collected. However, if you analyze the same set of tables with a sample size of 15-20%, it may not show a significant difference from the statistics collected at 10% and requires even more time. Thus, an optimal sample size here would be 10% flat.

If the ESTIMATE sample size is greater than 50%, the statistics will be almost similar to that of the COMPUTE option, and so will the time it takes to generate! There is no point estimating beyond 50% (see below example); rather, go for the COMPUTE option if you can afford that much time.

Gathering statistics on tables requires sorting to be done and this takes up resources. Gathering statistics on indexes does not require sorting. Considering this benefit, you may COMPUTE statistics on indexes for accurate data.

You may also consider generating statistics in Parallel, an example is provided in the DBMS_STATS package section below.

Below is an example of arriving at an appropriate ESTIMATE sample size. Please note that this is just for one table. For an application, it would be advisable to try it out at database level and draw conclusions only after proper testing.

Example

The table INV.MATERIAL_TRX has around 4.5 million records. Below are statistics at various sample sizes.

Commands used:

dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX'); --compute
dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX', estimate_percent => 5);
dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX', estimate_percent => 10);
dbms_stats.gather_table_stats('INV', 'MTL_MATERIAL_TRX', estimate_percent => 51);

Final statistics:

 MODE  |  NUM_ROWS|    BLOCKS|AVG_ROW_LEN|SAMPLE_SIZE|LAST_ANAL|Time taken
_______|__________|__________|___________|___________|_________|__________
compute|   4591474|    193230|        251|    4591474|27-JUL-03|2 hr
at 5%  |   4582460|    193230|        247|     229123|27-JUL-03|8 mts
at 10% |   4587520|    193230|        249|     458752|27-JUL-03|17 mts
at 20% |   4591635|    193230|        250|     918327|27-JUL-03|32 mts
at 51% | 4590890.2|    193230|        250|    2341354|27-JUL-03|1 hr 56 mts

As you can see, the NUM_ROWS difference between full statistics and 5% is only 9014 records. Increasing the sample size to 10% adds 5060 records. Making it 20% adds an additional 4115 records but doubles the time. Bringing it up to 51% results in almost the same number of records as the COMPUTE option, but the time taken is also same. As per my requirement, statistics at a 5% sample size serves my purpose as the variance is less than 1% that of actual value.

What should the time interval be?

To determine an efficient statistics-gathering interval, keep a history of statistics generated and evaluate the variations. If the two statistics remain more or less similar, then you may consider increasing the time interval. If the statistics vary considerable then either the sample size is not appropriate (in the case of ESTIMATE) or the time interval is not appropriate. This is easier said than done and may be a tedious process for huge systems.

When there is a 10-20% change in data, the general convention is to generate fresh statistics. You can start of with a general rule of estimating statistics on a weekly basis. If the tables are giving real bad hits because of heavy activity, you may consider using the DML Monitoring option to update statistics every few hours for such tables.

Statistics are not incremental and are regenerated every time. If there is no considerable change in data, there is no advantage in generating statistics too frequently.

You may consider generating statistics for interface tables every time a bulk uploading is carried out. Objects can be grouped and statistics generated at different intervals, for example transaction tables could be analyzed every week, where as, Master tables might be done once a month. Statistics generation may also be done schema wise, depending on the maintenance window available.

Statistics locks?

A table that is in the process of being analyzed cannot undergo DDL changes but DML activities can be carried on. Analyzing an index puts a shared lock on the related table; hence, neither DML nor DDL can be performed. Preferably avoid all activities during the statistics generation phase.

Oracle provides more than one way of generating statistics.

  1. DBMS_UTILITY
  2. ANALYZE command
  3. DBMS_DDL
  4. DBMS_STATS

Following are the options.

9.1) DBMS_UTILITY

Oracle provides two procedures under the DBMS_UTILITY package related to statistics generation. (Oracle recommends use of DBMS_STATS package for generating statistics).

DBMS_UTILITY.ANALYZE_SCHEMA

This routine will generate statistics on an individual schema level. It is used for analyzing all tables, clusters and indexes.

It takes the following parameters:

  • schema - Name of the schema
  • method - Estimation method, COMPUTE or ESTIMATE. DELETE can be used to remove statistics.
  • estimate_rows - The number of rows to be considered for estimation.
  • estimate_percent - The percentage of rows to be considered for estimation.
  • method_opt - Method options. Generate statistics FOR TABLE, FOR ALL COLUMNS, FOR ALL INDEXED COLUMNS, FOR ALL INDEXES.

If the ESTIMATE method is used, then either estimate_rows or estimate_percent should be specified; these actually specify the sample size to be considered.

Call syntax

dbms_utility.analyze_schema(schema, method, estimate_rows, estimate_percent,  method_opt)

e.g.: Computing statistics for a schema

SQL> exec dbms_utility.analyze_schema('SYSTEM', 'COMPUTE');
PL/SQL procedure successfully completed.

e.g.: Estimating statistics for a schema, sample size is 1024 row.

SQL> exec dbms_utility.analyze_schema('FEM', 'ESTIMATE', estimate_rows => 1024);
PL/SQL procedure successfully completed.

e.g.: Estimating statistics for FA schema, sample size is 10 percent of rows.

SQL> exec dbms_utility.analyze_schema('FA', 'ESTIMATE', estimate_percent => 10);
PL/SQL procedure successfully completed.

e.g.: Deleting statistics for FA schema

SQL> exec dbms_utility.analyze_schema('FA', 'DELETE');
PL/SQL procedure successfully completed.

e.g.: Estimating statistics with 5 percent rows for all indexes in a schema.

SQL> exec dbms_utility.analyze_schema('FA', 'ESTIMATE', estimate_percent => 5, 
	method_opt => 'FOR ALL INDEXES');
PL/SQL procedure successfully completed.

e.g.: Estimating statistics with 5 percent rows for columns with indexes in a schema.

SQL> exec dbms_utility.analyze_schema('FA', 'ESTIMATE', estimate_percent => 5, 
	method_opt => 'FOR ALL INDEXED COLUMNS');
PL/SQL procedure successfully completed.

e.g.: Estimating statistics with 5 percent rows for all columns in a schema.

SQL> exec dbms_utility.analyze_schema('FA', 'ESTIMATE', estimate_percent => 5, 
	method_opt => 'FOR ALL COLUMNS');
PL/SQL procedure successfully completed.

e.g.: Estimating statistics for all tables in a schema.

SQL> exec dbms_utility.analyze_schema('FA', 'ESTIMATE', estimate_percent => 5, 
	method_opt => 'FOR TABLE');
PL/SQL procedure successfully completed.

e.g.: Proper sample size should be given, otherwise ORA-01493 is encountered.

SQL> exec dbms_utility.analyze_schema('FA', 'ESTIMATE', estimate_percent => -5);
BEGIN dbms_utility.analyze_schema('FA', 'ESTIMATE', estimate_percent => -5); END;

*
ERROR at line 1:
ORA-01493: invalid SAMPLE size specified
ORA-06512: at "SYS.DBMS_DDL", line 179
ORA-06512: at "SYS.DBMS_UTILITY", line 331
ORA-06512: at line 1

DBMS_UTILITY.ANALYZE_DATABASE

DBMS_UTILITY.ANALYZE_DATABASE is used for analyzing all tables, clusters and indexes at database level. It takes the same set of parameters as above except for the schema name.

e.g.: Estimating statistics for database with 30 percent sample.

SQL> exec dbms_utility.analyze_database('ESTIMATE', estimate_percent => 30);
PL/SQL procedure successfully completed.

9.2) ANALYZE command

The ANALYZE command can also be used to collect statistics for individual objects. The Object to be analyzed should belong to the local schema or the user should have ANALYZE ANY TABLE system privilege. This command can be used for the following purpose:

  1. Collect statistics for individual objects.
  2. Validate the structure of an object.
  3. To list migrated or chained rows.
  4. Validate REF links.
  5. Collect statistics not used by the optimizer.

Oracle recommends the use of the DBMS_STATS package for collecting statistics. The ANALYZE command can be used for the other 4 points mentioned above. Statistics are not collected for columns of type- REFs, varrays, nested tables, LOBs , LONG or object types.

If no sample size is provided when estimating statistics with the ANALYZE command, Oracle will take a default sample size of the first 1064 rows. This may not be effective and most often will result in bad queries.

If the ESTIMATE sample size is greater than 50%, it is as good as the COMPUTE option.

Columns such as EMPTY_BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_SPACE_FREELIST_BLOCKS and NUM_FREELIST_BLOCKS are not populated using the DBMS_STATS package. These are populated by using the ANALYZE command and could be used for maintenance and administration activities.

The below examples are given for statistics generation only.

e.g.: Gathering statistics for a table. This will also compute for individual columns and related indexes.

orAP>analyze table am_statchk compute statistics;
Table analyzed.

e.g.: Deleting statistics for a table. This will delete statistics related to table, columns and related indexes.

orAP> analyze table am_statchk delete statistics;
Table analyzed.

e.g.: Estimating statistics for a table with 20 percent rows.

orAP>analyze table am_statchk estimate statistics sample 20 percent;
Table analyzed.

e.g.: Estimating statistics for a table with 1000 sample rows.

orAP>analyze table am_statchk estimate statistics sample 1000 rows;
Table analyzed.

e.g.: Gathering statistics for indexed columns of a table.

orAP>analyze table am_statchk compute statistics for all indexed columns;
Table analyzed.

e.g.: Computing statistics for all columns in a table.

orAP>analyze table am_statchk compute statistics for all columns;
Table analyzed.

e.g.: Computing statistics for individual indexes.

orAP>analyze index am_statchk_n1 compute statistics;
Index analyzed.

e.g.: Deleting statistics for an index.

orAP>analyze index am_statchk_n1 delete statistics;
Index analyzed.

e.g.: Computing statistics for all indexes and all indexed columns.

orAP>analyze table am21 estimate statistics sample 5 percent for all indexes for all indexed columns;
Table analyzed.

9.3) DBMS_DDL

DBMS_DDL has a routine ANALYZE_OBJECT meant for generating statistics for individual objects (Oracle recommends use of DBMS_STATS package for generating statistics).

Parameters

  • type - TABLE, INDEX or CLUSTER.
  • schema - schema name.
  • name - name of the object.
  • method - ESTIMATE, COMPUTE or DELETE.
  • estimate_rows - sampling no. of rows.
  • estimate_percent - sampling percentage of rows.
  • method_opt - method options FOR ALL TABLES, FOR ALL COLUMNS, FOR ALL INDEXED COLUMNS, FOR ALL INDEXES.
  • partname - partition, if present, to be analyzed.

e.g.: Below call will generate statistics for table AM21 and its columns and indexes.

orAP>exec dbms_ddl.analyze_object(type => 'TABLE', schema => 'APPS', name => 'AM21', -
> method => 'ESTIMATE', estimate_percent => 5);
PL/SQL procedure successfully completed.

e.g.: Deleting all statistics related to a table

orAP>exec dbms_ddl.analyze_object(type => 'TABLE', schema => 'APPS', name => 'AM21', -
> method => 'DELETE');
PL/SQL procedure successfully completed.


9.4) DBMS_STATS

DBMS_STATS is a package provided for gathering and maintaining statistics in a database and is the recommended way. The following can be done with this package:



  • Gathering statistics
  • Deleting statistics
  • Providing user statistics
  • Retrieving statistics
  • Exporting and importing statistics

Below is a list of the various routines present in the package. The key parameters are covered for each routine.



9.4.1) Gathering statistics with DBMS_STATS

DBMS_STATS.GATHER_TABLE_STATS

DBMS_STATS.GATHER_TABLE_STATS gathers statistics for a table and its columns, and optionally the associated indexes.



Call syntax

dbms_stats.gather_table_stats(ownname, tabname, partname, 
estimate_percent, block_sample,
method_opt, degree, granularity, cascade, 
stattab, statid, statown);

The first two parameters are mandatory, the rest are defaulted to a value.

  • ownname - owner
  • tabname - table name
  • partname - partition name
  • estimate_percent - sample percent ratio
  • block_sample - consider random blocks sampling rather than rows sampling. TRUE/FALSE
  • method_opt - method options. FOR ALL COLUMNS/FOR ALL INDEXED COLUMNS. Append the phase SIZE 1 if it is required to generate statistics in parallel.
  • degree - degree of parallelism.
  • granularity - for partitioned tables. DEFAULT/SUBPARTITION/PARTITION/GLOBAL/ALL.
  • cascade - gather statistics for indexes also. TRUE/FALSE
  • stattab, statid, statown - required for user statistics, covered below in this section.

e.g.: Estimate statistics for a table and its columns

orAP>exec dbms_stats.gather_table_stats(ownname => 'INV', tabname => 'MTL_MATERIAL_TRX', -
> estimate_percent => 5);
PL/SQL procedure successfully completed.

e.g.: Estimate statistics for a table, its columns and indexes.

orAP>exec dbms_stats.gather_table_stats(ownname => 'APPS', tabname => 'AM21', -
> estimate_percent => 5, cascade => true);
PL/SQL procedure successfully completed.

e.g.: Estimate statistics in parallel, the following uses 8 threads to complete the task.

Session - A

orAP>exec dbms_stats.gather_table_stats(ownname => 'INV', tabname => 'MTL_MATERIAL_TRX', -
> estimate_percent => 5, degree => 8);
PL/SQL procedure successfully completed.

Session - B (When the above process is running)

orAP>select * from v$px_process;

SERV|STATUS   |       PID|SPID     |       SID|  SERIAL#

____|_________|__________|_________|__________|__________

P000|IN USE   |        50|9684     |         7|    50586
P001|IN USE   |        65|9686     |        60|    51561
P002|IN USE   |        66|9688     |        17|     2694
P003|IN USE   |        67|9690     |        30|    39243
P004|IN USE   |        68|9692     |        74|    11017
P005|IN USE   |        69|9694     |        48|     4253
P006|IN USE   |        70|9696     |        76|       17
P007|IN USE   |        71|9698     |        68|     1285

8 rows selected.

e.g.: Estimate statistics for columns in a table, this will also generate statistics for tables.

SQL> exec dbms_stats.gather_table_stats(ownname => 'SYS', tabname => 'AM01', -
>  estimate_percent => 5, 
method_opt => 'FOR ALL COLUMNS');
PL/SQL procedure successfully completed.

e.g.: The below example allows generation of column statistics in parallel. The degree of the table is initially set to 8 and the "SIZE 1" makes use of this. Refer to the Histogram section below to find out about the SIZE option.

SQL> exec dbms_stats.gather_table_stats(ownname => 'SYS', tabname => 'AM01',-
>  estimate_percent => 5, method_opt => 'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.

DBMS_STATS.GATHER_INDEX_STATS

DBMS_STATS.GATHER_INDEX_STATS gathers statistics for indexes. Index statistics cannot be generated in parallel. In addition, the block sampling option available in tables is not available for indexes.

Call syntax

dbms_stats.gather_index_stats(ownname, indname, partname, estimate_percent, 
                                     stattab, statid, statown);

e.g.:

orAP>exec dbms_stats.gather_index_stats
	(ownname => 'INV', indname => 'MTL_SYSTEM_ITEMS_JHN99');
PL/SQL procedure successfully completed.

DBMS_STATS.GATHER_SCHEMA_STATS

DBMS_STATS.GATHER_SCHEMA_STATS gathers statistics for a given schema.

Call Syntax

dbms_stats.gather_schema_stats(ownname, estimate_percent, block_sample, 
method_opt, degree, granularity,
cascade, stattab, statid, options, objlist, statown);
  • options - object information can be further specified here.
  • GATHER - gather statistics for all objects (default).
  • GATHER STALE - update statistics for stale objects, identified with the monitoring option.
  • GATHER EMPTY - gather statistics for objects without any statistics.
  • LIST STALE - return a list of stale objects, this depends on the SMON processing.
  • LIST EMPTY - return a list of objects with no statistics.
  • GATHER AUTO - same as STALE but will include objects without any statistics.
  • objlist - table of type DBMS_STATS.OBJECTTAB, returns an empty or stale list.

e.g.: Gather schema statistics, for tables and indexes at 5% estimate.

SQL> exec dbms_stats.gather_schema_stats(ownname => 'SCOTT', estimate_percent => 5, -
> cascade => true, options => 'GATHER');

e.g.: Gather statistics for objects with no statistics. The cascade option given below does not make a difference as the GATHER EMPTY options generates for all objects without any statistics.

SQL> exec dbms_stats.gather_schema_stats(ownname => 'QP', estimate_percent => 5, -
> cascade => true, options => 'GATHER EMPTY');
PL/SQL procedure successfully completed.

e.g.: To identify a list of objects without any statistics.

orAP>declare
  2    l_owner       varchar2(30) := 'QP';
  3    l_emptylst    dbms_stats.objecttab;
  4  begin
  5    dbms_stats.gather_schema_stats(ownname => l_owner,
  6                options => 'LIST EMPTY', objlist => l_emptylst);
  7    for i in nvl(l_emptylst.first, 0) .. nvl(l_emptylst.last, 0) loop
  8      dbms_output.put_line(l_emptylst(i).objtype || '/' || l_emptylst(i).objname);
  9    end loop;
 10  end;
 11  /
INDEX/AM21_N1
TABLE/AM21

PL/SQL procedure successfully completed.

DBMS_STATS.GATHER_DATABASE_STATS

DBMS_STATS.GATHER_DATABASE_STATS gathers statistics for the complete database. In 8i, this will generate statistics for the SYS schema also. This has been rectified in Oracle 9i. For Oracle 8i, an alternative is to generate statistics for individual schemas or delete SYS schema statistics after generating statistics at the database level.

Call Syntax

 dbms_stats.gather_database_stats(estimate_percent, block_sample, method_opt, degree, granularity,
	cascade, stattab, statid, options, objlist, statown);

9.4.2) Deleting statistics with DBMS_STATS



DBMS_STATS.DELETE_TABLE_STATS

DBMS_STATS.DELETE_TABLE_STATS deletes table statistics.



Call Syntax

dbms_stats.delete_table_stats
	(ownname, tabname, partname, 
	stattab, statid, cascade_parts, 
	cascade_columns, cascade_indexes, 
	statown);


  • cascade_parts - delete statistics for all partitions (partname should be null).
  • cascade_columns - delete column statistics. Default is true.
  • cascade_indexes - delete index statistics. Default is true.


e.g.: Delete statistics for a table and its columns and indexes.



orAP>exec dbms_stats.delete_table_stats(ownname => 'APPS', tabname => 'AM21');
PL/SQL procedure successfully completed.

e.g.: Delete statistics for table only. Column and index statistics will be preserved.

orAP>exec dbms_stats.delete_table_stats(ownname => 'APPS', tabname => 'AM21', -
> cascade_columns => false, cascade_indexes => false);
PL/SQL procedure successfully completed.

DBMS_STATS.DELETE_COLUMN_STATS

DBMS_STATS.DELETE_COLUMN_STATS deletes individual column statistics.

Call Syntax

dbms_stats.delete_column_stats(ownname, tabname, colname, partname, stattab,
	statid, cascade_parts, statown);

e.g.: Deleting statistics for one column.

orAP>exec dbms_stats.delete_column_stats(ownname => 'APPS', tabname => 'AM21', -
> colname => 'DESCRIPTION');
PL/SQL procedure successfully completed.

DBMS_STATS.DELETE_INDEX_STATS

DBMS_STATS.DELETE_INDEX_STATS deletes individual index statistics.

Call Syntax

dbms_stats.delete_index_stats(ownname, indname, partname, stattab, statid, 
cascade_parts, statown);

e.g.: Deleting index statistics.

orAP>exec dbms_stats.delete_index_stats(ownname => 'APPS', indname => 'AM21_N1');
PL/SQL procedure successfully completed.

DBMS_STATS.DELETE_SCHEMA_STATS

DBMS_STATS.DELETE_SCHEMA_STATS deletes the complete schema statistics.

Call Syntax

dbms_stats.delete_schema_stats(ownname, stattab, statid, statown);

e.g.: Deleting statistics for schema FA.

SQL> exec dbms_stats.delete_schema_stats('FA');
PL/SQL procedure successfully completed.

DBMS_STATS.DELETE_DATABASE_STATS

DBMS_STATS.DELETE_DATABASE_STATS deletes the complete database statistics.

Call Syntax

dbms_stats.delete_database_stats(stattab, statid, statown);

9.4.3) Providing user statistics with DBMS_STATS

DBMS_STATS.SET_TABLE_STATS

Use this routine to set your own statistics in the dictionary instead of the RDBMS statistics.

Call Syntax

dbms_stats.set_table_stats(ownname, tabname, partname, stattab, statid, numrows, 
numblks, avgrlen, flags, statown);
  • numrows - number of rows.
  • numblks - blocks in the table.
  • avgrlen - average row length.
  • flags - currently for internal use only.

e.g.:

SQL> exec dbms_stats.set_table_stats(ownname => 'JASHAN', tabname => 'TMP_CKFA', -
> numrows => 12422, numblks => 100, avgrlen => 124);
PL/SQL procedure successfully completed.


jaJA>select owner, num_rows, blocks, avg_row_len
  2  from dba_tables
  3  where table_name = 'TMP_CKFA';
OWNER               |  NUM_ROWS|   BLOCKS|AVG_ROW_LEN
____________________|__________|__________|___________
JASHAN              |     12422|       100|       124

DBMS_STATS.SET_COLUMN_STATS

DBMS_STATS.SET_COLUMN_STATS sets column statistics explicitly.

Call Syntax

dbms_stats.set_column_stats(ownname, tabname, colname, partname, stattab, statid, 
	distcnt, density, nullcnt, srec, avgclen, 
	flags, statown);
  • distcnt - number of distinct values.
  • density - column density. If null, it is derived from distcnt.
  • nullcnt - null count.
  • srec - record of type DBMS_STATS.STATREC, value populated by
  • call to PREPARE_COLUMN_VALUES or GET_COLUMNS_STATS.
  • avgclen - average column length.

e.g.: Setting statistics for one column of a table.

jaJA>exec dbms_stats.set_column_stats(ownname => 'JASHAN', tabname => 
'TMP_CKFA', -
> colname => 'CODE', distcnt => 1000, density => 5, nullcnt => 0, 
avgclen => 12);
PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, low_value, high_value, density,
  2         num_nulls, num_buckets, avg_col_len
  3  from   dba_tab_columns
  4* where  table_name = 'TMP_CKFA' and column_name = 'CODE'
COLUMN_NAME|NUM_DISTINCT|LOW_VA|HIGH_VA|DENSITY| NUM_NULLS|NUM_BUCKETS|AVG_COL_LEN
___________|____________|______|_______|_______|__________|___________|___________
CODE       |        1000|      |       |      5|         0|          1|         12

DBMS_STATS.SET_INDEX_STATS

Set index statistics.

Call Syntax

dbms_stats.set_index_stats(ownname, indname, partname, stattab, statid, numrows, numlblks, 
	numdist, avglblk, avgdblk, clstfct, indlevel, flags, statown);
  • numlblks - number of leaf blocks.
  • numdist - number of distinct keys.
  • avglblk - average number of leaf blocks in which each distinct key appears.
  • avgdblk - average number of data blocks in the table pointed to by the distinct keys.
  • clstfct - clustering factor.
  • indlevel - Height of the index.

e.g.:

jaJA>exec dbms_stats.set_index_stats(ownname => 'JASHAN', indname => 'TMP_CKFA_N1', -
> numrows => 1000, numlblks => 100, numdist => 100, avglblk => 1, avgdblk => 12, -
> clstfct => 1000, indlevel => 2);

PL/SQL procedure successfully completed.

jaJA>select num_rows, blevel, leaf_blocks, avg_leaf_blocks_per_key,
  2         avg_data_blocks_per_key, clustering_factor, user_stats
  3  from   dba_indexes
  4  where  index_name = 'TMP_CKFA_N1';
  
NUM_ROWS|BLEVEL|LEAF_BLOCKS|AVG_LEAF_B|AVG_DATA_B|CLUSTERING_F|USE
________|______|___________|__________|__________|____________|___
    1000|     2|        100|         1|        12|        1000|YES

9.4.4) Retrieveing statistics with DBMS_STATS

DBMS_STATS.GET_TABLE_STATS


Get table statistics.


Call syntax

dbms_stats.get_table_stats(ownname, tabname, 
	partname, stattab, statid, numrows,
	numblks, avgrlen, statown);


e.g.: getting table statistics data.

SQL> declare
  2    l_numrows  number;
  3    l_numblks  number;
  4    l_avgrlen  number;
  5  begin
  6    dbms_stats.get_table_stats(ownname => 'SYS', tabname => 'AM01', 
  7                       numrows => l_numrows, numblks => l_numblks, avgrlen => l_avgrlen);
  8    dbms_output.put_line('No. of rows: ' || l_numrows);
  9    dbms_output.put_line('No. of blks: ' || l_numblks);
 10    dbms_output.put_line('Avg row length: ' || l_avgrlen);
 11  end;
 12  /
No. of rows: 4106860
No. of blks: 6219
Avg row length: 3

PL/SQL procedure successfully completed.

DBMS_STATS.GET_COLUMN_STATS

Get column statistics present in the dictionary.

Call syntax

dbms_stats.get_column_stats(ownname, tabname, colname, partname, stattab, statid, 
           distcnt, density, nullcnt, srec, avgclen, statown);

e.g.: getting statistics for a column.

SQL> declare
  2    l_distcnt number;
  3    l_density number;
  4    l_nullcnt number;
  5    l_srec   dbms_stats.statrec;
  6    l_avgclen number;
  7  begin
  8    dbms_stats.get_column_stats(ownname => 'SYS', tabname => 'AM01',
  9               colname => 'COL1', distcnt => l_distcnt, density => l_density,
 10               nullcnt => l_nullcnt, srec => l_srec, avgclen => l_avgclen);
 11    dbms_output.put_line('No. of distinct values: ' || l_distcnt);
 12    dbms_output.put_line('Density: ' || l_density);
 13    dbms_output.put_line('Count of nulls: ' || l_nullcnt);
 14    dbms_output.put_line('Avg. column length: ' || l_avgclen);
 15  end;
 16  /
No. of distinct values: 2
Density: .5
Count of nulls: 0
Avg. column length: 3

PL/SQL procedure successfully completed.

DBMS_STATS.GET_INDEX_STATS

Get index statistics.

Call syntax

dbms_stats.get_index_stats(ownname, indname, partname, stattab, statid, 
		numrows, numlblks, numdist, avglblk, avgdblk, 
		clstfct, indlevel, statown);

e.g.: getting an index statistics.

SQL> declare
  2    l_numrows   number;
  3    l_numlblks  number;
  4    l_numdist   number;
  5    l_avglblk   number;
  6    l_avgdblk   number;
  7    l_clstfct   number;
  8    l_indlevel  number;
  9  begin
 10    dbms_stats.get_index_stats(ownname => 'SYS', indname => 'AM01_N1',
 11                   numrows => l_numrows, numlblks => l_numlblks,
 12                   numdist => l_numdist, avglblk => l_avglblk,
 13                   avgdblk => l_avgdblk, clstfct => l_clstfct,
 14                   indlevel => l_indlevel);
 15    dbms_output.put_line('No. of rows: ' || l_numrows);
 16    dbms_output.put_line('No. of blks: ' || l_numlblks);
 17    dbms_output.put_line('No. of distinct values: ' || l_numdist);
 18    dbms_output.put_line('Avg leaf blocks for distinct keys: ' || l_avglblk);
 19    dbms_output.put_line('Avg data blocks pointed to in the table: ' || l_avgdblk);
 20    dbms_output.put_line('Clustering factor: ' || l_clstfct);
 21    dbms_output.put_line('Index height: ' || l_indlevel);
 22  end;
 23  /
No. of rows: 3819711
No. of blks: 11092
No. of distinct values: 1
Avg leaf blocks for distinct keys: 11092
Avg data blocks pointed to in the table: 14616
Clustering factor: 14616
Index height: 2

PL/SQL procedure successfully completed.

9.4.5) Exporting and importing statistics with DBMS_STATS

DBMS_STATS also includes routines for gathering statistics and storing them outside the dictionary. This does not influence the optimizer. Most of the procedures in this package as mentioned above have three common parameters - STATID, STATTAB and STATOWN that are related to user processing of statistics.

Advantages of this feature:

1. Estimated statistics at different percentages could be stored and used for testing.

2. Statistics generated on one database could be transferred to another database.

DBMS_STATS.CREATE_STAT_TABLE

DBMS_STATS.CREATE_STAT_TABLE creates a user statistics table for storing dictionary statistics.

Call syntax

dbms_stats.create_stat_table(ownname, stattab, tblspace);
  • stattab - statistics table name.
  • tblspace - tablespace to be used.

e.g.: creating a user statistics table.

SQL> exec dbms_stats.create_stat_table(ownname => 'SYS', stattab => 'STAT_AT_5PC', -
>  tblspace => 'SYSTEM');

PL/SQL procedure successfully completed.

SQL> desc stat_at_5pc
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ----------------
 STATID                                                         VARCHAR2(30)
 TYPE                                                           CHAR(1)
 VERSION                                                        NUMBER
 FLAGS                                                          NUMBER
 C1                                                             VARCHAR2(30)
 C2                                                             VARCHAR2(30)
 C3                                                             VARCHAR2(30)
 C4                                                             VARCHAR2(30)
 C5                                                             VARCHAR2(30)
 N1                                                             NUMBER
 N2                                                             NUMBER
 N3                                                             NUMBER
 N4                                                             NUMBER
 N5                                                             NUMBER
 N6                                                             NUMBER
 N7                                                             NUMBER
 N8                                                             NUMBER
 N9                                                             NUMBER
 N10                                                            NUMBER
 N11                                                            NUMBER
 N12                                                            NUMBER
 D1                                                             DATE
 R1                                                             RAW(32)
 R2                                                             RAW(32)
 CH1                                                            VARCHAR2(1000)

DBMS_STATS.EXPORT_TABLE_STATS

DBMS_STATS.EXPORT_TABLE_STATS retrieves table statistics for a particular table and puts it in the user statistics table.

Call syntax

dbms_stats.export_table_stats(ownname, tabname, partname, stattab, statid, 
			                      cascade, statown);
  • cascade - column and index statistics are also exported.

e.g.: exporting AM01 stat for testing purpose, including table and indexes.

SQL> exec dbms_stats.export_table_stats(ownname => 'SYS', tabname => 'AM01', -
> stattab => 'STAT_AT_5PC', cascade => true, statown => 'SYS');

PL/SQL procedure successfully completed.

DBMS_STATS.EXPORT_COLUMN_STATS

DBMS_STATS.EXPORT_COLUMN_STATS retrieves column statistics for a particular table and puts it in the user statistics table.

Call syntax

dbms_stats.export_table_stats(ownname, tabname, colname, partname, stattab, statid, statown);

DBMS_STATS.EXPORT_INDEX_STATS

DBMS_STATS.EXPORT_INDEX_STATS retrieves index statistics for a particular index and puts it in the user statistics table.

Call syntax

dbms_stats.export_index_stats(ownname, indname, partname, stattab, statid, statown);

DBMS_STATS.EXPORT_SCHEMA_STATS

DBMS_STATS.EXPORT_SCHEMA_STATS retrieves statistics for a schema and puts it in the user statistics table.

Call syntax

dbms_stats.export_schema_stats(ownname, stattab, statid, statown);

DBMS_STATS.EXPORT_DATABASE_STATS

DBMS_STATS.EXPORT_DATABASE_STATS retrieves statistics for the complete database and puts it in the user statistics table.

Call syntax

dbms_stats.export_database_stats(stattab, statid, statown);

DBMS_STATS.IMPORT_TABLE_STATS

DBMS_STATS.IMPORT_TABLE_STATS retrieves statistics for a table from a user statistics table and stores it in dictionary.

Call syntax

dbms_stats.import_table_stats(ownname, tabname, partname, stattab, statid, 
	cascade, statown);

e.g.: importing statistics for table am01, including column and indexes.

SQL> exec dbms_stats.import_table_stats(ownname => 'SYS', tabname => 'AM01', -
> stattab => 'STAT_AT_5PC', cascade => true, statown => 'SYS');

PL/SQL procedure successfully completed.

DBMS_STATS.IMPORT_COLUMN_STATS

DBMS_STATS.IMPORT_COLUMN_STATS retrieves statistics for a column from a user statistics table and stores it in dictionary.

Call syntax

dbms_stats.import_column_stats(ownname, tabname, colname, partname, stattab, statid, statown);

DBMS_STATS.IMPORT_INDEX_STATS

Retrieve statistics for an index from a user statistics table and store it in dictionary.

Call syntax

dbms_stats.import_index_stats(ownname, indname, partname, stattab, statid, statown);

DBMS_STATS.IMPORT_SCHEMA_STATS

DBMS_STATS.IMPORT_SCHEMA_STATS Rretrieves statistics for a schema from a user statistics table and stores it in dictionary.

Call syntax

dbms_stats.import_schema_stats(ownname, stattab, statid, statown);

DBMS_STATS.IMPORT_DATABASE_STATS

DBMS_STATS.IMPORT_DATABASE_STATS retrieves statistics for the database from a user statistics table and stores it in dictionary.

Call syntax

dbms_stats.import_schema_stats(stattab, statid, statown);

DBMS_STATS.DROP_STAT_TABLE

DBMS_STATS.DROP_STAT_TABLE drops a user statistics table.

Call syntax

dbms_stats.drop_stat_table(ownname, stattab);

e.g.: dropping my stat table.

SQL> exec dbms_stats.drop_stat_table(ownname => 'SYS', stattab => 'STAT_AT_5PC');

PL/SQL procedure successfully completed.

We will continue with Histograms and DML Monitoring in the next installment of this series.

Due to the extent of coverage for each section, the topics to be covered have been segregated over more than the originally planned installments mentioned in part 1 of this series.

» See All Articles by Columnist Amar Kumar Padhi

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved