11G OCM Use multi column statistics

References

Oracle® Database Performance Tuning Guide 11g Release 1 (11.1)

Chapter 13, "Managing Optimizer Statistics"
- Manual Statistics Gathering
  - Gathering Statistics with DBMS_STATS Procedures
    - MultiColumn Statistics

Oracle® Database PL/SQL Packages and Types Reference 11g Release 1 (11.1)

Oracle® Database Reference 11g Release 1 (11.1)

Static Data Dictionary Views
- ALL_STAT_EXTENSIONS
- ALL_TAB_COL_STATISTICS

Overview

The 11G OCM Upgrade Exam objective is:

Use multi column statistics

This seems to mean:

Collection of multi-column statistics
Influencing plan chosen because of multi-column statistics

Procedure

The use of multicolumn statistics is needed when the table is not in at least third normal form. That is, the value of a column is only dependent on the primary key.

The discussion in MultiColumn Statistics gives the example of the following two columns in the sample SH.CUSTOMERS table:

CUST_STATE_PROVINCE
COUNTRY_ID

With individual column statistics the optimizer has no way of knowing that the cust_state_province and the country_id columns are related. By gathering statistics on these columns as a group (column group), the optimizer will now have a more accurate selectivity value for the group, instead of having to generate the value based on the individual column statistics.

Creating a Column Group

Following the example given in MultiColumn Statistics, the following PL/SQL block is run in the SH schema (using the CREATE_EXTENDED_STATISTICS procedure):

declare cg_name varchar2(30); begin cg_name := dbms_stats.create_extended_stats(null,'customers', '(cust_state_province,country_id)'); end; /

Monitoring Column Groups

The ALL_STAT_EXTENSIONS view shows:

Using the query from MultiColumn Statistics, but on the ALL_STAT_EXTENSIONS and ALL_TAB_COL_STATISTICS views instead, the following query returns no rows found:

SELECT e.extension col_group, t.num_distinct, t.histogram FROM all_stat_extensions e INNER JOIN all_tab_col_statistics t ON ( e.extension_name = t.column_name AND e.table_name = t.table_name ) WHERE t.table_name = 'CUSTOMERS';

Gathering Statistics on Column Groups

Onto the new step of Gathering Statistics on Column Groups in MultiColumn Statistics. Here we use the GATHER_TABLE_STATS procedure:

BEGIN DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS', METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY' ); END; /

Checking the statistics as before via:

SELECT e.extension col_group, t.num_distinct, t.histogram FROM all_stat_extensions e INNER JOIN all_tab_col_statistics t ON ( e.extension_name = t.column_name AND e.table_name = t.table_name ) WHERE t.table_name = 'CUSTOMERS';

COL_GROUP

("CUST_STATE_PROVINCE","COUNTRY_ID")

NUM_DISTINCT

145

HISTOGRAM

FREQUENCY

Explain Plan

The Explain Plan for the following query is:

EXPLAIN PLAN FOR select * from customers where cust_state_province = 'CA' and country_id=52790; SELECT * FROM TABLE( dbms_xplan.display() ); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------- Plan hash value: 2008213504 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3448 | 646K| 406 (1)| 00:00:05 | |* 1 | TABLE ACCESS FULL| CUSTOMERS | 3448 | 646K| 406 (1)| 00:00:05 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790) 13 rows selected.