Examples‎ > ‎

Tunable Queries

DBO is successful at tuning queries that the native optimizer fails to optimizer correctly.

For a query to have a sub-optimal plan the database optimizer has to miscalculate the cost of the various possible access paths and mistakenly choose a bad path. The access path calculations can be miscalculated because
  1. The table/index statistics are missing or wrong (ie the number or rows is missing or way off, for example)
  2. The data is skewed, ie the number of orders  with an open status is usually low compared to all the ones that have a closed status because the work is complete (ie orders get filled every day, but only  a few are open and needing to be processed). Looking for open orders should probably use and index and return fewer rows than looking for closed orders which should probably just do a full table scan.
  3. The predicates used are correlated. The optimizer treats two predicate filter on a table as more selective than just one, but this is not always the case  like how many Swedes speak Swedish  which is basically returns the same number of results as just asking for Swedes alone.  Another example is how many Swedes speak Swahili which is probably way more selective than the optimizer would guess.
  4. A bug in the optimizer
A query that can be tuned for DBO needs at least one good path and one bad path. 
A clearly bad path will require enough data to make the paths sufficiently different.
 
The easiest case of having a good path and a bad path is using and index or not.
If  the optimizer to either uses the index when it shouldn't   or when 
the optimizer doesn't use an index and it should (more dramatic).
 
For Example
select E.EMPLOYEE_ID , E.LAST_NAME , E.SALARY ,
              D.DEPARTMENT_NAME , L.CITY , C.COUNTRY_NAME 
from    HR.COUNTRIES C,
              HR.LOCATIONS L,
              HR.DEPARTMENTS D,
              HR.EMPLOYEES E 
where D.DEPARTMENT_ID=E.DEPARTMENT_ID and
              D.LOCATION_ID=L.LOCATION_ID and 
              L.COUNTRY_ID=C.COUNTRY_ID and 
              E.SALARY > 9000 and  
              L.CITY !='Seattle'
Grapically is
                   

Joins
All 4 tables are joined , ie no Cartesian joins
Filters
There are two predicate filters 
    Salary > 9000 
    city!=’Seattle’ 
These two predicates filters are the ones that would be the easiest for trying to get the optimizer to go wrong.
For example if there is an index on salary then we could mess up the optimizer by trying to get it to think that > $9000 will be most of the employees and so it doesn’t use the index when actually the result set is small and it should use an index.
Oracle tracks the high value, low value and number of distinct, so I’d guess set the high value super high, set the low value just under $9000 and put all but 1 value say under $9000.  Oracle will think there is an even distribution of values across the range and not use the index , I think.
I have to try.
Do you have the creation scripts?
Do you have data  generation scripts?
Comments