
Oracle: IO Waits

Datapump dump file I/O
dbms_file_transfer I/O
DG Broker configuration file I/O
Log file init write
buffer read retry
BFILE read
db file single write

Standard I/O

db file sequential read - single block read
db file scattered read  - multi block read
db file parallel read   - non-contiguous multi block read
read by other session   - wait for another session to do the IO

select name, parameter1, parameter2, parameter3
from v$event_name
where name in ('db file sequential read', 'db file scattered read',
               'db file parallel read', 'read by other session');

NAME                             P1        P2      P3 
----------------------- ----------- --------- -------- 
db file sequential read       file#    block# blocks 
db file scattered read        file#    block# blocks 
db file parallel read         files    blocks requests 
read by other session         file#    block# class#
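
These parameters can be pulled live from V$SESSION_WAIT, and the P1TEXT/P2TEXT/P3TEXT columns label each value so you don't have to memorize the table above. A minimal sketch (the IN-list mirrors the events above):

select sid, event,
       p1text, p1,     -- file# (or a file count for db file parallel read)
       p2text, p2,     -- block#
       p3text, p3      -- blocks, requests, or class#
from   v$session_wait
where  event in ('db file sequential read', 'db file scattered read',
                 'db file parallel read', 'read by other session');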

db file sequential read

  • Top reported wait
  • Single block read
  • block accessed via
    •  index 
    •  rowid
    •  rollback 

select * from emp where empno=99;

where there is an index on emp(empno)

search the buffer cache for the block by rowid;
if that fails, read the block off disk via file# and block#

Note: "sequential" means a sequence as in rowid

db file scattered read

  • Multi Block Read
    •  Full Table Scan 
    •  Index Fast Full Scan

select * from emp;

search the buffer cache for the block by rowid;
if that fails, read the blocks off disk via file#, block# and # of blocks

Note: "scattered" means blocks are scattered throughout the buffer cache (even though they are laid out sequential on disk)

db_file_multiblock_read_count sets the target multiblock read size (in blocks)
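
A quick, illustrative way to see the effect (128 here is an arbitrary value): bump the parameter for your session and watch P3 of the scattered-read waits, which reports the blocks per read.

alter session set db_file_multiblock_read_count = 128;

select p1 file#, p2 block#, p3 blocks
from   v$session_wait
where  event = 'db file scattered read';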

db file parallel read

Process issues multiple single block reads in parallel
  • Documentation says only for recovery
  • but it seems to happen for normal read operations as well
  • asynchronous call – waits for all the reads to complete
  • non-contiguous multi block read
  • index full or range scan
  • WHERE ... IN (value list) predicates

Select * from emp where empno in (1,2,3,4,5,6);



read by other session

   Multiple sessions reading the same data that requires IO

Example: two users doing a full table scan at the same time

  • Block not found in cache
  • Read block from disk
  • Found other session already reading block from disk
  • Wait for the other session to finish IO
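
For this event P3 is the block class#. One common way to decode it is shown below — this assumes the convention that class# follows the row order of V$WAITSTAT, which is worth verifying on your version:

select s.sid, s.p1 file#, s.p2 block#, w.class
from   v$session_wait s,
       (select class, rownum class# from v$waitstat) w
where  s.event = 'read by other session'
and    w.class# = s.p3;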

Direct I/O

This mechanism lets the client bypass the buffer cache for I/O intensive operations. 
The disk blocks are written into and read from process private memory.

select name, parameter1, parameter2, parameter3
from v$event_name
where name like 'direct path%';

NAME                    P1                 P2       P3
----------------------- ----------- --------- --------
direct path read        file number first dba block cnt
direct path read temp   file number first dba block cnt
direct path write       file number first dba block cnt
direct path write temp  file number first dba block cnt

Direct I/O read and write size are controlled by the parameters

_smm_auto_min_io_size               56
_smm_auto_max_io_size               248

See Doc ID: 330818.1 Increasing db_file_multiblock_read_count Results In Slower Performance Due To More & Smaller Direct Path Waits
Note: from Bug 4047116, the value 248 can be used when the I/O size is 256KB,
whereas 1008 can be used when the I/O size is 1MB. Both the min and the
max parameters must be set to the same value.

direct path read 

parallel query

(or using _serial_direct_read = true)


Grouping by Query Coordinator:

select sql_id,
       qc_session_id qsid,
       count(*) cnt,
       count(distinct session_id) deg,
       nvl(o.object_name,to_char(current_obj#)) obj,
       o.object_type otype,
       decode(session_state, 'WAITING', event, 'CPU') event
from   v$active_session_history ash,
       all_objects o
where  o.object_id (+)= ash.current_obj#
and    qc_session_id is not null
group by qc_session_id, sql_id, o.object_name,
         o.object_type, current_obj#, event, session_state
order by qc_session_id, sql_id;

SQL_ID        QSID  CNT DEG OBJ          OTYPE EVENT
------------- ---- ---- --- ------------ ----- --------------------------
7p3jt75phub2d  144  386   4 WAIT_OBJECTS TABLE  PX Deq Credit: send blkd
               144    4   3 WAIT_OBJECTS TABLE  PX qref latch
               144   37   1 WAIT_OBJECTS TABLE  db file sequential read
               144    3   2 WAIT_OBJECTS TABLE  direct path read
               144   70   1 WAIT_OBJECTS TABLE  CPU
               144   21   4 0                   PX Deq Credit: send blkd
               144   12   4 0                   db file sequential read

direct path write

  • loading LOBs
  • parallel DML
  • create table as select
  • create index

There seems to be a bug in 10g where direct path write waits are incorrectly recorded as CPU.

 (from Metalink)
"If asynchronous IO is not being used then the IO write request blocks until completed but this does not show as a wait at the time the IO is issued. The session returns later to pick up the completed IO data but can then show a wait on "direct path write" even though this wait will return immediately."

11g: There is a hidden parameter, _direct_path_insert_features, which seems to control direct path behaviour. alter session set "_direct_path_insert_features" = 1 turns off direct path for insert .. values().


direct path read temp

direct path write temp

These events happen when a sort overflows the memory work area and has to be written to disk.
If this is a problem, consider increasing PGA_AGGREGATE_TARGET (or SORT_AREA_SIZE under a manual workarea policy, discussed below).


The explain plan can give estimated sort size requirements, and extended row source statistics can give the actual sort size used by the SQL statement.
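
For example, with row source statistics enabled, DBMS_XPLAN (10g and up) reports the actual memory and temp usage of the last execution in the Used-Mem / Used-Tmp columns. A sketch — the first query is just a stand-in sort:

select /*+ gather_plan_statistics */ owner, object_name
from   all_objects
order by owner, object_name;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));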

select s.sid, --s.osuser, s.process,
       s.sql_id, tmp.segtype,
       (tmp.blocks*8)/1024 MB,   -- assumes an 8K block size
       tmp.tablespace
from   v$tempseg_usage tmp,
       v$session s
where  tmp.session_num = s.serial#
and    segtype in ('HASH','SORT')
order by blocks desc;

SID SQL_ID        SEGTYPE MB TABLESPACE
--- ------------- ------- -- ----------
 90 gm5s0vc0vzppc SORT     2 TEMP03
105 gm5s0vc0vzppc SORT     2 TEMP03
 80 gm5s0vc0vzppc SORT     2 TEMP03
105 gm5s0vc0vzppc HASH     2 TEMP03
102 gm5s0vc0vzppc SORT     2 TEMP03
 90 gm5s0vc0vzppc HASH     2 TEMP03

The following information is from Jonathan Lewis, Randolf Geist and Joze Senegacnik:

On certain versions of Oracle, this doesn't work:

alter session set workarea_size_policy = manual;
alter session set sort_area_size = 104857600;

broke in 
fixed in

Workaround for serial connections: just run the command twice:

alter session set sort_area_size = 104857600;
alter session set sort_area_size = 104857600;

But this doesn't work for parallel sessions, which don't pick up the manual size at all and instead use the system default (64K). You can change the system default, but that is a one-size-fits-all workaround.
This does not seem to affect HASH_AREA_SIZE. The hash_area_size parameter is an integer in bytes, so it is probably limited to approximately 2GB.

On 9i the upper limit for sort area size when using PGA_AGGREGATE_TARGET was fixed at 100M for a single session, unless undocumented parameters were used:
_pga_max_size: maximum PGA size for a single process
               defaults to 200M
_smm_max_size: maximum workarea size for one process
               from testing, this parameter appears to be in kilobytes

In 10gR2 the 100M limit no longer applies when PGA_AGGREGATE_TARGET > 1000M.
In 10.2 it is limited to 200M as long as pga_aggregate_target is smaller than 1GB.

When pga_aggregate_target is larger than 1GB, _pga_max_size = 20% of pga_aggregate_target.
The pga_aggregate_target now drives _smm_max_size in most cases:
pga_aggregate_target <= 500MB: _smm_max_size = 20% * pga_aggregate_target
pga_aggregate_target between 500MB and 1000MB: _smm_max_size = 100MB
pga_aggregate_target > 1000MB: _smm_max_size = 10% * pga_aggregate_target
and _smm_max_size in turn drives _pga_max_size: _pga_max_size = 2 * _smm_max_size
A pga_aggregate_target larger than 1000MB will now allow much higher default thresholds in 10gR2: a pga_aggregate_target of 5GB allows an _smm_max_size of 500MB (was 100MB before) and a _pga_max_size of 1000MB (was 200MB).
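
To check which values the instance actually derived, the hidden parameters can be read from the X$ tables when connected as SYS. A sketch — X$ structures are undocumented and vary by version:

select a.ksppinm  name,
       b.ksppstvl value
from   x$ksppi  a,
       x$ksppcv b
where  a.indx = b.indx
and    a.ksppinm in ('_pga_max_size','_smm_max_size');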

direct path write (lob)

Special Cases

Analyzing IO


        (sql code -> io by file)
 CNT AAS EVENT                   P1 TABLESPACE
---- --- ----------------------- --- ----------
   1 .00 db file sequential read   1 SYSTEM
   2 .00 db file sequential read   3 SYSAUX
  38 .06 db file sequential read   6 SOE
 179 .30 db file sequential read   7 SOEINDEX


          (sql code -> io by object)
 CNT     AAS OBJN                     OTYPE
---- ------- ------------------------- ---------------
  79     .00 52949 ORDER_ITEMS         TABLE PARTITION
  97     .00 -1
 130     .00 53117 ORD_STATUS_IX       INDEX
 498     .01 53120 CUST_EMAIL_IX       INDEX
 512     .01 0
1632     .03 53112 ITEM_ORDER_IX       INDEX

Object Advanced

         (sql code -> io by object advanced)
CNT AAS EVENT           OBJ                P1 TABLESPACE_NAME
--- --- --------------- ----------------- --- ---------------
  1 .02 db file sequent ORDER_ITEMS_PK      7 SOEINDEX
  1 .02 db file sequent 0                   3 SYSAUX
  2 .03 db file scatter WAIT_OBJECTS        1 SYSTEM
  2 .03 db file sequent ORDER_ITEMS         6 SOE
  3 .05 db file sequent CUST_EMAIL_IX       7 SOEINDEX
  4 .07 db file sequent CUST_LNAME_IX       7 SOEINDEX
  6 .10 db file sequent ORD_ORDER_DATE_IX   7 SOEINDEX
 13 .22 db file sequent ITEM_ORDER_IX       7 SOEINDEX
 25 .42 db file sequent 0                   2 UNDOTBS1

Statement Basic

           (sql code -> basic statement IO analysis)

 CNT SQL_ID        CNT AAS  OBJN OBJ             P1 TABLESPACE
---- ------------- --- --- ----- --------------- -- ----------
  30 0yas01u2p9ch4   1 .01 53113 ITEM_PRODUCT_IX  7 SOEINDEX
                     1 .01 53079 ORDER_ITEMS_UK   7 SOEINDEX 
                    28 .16 53112 ITEM_ORDER_IX    7 SOEINDEX
  58 6v6gm0fd1rgrz   4 .02 54533 WAIT_OBJECTS     1 SYSTEM
                    54 .30     0 0                2 UNDOTBS1

Statement Advanced

            (sql code -> advanced statement IO analysis)

---- ----------------- ----------- ------- ---------- ---------- -- --------- ---------
.00 f9u2k84v884y7  33 CUSTOMERS    SYS_P27 TABLE PART  sequentia  1 SYSTEM    PERMANENT    
                   33 ORDER_PK             INDEX       sequentia  4 USERS     PERMANENT
                   33                                  sequentia  2 UNDOTBS1  UNDO
.01 0tvtamt770hqz 100 TOTO1                TABLE       scattered  7 NO_ASSM   PERMANENT
.06 75621g9y3xmvd   3 CUSTOMERS    SYS_P36 TABLE PART  sequentia  4 USERS     PERMANENT
                    3 CUSTOMERS    SYS_P25 TABLE PART  sequentia  4 USERS     PERMANENT
                    3 CUSTOMERS    SYS_P22 TABLE PART  sequentia  4 USERS     PERMANENT
                    3 CUSTOMERS    SYS_P29 TABLE PART  sequentia  4 USERS     PERMANENT
                    3 CUSTOMERS    SYS_P21 TABLE PART  sequentia  4 USERS     PERMANENT
                    3 CUSTOMERS    SYS_P35 TABLE PART  sequentia  4 USERS     PERMANENT
                    3 CUSTOMERS    SYS_P32 TABLE PART  sequentia  4 USERS     PERMANENT
                    3 CUSTOMERS    SYS_P27 TABLE PART  sequentia  4 USERS     PERMANENT
                    3 CUSTOMERS    SYS_P34 TABLE PART  sequentia  4 USERS     PERMANENT
                    4 CUSTOMERS    SYS_P23 TABLE PART  sequentia  4 USERS     PERMANENT
                    5 CUSTOMERS    SYS_P33 TABLE PART  sequentia  4 USERS     PERMANENT
               50 CUSTOMERS_PK        INDEX       sequentia  4 USERS 

Missing Object Names

Notice that in some of the analyses the object name or object id is missing. Sometimes the OBJECT ID is -1 or 0 in the ASH data for I/O, so the above scripts will be missing the object name.
In this case there is a second, though more costly, way to get the object name: use the FILE# and BLOCK# being read. The FILE# and BLOCK# can be translated into the object via a query like:

select owner, segment_name, segment_type
from   dba_extents
where  file_id = &file                  -- File  = P1
and    &block between block_id          -- Block = P2
                  and block_id + blocks - 1;

This query is expensive and slow. If you are going to run it more than once, it's recommended to create a copy of DBA_EXTENTS, like this:

Create table my_extents as select * from DBA_EXTENTS;
Create index my_extents_i on my_extents(FILE_ID,BLOCK_ID);

To give an idea of just how much faster a copy of DBA_EXTENTS is, here is an example from my machine (without even creating an index):

Enter value for file: 3
Enter value for block: 6619

OWNER           SEGMENT_NAME         SEGMENT_TYPE
--------------- -------------------- ------------------
WMSYS           LOG_TAB_PK           INDEX

Elapsed: 00:00:41.25

Above I queried for just one block. Below I look up over 150 objects in less time!

create table myextents as select * from dba_extents;
Table created.

Elapsed: 00:01:25.73

col own for a7
col segment_name for a40
col segment_type for a15
col cnt for 9999

select count(*) cnt,
       owner own,
       segment_name,
       segment_type
from   v$active_session_history ash,
       myextents ext
where  ( event like 'db file s%' or event like 'direct%' )
and    (current_obj# in (0,-1) or current_obj# is null)
and    sample_time > sysdate - &v_minutes/(60*24)
and    session_state = 'WAITING'
and    P1 = file_id
and    P2 between block_id and block_id + blocks - 1
group by
       owner, segment_name, segment_type

 CNT OWN  SEGMENT_NAME     SEGMENT_TYPE
---- ---- ---------------- ------------
 993 SYS _SYSSMU7$        TYPE2 UNDO

150 rows selected.

Elapsed: 00:00:01.03

Another option is querying X$BH just after the IO has occurred. X$BH reveals which blocks are in the buffer cache, so if the blocks from the recent physical IOs are still cached, we can look them up there.
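
A sketch of that lookup (run as SYS; this assumes X$BH's OBJ column holds the data object id, joinable to DBA_OBJECTS.DATA_OBJECT_ID):

select o.owner, o.object_name, o.object_type
from   x$bh bh, dba_objects o
where  bh.obj    = o.data_object_id
and    bh.file#  = &file     -- P1
and    bh.dbablk = &block;   -- P2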

Tuning IO

Check average read times per file
    Should be between 5-20 ms
    ADDM expects I/Os to be under 20ms
        The DBIO_EXPECTED parameter defaults to 10000 microseconds
        Change this with DBMS_ADVISOR.SET_DEFAULT_TASK_PARAMETER('ADDM', 'DBIO_EXPECTED', <new value>)
Check the read times in
    Statspack under "File IO Stats" (8i and up)
    DBA_HIST_FILESTATXS for historical data (10g)
    V$FILEMETRIC_HISTORY for recent activity (10g)
Check the buffer cache hit ratio
    Check db_cache_advice (9i and higher)
    Data is in Statspack under "Buffer Pool Advisory"
    The goal is to optimize data caching
Tune high-IO SQL
Check reads off of UNDO

Average IO times


File IO Stats  DB/Inst:labsf03  Snaps: 1-2
                                            Av   Mx                                      Av
                                     Av     Rd   Rd    Av                 Av  Buffer   BufWt
Tablespace          Filename Reads Reads/s (ms)  Bkt Blks/Rd  Writes Writes/s  Waits    (ms)
---------- ----------------- ----- ------- ---- ---- ------- ------- -------- ------- ------
SYSTEM     /u01/system01.dbf   445      15  0.4   16    1.0    1,157       39   2,744   93.3
USERS      /u01/users01.dbf    223       7  0.5  ###    1.0    9,725      324       4  100.0

Recent IO read times

select to_char(begin_time,'yyyy-mm-dd hh24:mi') begin_time,
       file_id fid,
       average_read_time *10  avgrd_ms,
       average_write_time *10 avgwr_ms,
       physical_reads pr,
       physical_writes pw
from   v$filemetric_history
order by begin_time;

BEGIN_TIME        FID AVGRD_MS AVGWR_MS         PR         PW
---------------- ---- -------- -------- ---------- ----------
2008-01-30 12:25    1      5.2      2.5        100         24
2008-01-30 12:25    5     80.0     20.0          2          2
2008-01-30 12:25    2     36.7      2.0          3        218
2008-01-30 12:25    4     22.8      1.8         89       2754
2008-01-30 12:25    3     22.9      2.6         14         47

Historic IO read times

select to_char(begin_interval_time,'yyyy-mm-dd hh24:mi') snap_time,
       file#,  readtim/nullif(phyrds,0) avg_read_ms, phyrds
from   dba_hist_filestatxs f,
       dba_hist_snapshot   s
where  f.snap_id = s.snap_id;

SNAP_TIME             FILE# AVG_READ_MS PHYRDS
---------------- ---------- ----------- ------
2008-01-05 12:00          9       36.67         39
2008-01-05 12:00         10       32.31         39
2008-01-05 13:00          1       11.63     178224
2008-01-05 13:00          2       56.37       2014
2008-01-05 13:00          3       17.38      50668
2008-01-05 13:00          4        9.39     565321
2008-01-05 13:00          5       38.78         41
2008-01-05 13:00          6       28.29         41
2008-01-05 13:00          7       27.44         39
2008-01-05 13:00          8       42.56         39
2008-01-05 13:00          9       36.67         39
2008-01-05 13:00         10       32.31         39

ADDM and Avg Read Times

The DBIO_EXPECTED parameter tells ADDM what an acceptable average
single-block read time is, in microseconds.

The parameter defaults to 10000 microseconds and can be changed with:

exec DBMS_ADVISOR.SET_DEFAULT_TASK_PARAMETER('ADDM', 'DBIO_EXPECTED', <new value>);

Buffer Cache Hit Ratio

Buffer Cache Hit Ratio (BCHR) has a sordid history and it's tempting not to mention it at all. For years the BCHR has been used to identify IO performance bottlenecks despite the fact that it can be completely unreliable. Connor McDonald even wrote a script (http://www.oracledba.co.uk/tips/choose.htm) that allows you to set the BCHR as high as you want without changing the size of the cache. The point is that IO bottlenecks should be identified with IO wait events, not the BCHR; but once an IO bottleneck has been identified, it is worth knowing whether the buffer cache is supporting the IO load sufficiently. The question of whether the buffer cache is sufficiently sized can be addressed with Oracle's "DB Cache Advice".

Buffer Pool Advisory  
    Size for   Size  Buffers for   Read      Estimated
P    Est (M) Factor     Estimate Factor Physical Reads
--- -------- ------ ------------ ------ --------------
D         56     .1        6,986    2.3         58,928
D        112     .2       13,972    1.6         42,043
D        224     .4       27,944    1.0         25,772
D        336     .6       41,916    1.0         25,715
D        448     .8       55,888    1.0         25,715
D        596    1.0       74,351    1.0         25,715
D        728    1.2       90,818    1.0         25,715
D        840    1.4      104,790    1.0         25,715
D        952    1.6      118,762    1.0         25,715
D      1,064    1.8      132,734    1.0         25,715

The Buffer Pool Advisory gives the expected physical reads at different cache sizes. The column "Factor" shows the size relative to the current one; a "Factor" of 1.0 is the current cache size. In the above case we could actually drop the cache size to 0.4 of the current size without significantly increasing physical IO, while increasing it to 1.8 times the current size won't help at all. Here the buffer cache size is probably fine, and resizing it won't address IO performance issues on this particular database.
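
The same data is available outside Statspack from V$DB_CACHE_ADVICE. A minimal sketch, filtering on the default pool and an assumed 8K block size:

select size_for_estimate size_mb,
       size_factor,
       buffers_for_estimate,
       estd_physical_read_factor,
       estd_physical_reads
from   v$db_cache_advice
where  name = 'DEFAULT'
and    block_size = 8192;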


If the IO response time of the disks is good and if the buffer cache is correctly sized, then at this point it's clear it's time to tune the SQL with the most IO wait time:

select count(*),sql_id, event 
from v$active_session_history
where event in ('db file sequential read',
                'db file scattered read',
                'db file parallel read')
  and session_state='WAITING'
group by sql_id, event
order by count(*);

 COUNT(*) SQL_ID        EVENT
--------- ------------- ------------------------
       10 8hk7xvhua40va db file sequential read
      335 3hatpjzrqvfn7 db file sequential read
      343 0uuqgjq7k12nf db file sequential read
      549 75621g9y3xmvd db file sequential read
     1311 0bzhqhhj9mpaa db file sequential read
     1523 41zu158rqf4kf db file sequential read  
     1757 0yas01u2p9ch4 db file sequential read

IO Considerations

Unix Buffer Cache

Number of Spindles

Vendors say that the read cache will take care of IOPs, but if you have sustained throughput the cache will only absorb the load for so long.


[Disk timing table: Seagate Barracuda 4LP vs Seagate Cheetah 73LP — rows for
Rotation Speed, Rotational Delay (avg), Time to Read 32KB, Seek Time (avg),
Total Time for a Single I/O, I/Os per Second (conservative), and IOPs/sec
per 100GB; the values themselves were not preserved.]

James Morle, http://scaleability.com

Raid 5

        BAARF - Battle Against Any Raid Five

Raid 5 runs into performance issues because every small write requires two reads (old data and old parity) plus two writes (new data and new parity). These reads can conflict with other readers. Also, because Raid 5 is often configured with small stripe sizes, say 32K or 64K, a multiblock read can easily span several disks, increasing the chances of concurrent-read conflicts.

SAME - Stripe and Mirror Everything

    Oracle's recommendation is to stripe and mirror everything (SAME). See the paper http://www.miracleas.com/BAARF/oow2000_same.pdf
    Redo will take a hit from being mixed in with everything else, so if it's a possibility, I'd recommend putting redo on a separate stripe/mirror set.

    Of course, with Oracle's ASM and now the Exadata machines, much of this discussion will become moot over the coming years.

direct path read

The direct path read wait event has three parameters: file#, first block#, and block count. In Oracle Database 10g, this wait event falls under the User I/O wait class. Keep the following key thoughts in mind when dealing with the direct path read wait event.

  • These are waits that are associated with direct read operations. An Oracle direct read operation reads data directly into the session’s PGA (Program Global Area), bypassing the SGA. The data in the PGA is not shared with other sessions.

  • Direct reads may be performed in synchronous or asynchronous mode, depending on the platform and the value of the DISK_ASYNCH_IO parameter. The systemwide direct path read wait event statistics can be very misleading when asynchronous I/O is used.

  • A significant number of direct path read waits is most likely an application issue.

Common Causes, Diagnosis, and Actions

The direct path read waits are driven by SQL statements that perform direct read operations from the temporary or regular tablespaces.

SQL statements with functions that require sorts, such as ORDER BY, GROUP BY, UNION, DISTINCT, and ROLLUP, write sort runs to the temporary tablespace when the input size is larger than the work area in the PGA. The sort runs in the temporary tablespace are subsequently read and merged to provide the final result. An Oracle session waits on the direct path read wait event while the sort runs are being read. SQL statements that employ the merge join, either through a hint or as decided by the optimizer, also require sort.

SQL statements that employ the hash join, either as directed by a hint or the optimizer, flush the hash partitions that don’t fit in memory to the temporary tablespace. The hash partitions in the temporary tablespace are read back into the memory to be probed for rows that match the SQL predicates. An Oracle session waits on the direct path read wait event while the hash partitions are being read.
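
Work areas that are actively spilling to temp can be watched while the statement runs. A sketch using V$SQL_WORKAREA_ACTIVE, with sizes converted to MB for readability:

select sid,
       operation_type,
       round(actual_mem_used/1024/1024) mem_mb,
       number_passes,
       round(tempseg_size/1024/1024)    temp_mb,
       tablespace
from   v$sql_workarea_active
order by tempseg_size desc nulls last;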

SQL statements that employ parallel scans also contribute to the systemwide direct path read waits. In a parallel execution, the direct path read waits are associated with the query slaves and not the parent query. The session that runs the parent query waits mostly on the PX Deq: Execute Reply wait event (in Oracle8i Database and above).


As of Oracle 8.1.7, there is a separate direct read wait event for LOB segments: direct path read (lob). This wait event applies to LOBs that are stored as NOCACHE. When LOBs are stored as CACHE, reads and writes go through the buffer cache and waits show up as db file sequential read.
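
A sketch of the storage clause that determines which wait you will see — the table here is hypothetical; NOCACHE LOB reads surface as direct path read (lob), CACHE LOB reads as db file sequential read:

create table doc_store (
  id   number,
  body clob
)
lob (body) store as (nocache);   -- LOB reads bypass the buffer cache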

Session-Level Diagnosis

It is highly unlikely that the direct path read wait event will show up as the leading bottleneck within a session, even though it may actually be the biggest one. The reasons are as follows:

  • The way Oracle accounts for the waits, as discussed in Chapter 3.

  • Sessions that perform parallel processing using parallel query do not wait on the direct path read wait event, which normally represents the bulk of the time. The direct path read waits are associated with the query slaves’ sessions that scan the tables. The only time the direct path read wait event shows up within a parent session is when the parent session itself has read activities against the temporary tablespace. This is driven by SQL functions, such as ORDER BY, GROUP BY, and DISTINCT, or hash partitions that spill to disk but not parallel scans.

Therefore you shouldn’t evaluate direct path read waits based on the TOTAL_WAITS or TIME_WAITED in the V$SESSION_EVENT view. Instead, you can find current sessions that perform a lot of direct read operations from the V$SESSTAT view using the following query. The physical reads direct statistic comprises the direct reads originated by the parent session itself as well as the sum of all direct reads originated by the query slaves that the parent session employs. The direct reads that are initiated by query slaves are only reflected in the parent session when the slaves complete their jobs. The downside to this approach is that there is no time element.

select a.name, b.sid, b.value, 
round((sysdate - c.logon_time) * 24) hours_connected
from v$statname a, v$sesstat b, v$session c
where b.sid = c.sid
and a.statistic# = b.statistic#
and b.value > 0
and a.name = 'physical reads direct'
order by b.value;

NAME                       SID      VALUE HOURS_CONNECTED
------------------------- ---- ---------- ---------------
physical reads direct 2 41 980
physical reads direct 4 41 980
physical reads direct 5 445186 980

Apart from finding the sessions that perform a lot of direct reads, you must also find out where the sessions are reading from (temporary tablespace or data files), the SQL statements that initiate the waits, and the type of segments being read. The following query gives you the answers. Sessions that read from the temporary tablespace may be reading sort or hash segments. Sessions that read from data files normally belong to parallel query slaves.

select a.event,
       a.sid,
       c.sql_hash_value hash_value,
       decode(d.ktssosegt,
              1,'SORT', 2,'HASH', 3,'DATA',
              null) as segment_type,
       b.tablespace_name,
       b.file_name
from v$session_wait a, dba_data_files b, v$session c, x$ktsso d
where c.saddr = d.ktssoses(+)
and c.serial# = d.ktssosno(+)
and d.inst_id(+) = userenv('instance')
and a.sid = c.sid
and a.p1 = b.file_id
and a.event = 'direct path read'
union all
select a.event,
       a.sid,
       d.sql_hash_value hash_value,
       decode(e.ktssosegt,
              1,'SORT', 2,'HASH', 3,'DATA',
              null) as segment_type,
       b.tablespace_name,
       b.file_name
from v$session_wait a, dba_temp_files b, v$parameter c,
v$session d, x$ktsso e
where d.saddr = e.ktssoses(+)
and d.serial# = e.ktssosno(+)
and e.inst_id(+) = userenv('instance')
and a.sid = d.sid
and b.file_id = a.p1 - c.value
and c.name = 'db_files'
and a.event = 'direct path read'
order by 1,2;

-- File name output is edited to fit page.
EVENT              SID HASH_VALUE SEGTYPE TABLESPACE   FILE_NAME
------------------ --- ---------- ------- ------------ -----------------
direct path read 8 511952958 SORT TEMP_BATCH temp_batch_01.dbf
direct path read 9 3138787393 ORDERS orders_01.dbf
direct path read 11 3138787393 ORDERS orders_01.dbf
direct path read 12 3138787393 ORDERS orders_01.dbf
direct path read 14 3138787393 ORDERS orders_01.dbf

If you catch a session reading sort segments from the temporary tablespace, this indicates the SORT_AREA_SIZE (or work area size if you use the PGA_AGGREGATE_TARGET in Oracle9i Database) is not large enough to accommodate a cache sort (or in memory sort). This is fine. It is unrealistic to expect all SQL statements to perform cache sorts. However, you should avoid multipass sorts because they create a lot of I/Os to the temporary tablespace and are very slow. How can you tell if a SQL statement is doing a multipass sort? Well, it is not that easy in versions prior to Oracle9i Database. You have to trace the session with the event 10032 and examine the trace file. However, beginning in Oracle9i Database, you can simply query the V$SQL_WORKAREA or V$SQL_WORKAREA_ACTIVE views with the SQL hash value that performs the sort. For a more in-depth discussion on sort, please review the “If Your Memory Serves You Right” white paper from the International Oracle Users Group (IOUG) 2004 conference proceedings at www.ioug.org.
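
For example, a sketch of checking whether a given statement's work areas ran optimal, one-pass, or multipass, using the hash value found in the queries above:

select operation_type,
       policy,
       round(last_memory_used/1024/1024) last_mb,
       optimal_executions,
       onepass_executions,
       multipasses_executions
from   v$sql_workarea
where  hash_value = &hash_value;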

The goal of tuning in this case is to minimize the number of sorts as a whole and, more specifically, disk sorts. Increasing the SORT_AREA_SIZE (or PGA_AGGREGATE_TARGET) may help reduce the number of disk sorts, but that’s usually the workaround rather than the cure, unless your SORT_AREA_SIZE is unreasonably small to begin with. You should examine the application and the SQL statements to see if sorts are really necessary. Applications have the tendency to abuse the DISTINCT and UNION functions. Whenever possible use UNION ALL instead of UNION, and where applicable use HASH JOIN instead of SORT MERGE and NESTED LOOPS instead of HASH JOIN. Make sure the optimizer selects the right driving table. Check to see if the composite index’s columns can be rearranged to match the ORDER BY clause to avoid sort entirely. Also, consider automating the SQL work areas using PGA_AGGREGATE_TARGET in Oracle9i Database. Statistically, the automatic memory management delivers a higher percentage of cache sorts.


Be careful when switching from UNION to UNION ALL as this can produce different results depending on the data. The UNION operator returns only distinct rows that appear in either result, while the UNION ALL operator returns all rows.


By default, the HASH_AREA_SIZE is twice the SORT_AREA_SIZE. A larger HASH_AREA_SIZE will influence the optimizer toward the hash joins (full table scan) rather than nested loops operation.

Similarly, if you catch a session reading hash segments from the temporary tablespace, all that tells you is that the HASH_AREA_SIZE (or work area size in case of the PGA_AGGREGATE_TARGET in Oracle9i Database) is not big enough to accommodate the hash table in memory. The solution is similar to the one just mentioned: cure it from the application and SQL tuning before adjusting the HASH_AREA_SIZE (or PGA_AGGREGATE_TARGET). Unless, of course, your HASH_AREA_SIZE is too small to begin with.

If you discover that the direct reads belong to parallel query slaves, you should verify if parallel scans are appropriate for the parent SQL statement and that the degree of parallelism is right. Make sure the query slaves do not saturate your CPUs or disks. Identifying the parent SQL statement can be a bit tricky because the hash value of the parent statement is not the same as the hash value of child statements that the query slaves execute. It was even trickier before the V$PX_SESSION view was introduced in Oracle 8.1.5. Following are two examples, one for versions prior to 8.1.5 and the other for version 8.1.5 and greater, that can help you identify the parent SQL statements when parallel queries are involved:

-- For versions prior to 8.1.5.
-- Note: This query is not able to differentiate parallel query statements
-- that are executed by multiple SYS users as they all share a common AUDSID.
select decode(ownerid,2147483644,'PARENT','CHILD') stmt_level,
       audsid,
       sid,
       serial#,
       username,
       osuser,
       process,
       sql_hash_value hash_value,
       sql_address
from v$session
where type <> 'BACKGROUND'
and audsid in (select audsid
               from v$session
               group by audsid
               having count(*) > 1)
order by audsid, stmt_level desc, sid, username, osuser;

STMT    AUDSID   SID SERIAL# USERNAME OSUSER  PROCESS HASH_VALUE SQL_ADDR
------ -------- ---- ------- -------- ------- ------- ---------- --------
PARENT 3086501 20 779 INTREPID cdh8455 16537 3663187692 A0938E54
CHILD 3086501 12 841 INTREPID cdh8455 16544 817802256 A092E1CC
CHILD 3086501 14 2241 INTREPID cdh8455 16546 817802256 A092E1CC
CHILD 3086501 17 3617 INTREPID cdh8455 16540 817802256 A092E1CC
CHILD 3086501 21 370 INTREPID cdh8455 16542 817802256 A092E1CC

The following query applies to version 8.1.5 and higher.

select decode(a.qcserial#, null, 'PARENT', 'CHILD') stmt_level,
       a.sid,
       a.serial#,
       b.username,
       b.osuser,
       b.sql_hash_value,
       b.sql_address,
       a.degree,
       a.req_degree
from v$px_session a, v$session b
where a.sid = b.sid
order by a.qcsid, stmt_level desc;

STMT   SID SERIAL# USERNAME OSUSER  HASH_VALUE SQL_ADDR DEG REQ_DEG
------ --- ------- -------- ------- ---------- -------- --- -------
PARENT 20 779 INTREPID cdh8455 3663187692 A0938E54
CHILD 17 3617 INTREPID cdh8455 817802256 A092E1CC 4 4
CHILD 21 370 INTREPID cdh8455 817802256 A092E1CC 4 4
CHILD 12 841 INTREPID cdh8455 817802256 A092E1CC 4 4
CHILD 14 2241 INTREPID cdh8455 817802256 A092E1CC 4 4

Initialization Parameters of Interest

The DB_FILE_DIRECT_IO_COUNT initialization parameter can impact direct path read performance. It sets the maximum I/O buffer size for direct read and write operations. Up to Oracle8i Database, the default value on most platforms is 64 blocks. So if the DB_BLOCK_SIZE is 8K, the maximum I/O buffer size for direct reads and writes is 512K. This number is further subject to hardware limits.

The DB_FILE_DIRECT_IO_COUNT parameter is hidden in Oracle9i Database, and the value is expressed in bytes instead of blocks. The default value in Oracle9i Database is 1MB. The actual direct I/O size depends on your hardware configuration and limits.

You can discover the actual direct read I/O size in three ways:

  • Trace the Oracle session that performs direct read operations using trace event 10046 at level 8. The P3 parameter indicates the number of blocks read. Based on the following example, the direct path read I/O size is 64K since the block size is 8K. Alternatively, you can query the V$SESSION_WAIT view for the P3 value of the direct path read event.

    WAIT #1: nam='direct path read' ela= 4 p1=4 p2=86919 p3=8
    WAIT #1: nam='direct path read' ela= 5 p1=4 p2=86927 p3=8
    WAIT #1: nam='direct path read' ela= 10 p1=4 p2=86935 p3=8
    WAIT #1: nam='direct path read' ela= 39 p1=4 p2=86943 p3=8
    WAIT #1: nam='direct path read' ela= 5 p1=4 p2=86951 p3=8
    WAIT #1: nam='direct path read' ela= 38 p1=4 p2=86959 p3=8
    . . .
  • Trace the Unix session that performs direct read or write operations using an operating system trace facility such as truss, tusc, ctrace, or strace. The snippet of the truss report from an Oracle9i Database reveals the direct I/O size is 65536 bytes or 64K:

    9218/6:         kaio(AIONOTIFY, -14712832)                      = 0
    9218/1: kaio(AIOWAIT, 0xFFBECE98) = 1
    9218/1: lwp_cond_signal(0xFEB7BFA0) = 0
    9218/3: pread64(256, "0602\0\001\0 ~13C19AEE }".., 65536, 0x0FC26000) = 65536
    9218/1: lwp_cond_signal(0xFEB69FA0) = 0
    9218/4: pread64(256, "0602\0\001\0 ~1BC19AEFE7".., 65536, 0x0FC36000) = 65536
    . . .
  • Enable the debug information for the session that performs direct I/O operations using the trace event 10357 at level 1. Example: alter session set events '10357 trace name context forever, level 1'. The snippet of the trace file is provided here:

    Unix process pid: 4375, image: oracle@kccdeds73 (P000)
    *** SESSION ID:(9.18) 2004-02-08 21:47:01.908
    DBA Range Initialized: length is 1570, start dba is 0100602b
    kcblin: lbs=fc86c1cc flag=8 slot_cnt=32 slot_size=65536 state obj=24321224
    kcblin: state objects are: Call=243a2210,Current Call=243a2210, Session=24321224
    kdContigDbaDrCbk:starting from tsn 5
    kdContigDbaDrCbk:starting from rdba 0100602b
    kdContigDbaDrCbk:returning 1570 blocks
    kcblrs:issuing read on slot : 0
    kcbldio:lbs=fc86c1cc slt=fc86408c typ=0 async=1 afn=4 blk=602b cnt=8 buf=fc87fe00
    kcblrs:issuing read on slot : 1
    kcbldio:lbs=fc86c1cc slt=fc864210 typ=0 async=1 afn=4 blk=6033 cnt=8 buf=fc89fe00
    kcblcio: lbs=fc86c1cc slt=fc86408c type=0 afn=4 blk=602b cnt=8 buf=fc87fe00
    . . .

In the preceding example, the trace file belongs to query slave #0 (P000). There are 32 I/O slots available for the direct read operation (slot_cnt=32). A slot is a unit of I/O, and each slot is 65536 bytes (slot_size=65536). Asynchronous I/O is enabled during the read operation (async=1). The query slave reads data file #4 (afn=4). The number of blocks read is 8 (cnt=8). Since the block size is 8K, this translates to 65536 bytes.

In this case, the direct I/O slot size prevents the process from achieving the full 1MB, which is the default limit of the _DB_FILE_DIRECT_IO_COUNT parameter. The slot size can be modified by event 10351. The number of slots can be modified by event 10353.


The preceding information gives you a sense of the direct I/O throughput in your system. Don’t simply change the default slot size or the number of direct I/O slots. You need to know your hardware limits before doing so. Besides, you should focus on optimizing the application and SQL statements first.

Lastly, in Oracle8i Database, direct reads can also be enabled for serial scans with the _SERIAL_DIRECT_READ initialization parameter. Earlier releases may do the same by setting event 10355. Doing so will cause data from full scans to be read into the session’s PGA and not shared with other sessions. This can sharply increase memory usage.
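
For reference, the session-level toggle mentioned above looks like this — it's a hidden parameter, so the double quotes are required; use on test systems only:

alter session set "_serial_direct_read" = true;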