 |
|
 |
|
|
|
 |
 |

 |
 |
|
|
 |
 |
All of us have experienced the incredible increase in the speed and capabilities of computing technology. Gordon Moore - one of the founders of Intel - famously remarked that the density of transistors that can be placed on an integrated circuit doubles every two years. This increasing transistor density, together with a number of other related trends, drives a similar growth in computer performance and “Moore’s law” has been remarkably accurate in forecasting the exponential growth of CPU, disk and memory capacities in computer equipment.
However, while this trend drives exponential improvements in the electronic aspects of computing performance, hard disk drive performance improves at a far slower pace. Hard disk drives are mechanical devices which do not benefit from the exponential growth promised by Moore's law. Indeed, during a period in which CPU speeds increased by about 32 times, hard drive speeds increased only 4 times (Figure 1). Compared to everything else in the computer, hard drives have gotten significantly slower.
 Figure 1 Trends in CPU, memory and disk
Contemporary databases are all about managing data stored on disks, so in order to maintain acceptable performance and leverage increases in CPU and memory capacity, modern databases like Oracle go to enormous lengths to avoid disk IO. The primary method for avoiding IO is – of course – to cache data in memory. The area in memory that Oracle uses to cache data from disk files is known as the buffer cache.
Buffer cache contention in Oracle We’re not concerned with the effectiveness of caching in particular just now - though I promise we will get to it eventually. In this article I want to examine how contention within the buffer cache can reduce the effectiveness of caching. Reading from the buffer cache involves latch operations, and we saw last month how latch contention can result. This month, I want to examine other forms of contention within the cache.
As DML statements execute against the database, changes are initially made only to memory. In fact, the only SQL statement that requires an immediate write to disk is the COMMIT statement – and that requires only a write to the redo log, rather than the database files. However, these changes must eventually find their way to disk, and this is the job of various background processes. In particular:
- The Database Writer (DBWR) writes modified blocks from the buffer cache to the database files.
- The Recovery Writer (RVWR) writes modified blocks from the buffer cache to the flashback logs
- The Log Writer (LGWR) writes modified blocks from the log buffer to the redo log.
Figure 2 illustrates how these writer processes interact.
 Figure 2 The role of background processes in writing to disk These writers are sometimes referred to as “lazy” writers, because they do not immediately write changes to disk, but instead do so at some later time. This mechanism works wonderfully when these processes can keep up with demand. However, should they fall behind, then various forms of buffer cache contention can result.
Database Writer issuesThe Database Writer (DBWR) writes modified blocks from the buffer cache to the database files. Sessions are adding blocks to the cache all the time, so it’s important that the DBWR not allow modified (“dirty”) blocks to fill the cache. If a session wants to add a new block to the cache, the least recently used block in the cache is discarded and the new block takes its place. However, if all blocks in the cache are dirty, then the session must wait for the DBWR to write the blocks to disk and a “free buffer wait” will occur (Figure 3).
Figure 3 Free buffer waits occur when the DBWR cannot keep up with demand
Another sign that the DBWR might not be keeping up are “write complete” waits, which occur when a session wants to modify a buffer, but the buffer is currently being written out by the DBWR. This will of course happen from time to time, but should be rare unless the DBWR is struggling to complete batches.
There’s a number of possible causes and solutions for DBWR related buffer contention:
- Asynchronous IO is not enabled. This used to be the single most prevalent cause of DBWR problems, but is fairly rare today. Without Asynchronous IO, the DBWR can only write to a single disk at a time. Consequently, the DBWR is likely to fall behind foreground sessions, which are reading from many disks concurrently. With today’s operating systems and Oracle9i and above, this is almost never the problem.
- RAID5. RAID5 disk arrays magnify the cost of disk writes over disk reads. Typically with RAID5 a disk write will actually consume at least 4 physical IO operations on at least two disk devices (the extra IOs being required to maintain parity information). Most RAID5 arrays use non-volatile (eg, battery-backed) caches to prevent this overhead from impacting the database, but during sustained write IO the cache fills and becomes ineffective. Almost every credible database expert feels strongly that RAID5 is a bad idea for any database with non-trivial write overhead– see www.baarf.com where database experts unite in the Battle Against Any Raid Five.
- Insufficient IO write bandwidth. The maximum amount of IO writes that can be achieved is limited primarily by the number of disks allocated to the database files and the degree to which data is evenly spread across those disks. RAID 1 0 (Oracle’s Stripe And Mirror Everything [SAME] approach) generally takes care of the spread. However, it’s almost always possible to write data to memory faster than writing data to disk, so if the rate at which blocks are added to cache exceeds the IO bandwidth, then you may decide to increase the number of physical disks and spread the data more evenly. ASM makes this procedure fairly easy, since it will automatically redistribute data after you’ve added the new disk.
Flashback buffer bottlenecksThe flashback database functionality introduced in 10g allows us to “rollback” a database to recover from a logical corruption rather than “rolling forward” using redo logs. The reduction in MTR (Mean Time to Recover) can be substantial and consequently flashback database was one of the most popular features of Oracle 10g.
However, nothing comes for free, and flashback technology requires that we periodically write what essentially amounts to rollback information to the flashback logs. The Recovery Writer process (RVWR) is responsible for writing these log records. If it can’t keep up, then sessions trying to add to the buffer cache will receive a “flashback buf free by RVWR” while the RVWR catches up (Figure 4).
Figure 4 Recovery writer (flashback log) bottleneck
Recovery Writer waits are usually the result of insufficient IO bandwidth for the Flash Recovery Area (FRA). Quite often, the FRA is placed on a smaller number of disks than host the database files which increases the chance that the RVWR will become a bottleneck. The problem can be exacerbated if the archive destination is set to the FRA, since now the RVWR and the redo log writer (LGWR) will compete for IO.
Buffer busy waits The DBWR and RVWR problems above occur when the entire buffer cache is full with modified blocks. The “buffer busy” wait occurs when a session wants to access a block but the block currently being subjected to an incompatible operation by another session.
Some of the major causes of buffer busy are:
- For segments not using ASSM (Automatic Segment Space Management), concurrent inserts into a table can cause buffer busy contention around the segment freelists. ASSM has largely eliminated this phenomenon, but where ASSM is not enabled, multiple freelists are generally indicated. Note that freelist contention results in buffer busy on the “data block”, not – as you might expect – the “freelist” block.
- “Hot” blocks - that are subject to very high concurrent access – may exhibit buffer busy waits as an inevitable result.
- In a RAC environment, buffer busy waits (or “gc buffer busy”) can result from using non-ASSM especially if there is only a single freelist group defined.
- Also in RAC, IO related problems on one instance (redo log bottlenecks for instance) can result in buffer busy waits when the buffer is accessed on another instance.
The first step in resolving buffer busy contention is to identify the segments involved in buffer busy waits. Prior to the introduction of the V$SEGMENT_STATISTICS view in 9i, we could only do this by running a 10046 trace, or by catching the waits as they appear in V$SESSION_WAIT. From 9i onwards we can simply examine V$SEGMENT_STATISTICS to see which segments are most heavily associated with buffer busy waits.
Figure 5 Spotlight buffer busy drilldown
Figure 5 shows the Spotlight buffer busy drilldown which is available for any version of Oracle. We show the segments most heavily associated with buffer busy waits, as well as the block types encountered, tablespaces and users involved.
If your database is version 10g (or higher) we can show more information including the SQL statements associated with buffer busy (Figure 6).
 Figure 6 Identifying SQLs associated with buffer busy in 10g Remedial action for buffer busy waits could include the following:
- If buffer busy waits are associated with heavy rates of insert into a table not configured for ASSM, increase the number of freelists using an ALTER TABLE statement. If in a RAC environments consider increasing FREELIST_GROUPS as well.
- If buffer busy is associated with very hot blocks, but not for individual rows, you can consider partitioning the segment or reducing the number of rows per block by manipulating PCTFREE.
- Buffer busy on primary key columns populated by a sequence (or other monotonically increasing number) can be relieved by using a “reverse key” index. This will result in inserted rows being evenly distributed amongst the leaf nodes of the index, but prevents index range scans (which rarely occur for primary key sequence numbers).
- As always, make sure that what the application is asking Oracle for is sensible – poorly tuned SQL that unreasonably magnifies the Oracle demand will cause all sorts of contention including buffer busy.
ConclusionA properly configured buffer cache can be the key to avoiding a disk-bound database and delivering peak database performance. However, contention for blocks in the data cache can prevent the buffer cache from reaching it's full potential. To avoid buffer cache contention, follow these guidelines:
- Make sure that the DBWR and RVWR (if you have one) can keep up with demand. The most important thing is to spread data files and flashback logs across enough disks using striping techniques.
- Don't use RAID5 if you are trying to achieve a high performance database with any non-trivial write load. See www.baarf.com if you need help in persuading your manager that RAID5 is not the answer.
- Use ASSM (Automatic Segment Storage Management) where possible. If not using ASSM, make sure that segments have enough freelists and - for RAC - freelist groups.
- Consider reducing the row density or partitioning tables that have hot blocks that are showing buffer busy waits.
- Consider using reverse key indexes for primary key indexes that are showing buffer busy waits.
- Keep you application load nominal by tuning SQL and PL/SQL.
|
 |
 |
|
 |
|
 |
|
|
 |
 |
|
 |
|
|
|