Hello, you are not logged in.  Login or sign up
Community >> Blogs
Search Toad World Search

Do you have a topic that you'd like discussed?  We'd love to hear from you.  Send us your idea for a blog topic.

Watching Your ASMM in Oracle10g
 
Location: Blogs Mike Ault's Blog    
 MikeA Monday, November 27, 2006 8:44 AM
Oracle’s Oracle10g requires more memory than was ever required before. If you utilize any of the new features such as automatic storage management (ASM) and automatic shared memory management (ASMM) then you really need to pay attention to what memory is doing in 10g.

Why do I say that you need to really pay close attention? Well, in 10g the ASMM feature will rob Peter to pay Paul as the old saying goes. If it sees the shared pool requires more space, it will take it, if it can, from the buffer cache, and of course, the obverse is also true.

ASMM (Automatic Shared Memory Management)

In Oracle10g by setting the initialization parameters SGA_TARGET and SGA_MAX_SIZE you turn on ASMM. ASMM looks at memory based statistics to dynamically alter the settings for the following SGA areas:

  • DB_CACHE_SIZE
  • SHARED_POOL_SIZE
  • LARGE_POOL_SIZE
  • JAVA_POOL_SIZE

So if any of these structures needs memory, then Oracle10g ASSM will take any free memory it sees from any other structure that isn’t being utilized, at least that is the way it is supposed to work.

ASMM Problems

Let’s look at a case in point:

On a RedHat 4.0 4-CPU Opteron (2-Chip, 4-core) using 6 gigabytes of memory in a 2-node RAC, the client kept getting ORA-07445’s when their user load exceeded 60 users per node. At 100 users per node they were getting these errors, a coredump for each and a trace file on each server, for each node, about twice per minute. There didn’t seem to be any operational errors associated with it, but it seriously affected IO rates to the SAN and filled up the UDUMP and BDUMP areas quickly. Of course when the BDUMP area filled up the database tends to choke.

The client is using ASMM with SGA_TARGET and SGA_MAX_SIZE set and no hard settings for the cache or shared pool sizes. Initially we filed an ITar or SR or whatever they are calling them these days but didn’t get much response on it. So the client suffered until I could get on site and do some looking.

I looked at memory foot print, CPU foot print and user logins and compared them to the incident levels of the ORA-07445. There was a clear correlation to the number of users and memory usage. Remembering that the resize operations are recorded I then looked in the GV$SGA_RESIZE_OPS dynamic performance view (DPV) and then correlated the various memory operations to the incidences of the ORA-07445, the errors only seemed to occur when a shrink occurred in the shared pool as we saw the error on node 1 where a shrink occurred and none on node 2 where no shrink had happened yet.

Date: 03/03/06                                                                            Page:   2
Time: 02:33 PM                        Resize OPS                                          BEIMON
                                    client1 database

 I Component   Oper   PARAMETER          INITIAL    TARGET     FINAL STATUS    S_TIME       E_TIME
- ----------- ------ ---------------- --------- --------- --------- -------- ------------ ------------
1 shared pool STATIC shared_pool_size         0 268435456 268435456 COMPLETE 200603031055 200603031055
1 shared pool GROW   shared_pool_size 268435456 285212672 285212672 COMPLETE 200603031104 200603031104
1 shared pool GROW   shared_pool_size 285212672 301989888 301989888 COMPLETE 200603031249 200603031249
1 shared pool SHRINK shared_pool_size 301989888 285212672 285212672 COMPLETE 200603031253 200603031253
1 shared pool GROW   shared_pool_size 285212672 301989888 301989888 COMPLETE 200603031300 200603031300
1 shared pool SHRINK shared_pool_size 301989888 285212672 285212672 COMPLETE 200603031303 200603031303
1 shared pool GROW   shared_pool_size 285212672 301989888 301989888 COMPLETE 200603031346 200603031346
1 shared pool SHRINK shared_pool_size 301989888 285212672 285212672 COMPLETE 200603031410 200603031410 

Figure 1: Resize Operations

Sure enough, hard setting the SHARED_POOL_SIZE to a minimum value delayed the error so that it didn’t start occurring until the pool extended above the minimum then shrank back to it, however, not every time. We were able to boost the number of users to 80 before the error started occurring by hard setting the shared pool to 250 megabytes. A further boost to the shared pool size to 300 megabytes seems to have corrected the issue so far but we will have to monitor this as the number of user processes increases. Note that you need to look at the GV$SGA_RESIZE_OPS DPV to see what resize operations are occurring and the peak size reached to find the correct setting on your system.

It appears that there must some internal list of HASH values that is not being cleaned up when the shared pool is shrunk. This results in the kernel expecting to find a piece of code at a particular address, looking for it and not finding it, this generates the ORA-07445. Of course this is just speculation on my part.

The Case of the Missing Memory

Another reported issue is on HPUX 11.11 where a there has been an observed memory leak for each committed DML on a table which has a materialized view log.  The size of the memory leak is constant and appears to be independent of the size of the transaction that is committed. If materialized view logs are dropped, the leak stops. Also, it appears that disabling Automatic Shared Memory Management (ASMM) works around the problem. Oracle has promise a fix in version Oracle11g version 11.1.

Some ASSM Components are Stingy

In another ASMM bug, The large_pool_size and java_pool_size components do not scale downwards, only db_cache_size and shared_pool_size are able to shrink when memory needs to be reallocated to another component.  

Even though the lower bounds of large_pool_size and java_pool_size are set far lower than the current allocation and v$sgastat shows that the memory is free in these pools, no dynamic shrink of the pools is made when memory needs to be reallocated to another component, instead ORA-00384 ("Insufficient memory to grow cache") is raised. Again, the promised fix will not occur until Oracle11g.

Sometimes Oracle gets Confused

One additional note, if you use the DBCA to create your 10g environment it may set SGA_MAX_SIZE equal to SGA_TARGET resulting in ASMM being turned on, but being unable to perform any operations.

Summary

So, if Oracle10g is just tooling along with gradual changes in the shared pool and buffers then ASSM seems to do a decent job, however, if you have spiking and highly transient loads, it may not be for you. Also, if you see spiking in JAVA or large pool areas, this memory may not get returned for use when those areas should shrink. On top of all of the above, there may be many memory leaks and OS specific issues with utilizing ASMM so if you utilize this feature you will need to watch it closely.

Copyright ©2006 Quest Software Inc.
Permalink |  Trackback
Search Blog Entries
 
Copyright 2008 by Quest Software  | Terms Of Use | Privacy Statement | Contact Us