OOME: PermGen

WebLogic 10.3.3 Admin server OutOfMemoryError: PermGen

This post walks through a case, from analysis to resolution, of a PermGen space memory leak on the admin server of a WebLogic 10.3.3 environment running Sun JDK 1.6.0_21.

The permanent generation (PermGen) is special because it holds metadata describing user classes (classes that are not part of the Java language). Examples of such metadata are the objects that describe classes and methods. Applications with a large code base can quickly fill up this segment of the heap, which causes java.lang.OutOfMemoryError: PermGen space no matter how high you set -Xmx or how much memory the machine has.
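To see that the PermGen is sized separately from the -Xmx heap, you can list the JVM memory pools through the standard java.lang.management API. This is only a minimal sketch: the class name MemoryPoolReport is made up for illustration, and the pool names depend on the JVM and collector in use. On a Sun JDK 6 JVM the permanent generation typically shows up as a pool whose name contains "Perm Gen"; newer JVMs report different pools.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of WebLogic or the JDK.
public class MemoryPoolReport {

    /** Builds one line per memory pool: name, type, used and max size. */
    public static String describePools() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long usedKb = usage.getUsed() / 1024;
            long max = usage.getMax(); // -1 means the maximum is undefined
            String maxText = (max < 0) ? "undefined" : (max / 1024) + "K";
            sb.append(String.format("%-28s %-16s used=%dK max=%s%n",
                    pool.getName(), pool.getType(), usedKb, maxText));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describePools());
    }
}
```

Running it with different -XX:MaxPermSize values on a JDK 6 JVM should change only the max of the Perm Gen pool, not the heap pools.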

Environment specifications

Java EE server: WebLogic 10.3.3

Operating Environment: Solaris 10

JDK: Sun JDK 1.6.0_21

The VM arguments for the Perm space are given below:

-XX:PermSize=48m -XX:MaxPermSize=128m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError

This problem was becoming unmanageable for the team supporting the production environment, as the admin server was constantly failing with this OutOfMemoryError. We worked with the Oracle support team to fix the issue. As a mitigation, the admin server was restarted twice a day.

Analysis Factors

JVM Heap Dump

A few JVM Heap Dumps (java_pid<xyz>.hprof format) were generated by the Sun JVM following some occurrences of the OutOfMemoryError.

Please note that a heap dump represents a snapshot of the Java heap. PermGen is not normally part of a JVM heap dump, but the dump still provides some information on how many classes and class loaders were loaded, as pointers (stubs) to the real objects stored in the native memory space.

The heap dump was analyzed using Eclipse Memory Analyzer in order to try to get information on the loaded classes and classloaders.

The analysis revealed the following facts:

Checking the JVM footprint with the jstat command and a UNIX awk script is described at the following link:

http://wlatricksntips.blogspot.com/2009/08/gc-jvm-state-with-jstat.html

Interning Strings

The logs show this error:

java.lang.OutOfMemoryError: PermGen space

at java.lang.String.intern(Native Method)

You can intern strings using String.intern(), which the Javadoc defines as follows:

Returns a canonical representation for the string object. A pool of strings, initially empty, is maintained privately by the class String.

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

Interned strings are stored in the PermGen space, so if you interned all your strings, you would eventually run out of PermGen memory.
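The Javadoc wording above can be illustrated with a few lines of Java. This is just a small sketch; the class name InternDemo and the sample string are made up for illustration.

```java
// Hypothetical demo class illustrating String.intern() semantics.
public class InternDemo {

    /** True when both strings resolve to the same pooled instance. */
    public static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // Two distinct String objects with equal contents.
        String a = new String("weblogic");
        String b = new String("weblogic");

        System.out.println(a == b);                // false: different objects
        System.out.println(a.equals(b));           // true: same characters
        System.out.println(sameAfterIntern(a, b)); // true: one pooled copy
    }
}
```

Each distinct value passed to intern() adds an entry to the pool, which is why interning a large number of unique strings (for example, dynamically generated names) can keep growing the PermGen on this JDK.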

The heap dump showed a high number of java.lang.Class instances loaded by the system class loader (leak suspect #2). These classes did not appear to be referenced anymore. The WebLogic Admin Server throws the OOM error, and the Admin console becomes unresponsive under load. This is reported as Oracle Bug# 9764015, an issue found in WebLogic 10.3.1, 10.3.2, and 10.3.3.

In the Admin Server GC logs we found that garbage collection was not reclaiming memory: even after a Full GC, neither the heap nor the PermGen was being cleared.

71361.227: [Full GC 71361.227: [CMS: 530359K->530368K(741376K), 12.6086964 secs] 531435K->530368K(1017856K), [CMS Perm : 524287K->524287K(524288K)], 12.6090303 secs] [Times: user=12.41 sys=0.02, real=12.61 secs]

Here the PermGen is almost full (99.99%) even after the Full GC, which is why we are getting OutOfMemoryError in the PermGen space.
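The occupancy figure can be computed directly from the "CMS Perm" part of the log line. The following is a rough sketch, assuming the log format shown above; the class name PermGenLogParser and its method are hypothetical, and the regular expression only covers the CMS layout of -XX:+PrintGCDetails output.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the PermGen figures out of a CMS GC log line.
public class PermGenLogParser {

    // Matches e.g. "[CMS Perm : 524287K->524287K(524288K)]"
    private static final Pattern PERM = Pattern.compile(
            "\\[CMS Perm\\s*:\\s*(\\d+)K->(\\d+)K\\((\\d+)K\\)\\]");

    /** Returns PermGen occupancy after the collection in percent, or -1 if absent. */
    public static double permOccupancyPercent(String gcLogLine) {
        Matcher m = PERM.matcher(gcLogLine);
        if (!m.find()) {
            return -1.0;
        }
        long afterKb = Long.parseLong(m.group(2));    // used after the GC
        long capacityKb = Long.parseLong(m.group(3)); // committed PermGen size
        return 100.0 * afterKb / capacityKb;
    }

    public static void main(String[] args) {
        String line = "71361.227: [Full GC 71361.227: [CMS: 530359K->530368K(741376K), "
                + "12.6086964 secs] 531435K->530368K(1017856K), "
                + "[CMS Perm : 524287K->524287K(524288K)], 12.6090303 secs]";
        System.out.println(String.format(Locale.ROOT, "PermGen after Full GC: %.4f%%",
                permOccupancyPercent(line)));
    }
}
```

For the line above, 524287K used out of 524288K works out to just under 100%, and the before and after values are identical, so the Full GC freed nothing in the PermGen.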

By default the JVM loads a class in the PermGen and unloads a class from the PermGen space when there are no live instances of that class left, but this can degrade performance in some scenarios. Turning off class garbage collection eliminates the overhead of loading and unloading the same class multiple times. If a class is no longer needed, the space that it occupies on the heap is normally used for the creation of new objects. However, for an application that handles requests by creating a new instance of a class, the normal class garbage collection will clean up this class by freeing the PermGen space it occupied, only to have to re-instantiate the class when the next request comes along.

Why is this Full GC not reclaiming memory?

Trial 1

The -Xnoclassgc flag had been added to the JVM arguments. It disables class garbage collection in the Sun JVM, so unused classes were never unloaded and kept accumulating in the PermGen space. So let's remove the "-Xnoclassgc" flag and see.

Result: Failed. The OOM error occurred again.

Trial 2

Whenever there is a fresh deployment, new class objects are placed into the PermGen and thus occupy an ever-increasing amount of space. Regardless of how large you make the PermGen space, it will inevitably top out after enough deployments. What you need to do is take measures to flush the PermGen so that its size stabilizes. With the CMS collector, two JVM flags handle this cleaning:

-XX:+CMSPermGenSweepingEnabled

This setting includes the PermGen in a CMS garbage collection run. By default, the CMS collector does not sweep the PermGen space during its concurrent cycles, so the space can keep growing.

-XX:+CMSClassUnloadingEnabled

This setting tells the PermGen garbage collection sweep to take action on class objects. By default, class objects get an exemption, even when the PermGen space is being visited during a garbage collection.

JVM memory arguments review

As described above, when an application handles requests by creating a new instance of a class, normal class garbage collection may free the PermGen space the class occupied, only to have to reload the class when the next request comes along. In this situation you might want to use the -Xnoclassgc option to disable the garbage collection of classes.

However, in the Java EE world this is normally a bad idea, since many applications create classes dynamically or use reflection. For this type of application, the use of this option can lead to a native memory leak and exhaustion.
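When reviewing the JVM arguments of a running server, it can help to check from inside the JVM whether -Xnoclassgc is present and whether classes are being unloaded at all. Below is a minimal sketch using the standard java.lang.management API; the class name ClassUnloadingCheck and its methods are hypothetical, and this is not something WebLogic provides.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper for reviewing class unloading on a running JVM.
public class ClassUnloadingCheck {

    /** True when the given JVM arguments disable class garbage collection. */
    public static boolean classGcDisabled(List<String> jvmArgs) {
        return jvmArgs.contains("-Xnoclassgc");
    }

    /** Summarizes the class loading counters of the current JVM. */
    public static String summary() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        return String.format("loaded=%d totalLoaded=%d unloaded=%d",
                cl.getLoadedClassCount(),
                cl.getTotalLoadedClassCount(),
                cl.getUnloadedClassCount());
    }

    public static void main(String[] args) {
        // Arguments the JVM was started with (excluding the main class arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Class GC disabled: " + classGcDisabled(jvmArgs));
        System.out.println(summary());
    }
}
```

If the unloaded counter stays at zero while totalLoaded keeps climbing across redeployments, classes are accumulating rather than being collected.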

GC can be your friend (or enemy); keep your application on an object diet.

As a WebLogic administrator (WLA), you should always monitor an application server with -verbose:gc. The resulting garbage collection logs help in gathering detailed information about GC activity in the Java heap, and -XX:+PrintGCDetails adds more detail.

$ jinfo -flag PermSize 23970

-XX:PermSize=50331648

The jinfo utility reports the current PermSize setting in bytes. Here 50331648 bytes is 48m.
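The byte value printed by jinfo can be converted back to megabytes with simple arithmetic (divide by 1024 x 1024). A tiny sketch follows; the class name FlagSize and its method are made up for illustration.

```java
// Hypothetical helper for converting a jinfo flag value from bytes to megabytes.
public class FlagSize {

    /** Parses a line such as "-XX:PermSize=50331648" and returns the size in MB. */
    public static long toMegabytes(String jinfoLine) {
        String value = jinfoLine.substring(jinfoLine.indexOf('=') + 1).trim();
        return Long.parseLong(value) / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println(toMegabytes("-XX:PermSize=50331648") + "m"); // prints 48m
    }
}
```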

jmap -permstat <pid>

Note that this option does not work on the Windows operating system.

Example

jmap -permstat 23970

Conclusion

Finally, the second set of JVM arguments confirmed that the problem was the huge load on the Admin server caused by MBeans connecting to the Managed servers. The Oracle support team identified this as a known issue. The bug was reported from WebLogic 10.3.1 onwards and recurred in later versions up to 10.3.3. Oracle released a bug-fix patch. You can look it up at http://support.oracle.com under Bug # 9764015.

Placing the following patch in the PRE_CLASSPATH resolved the issue.

BUG9764015_1033.jar

Reference Links