Thursday, June 21, 2007

Garbage Collection - Moving Closer

In the previous post, I left an open-ended question: "Your JVM holds a few objects that are eligible for garbage collection, and the garbage collector runs. Will every object that is eligible for GC be collected?" Answering it requires a deeper understanding of the various garbage collection algorithms, the size of the objects and, more importantly, the lifetime of the objects. Believe me, the JVM's GC is not one algorithm but a pool of algorithms that sweep the heap.

We had an overview of garbage collection from a programmer's perspective in the previous write-up. It was just an introduction, and we did not discuss much from the JVM's perspective. Some of you may have started to ask, "Why should I know all this? I am just a developer." Thanks for the thoughtful question; I appreciate it. You might have a high-end desktop for coding a "Hello, World" program, and your application might work as expected. But an enterprise-level Java application faces many challenges in terms of functionality and performance. If you know the internal workings of a system, you get a broader perspective, which can lead to more judicious use of the system. Your knowledge of the JVM will come in handy and will certainly pay off in the near future.

As we discussed in the previous write-ups, two tasks are key to successful garbage collection. The first is to identify the garbage objects (objects that are no longer used) and the second is to actually clean them up. Apart from cleaning up, the garbage collection algorithm also organizes memory by relocating objects in the heap.
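The difference between "eligible for collection" and "actually collected" can be observed with a weak reference. The following is only a minimal sketch: the class and method names (EligibilityDemo, isAliveWhileReferenced, dropAndProbe) are made up for illustration, and System.gc() is merely a request that the JVM is free to ignore, so the second result may differ from run to run.

```java
import java.lang.ref.WeakReference;

public class EligibilityDemo {

    /** While a strong reference exists, the object must survive any GC run. */
    static boolean isAliveWhileReferenced() {
        Object payload = new byte[1024];
        WeakReference<Object> probe = new WeakReference<>(payload);
        System.gc(); // a hint only; the JVM may or may not collect now
        // payload is still used here, so it is strongly reachable
        return probe.get() == payload;
    }

    /** Drops the only strong reference, making the object eligible for GC. */
    static WeakReference<Object> dropAndProbe() {
        Object payload = new byte[1024];
        WeakReference<Object> probe = new WeakReference<>(payload);
        payload = null; // the object is now eligible, not necessarily collected
        System.gc();    // again only a request
        return probe;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + isAliveWhileReferenced());
        WeakReference<Object> probe = dropAndProbe();
        // No guarantee either way: eligibility does not force collection.
        System.out.println("collected after drop: " + (probe.get() == null));
    }
}
```

The first method always reports true, because a strongly reachable object is never garbage. The second line of output is the interesting one: whether it prints true or false depends on the collector in use and on what it decided to do with the request.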

Garbage Collection - A Closer Look

In the Sun HotSpot JVM, garbage collection is performed by a generational collector. The generational collector rests on two postulates:
  • Most objects (more than 90% of them) are short lived. They die soon after they are created.
  • A smaller percentage of objects are long lived. They live almost as long as the JVM itself.
Sun designed the HotSpot JVM's garbage collection around these two postulates. The entire heap is subdivided into two regions - the young generation and the old generation. The young generation is smaller than the old generation. Objects are initially created in the young generation and are promoted to the old generation if they live long enough. The JVM employs two different garbage collection algorithms in the young and old generations, as the nature of the objects varies. The following diagram gives a pictorial representation of the JVM heap.

Figure - 1
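If you want to see the generations on your own JVM, the standard java.lang.management API can list the heap memory pools and the collectors that manage them. This is just a sketch: the class name HeapLayout and the helper heapPoolNames are made up for illustration, and the pool and collector names that get printed depend on the JVM version and the collector selected, so no particular output should be expected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeapLayout {

    /** Names of the memory pools that belong to the Java heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Heap pools: on a generational collector these map to young/old regions.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
            long used = (usage == null) ? -1 : usage.getUsed();
            System.out.println(pool.getName() + ": used=" + used + " bytes");
        }
        // Collectors: each one reports which pools it is responsible for.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " manages " + Arrays.toString(gc.getMemoryPoolNames())
                    + ", collections so far: " + gc.getCollectionCount());
        }
    }
}
```

On a generational setup you would typically see separate pools for the young and old regions, and more than one collector, each responsible for a different set of pools.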

2 comments:

Unknown said...

I went through all your blogs (at least those related to GC). I would really like to appreciate your efforts in sharing your understanding of them and putting them in an interesting and understandable way. With the level of understanding I have of GC (though not as deep as yours), I have had quite a few questions all the time. It might be stupid, or a lack of complete knowledge, or my little understanding of C++, but I always thought that there should be a way to enforce the destruction of objects programmatically. This is just my opinion and I am not challenging the behaviour of the JVM. When you say that you can make an object eligible for GC, you are sure that you cannot access it any more (even if GC has not run immediately). So, I think if you call System.gc, it should remove the object. Even though there is a trade-off in a few more threads getting spawned for the destruction of objects, there has been a demand for a larger heap size for certain programs/applications requiring a larger number of objects. So, if there were an option to destroy objects programmatically, we might also have enforced something like finally/finalize to destroy objects. This might even have been made a convention. This might affect performance at run time due to the extra processing, but on the other hand, the lack of it causes overflow and requires better hardware all the time for heavyweight applications. What say? I might be wrong.

Lakshmi Narayanan N said...

You are right. Interestingly, the JVM specification does not enforce the existence of a garbage collector. Until infinite memory is invented, JVM implementors have to live with some garbage collector. That said, in practice there are two heuristics that come into the picture.
1. Most objects are short lived.
2. Few objects are long lived.

I would like to stress that this is not an assumption but a finding from research studies.

Yes, heavyweight applications are going to require more memory. Even if you provide a programmatic way to initiate GC, you will still run into resource issues.

The current JVM throws OutOfMemoryError only when the resources are not available. Before throwing the error, it makes every possible attempt to keep the application running. I am going to write more on GC in the coming articles, so keep watching. You might get an answer.