Java is platform independent, and the Java Virtual Machine (JVM), the platform-dependent component of Java, is what provides this independence. The JVM is an exceptional user-space process that runs on everything from mobile phones to high-end servers. It has to provide platform independence in class loading, threading, input/output and garbage collection. From the operating system's perspective, the JVM is just another process. But things are not as easy as they seem. From the loading of classes to the recycling of objects, the JVM behaves like an operating system in its own right. In this series of blogs, we will discuss how objects are recycled, the various garbage collection algorithms, and how to tune your JVM for better performance. Our discussion will be based on Sun JDK 1.5 (Tiger) and will use open source tools, as they are freely available and everyone can use them.
Overview of GC
From the operating system’s perspective, the JVM is just a mighty big process (or it could be a tiny process running on a mobile phone). Internally the JVM does a lot of magic, but none of it is visible to the operating system, which simply services the JVM. Garbage collection (GC) and threads are integral parts of the JVM. At least a few threads run whenever the JVM is alive, and GC is one of them. GC typically runs as a low-priority thread, and the JVM runs it only when needed, that is, when memory is scarce. But when it runs, it stalls the other threads in the JVM. The user program creates objects and uses them for as long as it wants. When it no longer needs an object, it drops its references to that object, and the JVM takes care of deleting the unused objects, running the finalizer first for those that have one.
A question comes to mind immediately. Who decides which objects are eligible for garbage collection? In other words, how are objects picked for deletion? Although the JVM deletes unwanted objects automatically, it is the developer’s responsibility to indicate that an object is no longer wanted. The developer need not say so explicitly; the JVM infers it.

Among all the objects in the JVM, the JVM treats a few as special and names them “Root Objects”. Usually the local references in all the stack frames (the local variables of every method that has not yet exited), the string objects in the constant pool of a class, and the class variables (static variables) are treated as “Root Objects”. When the JVM performs garbage collection, it removes the unused objects from the heap. Every object that is reachable from a root object, either directly or through a chain of other reachable objects, is considered to be in use. All other objects, those that cannot be reached from any root object either directly or through linked objects, are termed “Unused objects”. During garbage collection, the JVM identifies the unused objects and deletes them, freeing up memory. The algorithm for finding and deleting unused objects varies greatly from implementation to implementation. Even a single JVM implementation may offer several GC algorithms to choose from, depending on the application and the situation.

Before deleting an object, the JVM checks whether it has a finalizer (a “finalize” method). If it has one, the JVM postpones freeing the object until the finalizer has run. So for objects that have a finalizer, GC has one additional step: invoking the finalizer. Until that is done, the JVM does not delete the object.
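The reachability rule above can be illustrated with a toy model. The following is only a sketch under our own assumptions: the class ReachabilitySketch, its Node type and the object names are hypothetical, and a real JVM works on its actual heap and root set rather than on explicit lists like these. It walks the object graph from one root and reports whatever was never reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ReachabilitySketch {

    /** A toy heap object: a name plus its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<Node>();
        Node(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** Collects every node reachable from the roots, directly or through a chain. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new LinkedHashSet<Node>();
        Deque<Node> pending = new ArrayDeque<Node>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    /** Returns the heap objects that were never marked, i.e. the unused ones. */
    static List<Node> unused(List<Node> heap, Set<Node> marked) {
        List<Node> garbage = new ArrayList<Node>();
        for (Node n : heap) {
            if (!marked.contains(n)) {
                garbage.add(n);
            }
        }
        return garbage;
    }

    /** Builds a small hypothetical heap and returns its unused objects. */
    static List<Node> demo() {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        root.refs.add(a);   // root -> a -> b: both in use
        a.refs.add(b);
        c.refs.add(d);      // c and d refer to each other,
        d.refs.add(c);      // but no root reaches either of them
        List<Node> heap = Arrays.asList(root, a, b, c, d);
        return unused(heap, mark(Arrays.asList(root)));
    }

    public static void main(String[] args) {
        System.out.println("unused: " + demo()); // prints "unused: [c, d]"
    }
}
```

Note that c and d are reported as unused even though they reference each other: being linked to another object is not enough, only a path from a root counts.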
So far we have discussed the theoretical aspects of the garbage collection process. The rest of this section explains it with an example.
Looking at the figure, the yellow objects are the root objects. All the objects chained to the root objects are currently in use (they are shown in pink). The objects that are not reachable from any root object, directly or indirectly, are unused objects, shown as dotted circles. Unused objects may be linked to each other, but they are still unused and eligible for garbage collection. The point is that an object must be reachable from one of the existing stack frames to count as in use (each time a method is invoked, the JVM pushes a stack frame that contains its local variables and operand stack).
Eligibility for GC
In the last section, we looked at GC from 10,000 feet. In this section we are going to look at a Java program and identify the objects that are eligible for garbage collection at various stages of the program. “Eligible for GC” means only that: the objects are eligible, and we are telling the JVM that we no longer need them. It is up to the JVM to delete those objects and recycle the memory. The JVM will make every possible attempt to recycle memory.
Consider the above code. At lines 18, 19 and 20 we create objects. Assume that control is at line 21, after line 20 has executed. At this point we have references to four objects, referred to by “args”, “string”, “i” and “j”. So there are four root objects. Root objects, and the objects reachable from them, are not eligible for GC. Hence, at this point, no objects are eligible for GC.
When the JVM executes the method “display”, control passes to that method, which holds references to the objects passed as arguments. Apart from the arguments, “display” has one more local variable, “string”, which also becomes a root object. After line 32 executes, the total number of root objects becomes 5. At lines 34 and 37, even though “display” releases its references “i” and “j”, the objects they pointed to cannot be garbage collected, because the method “main” still holds references to them. At line 40, the object referred to by “string” becomes eligible for garbage collection. At line 43, once again, the object referred to by “str” is not eligible, as the method “main” has a reference to it.
When control returns to the method “main”, the object referred to by the variable “i” becomes eligible for GC after line 23 executes. Subsequently, the objects referred to by the variables “j” and “string” become eligible for GC when lines 25 and 27 execute, respectively.
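For readers who want something to compile, here is a minimal sketch of a program with the same shape as the one walked through above. It is our own reconstruction under assumptions, not the original listing, so the line numbers quoted in the text do not apply to it. The class name EligibilitySketch and the helper class Box are hypothetical; Box is used instead of Integer so that every object is created with "new" and is not shared with any internal cache.

```java
public class EligibilitySketch {

    /** Hypothetical holder type standing in for the objects behind "i" and "j". */
    static final class Box {
        final int value;
        Box(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static String display(Box i, Box j, String str) {
        // A fifth root exists for as long as display is on the stack.
        String string = new String("local to display");
        String summary = i + " " + j + " " + str + " " + string;

        i = null;      // main still refers to this Box: not eligible
        j = null;      // main still refers to this Box: not eligible
        string = null; // no other reference to that object here: eligible for GC
        str = null;    // main's "string" still refers to it: not eligible
        return summary;
    }

    public static void main(String[] args) {
        String string = new String("created in main");
        Box i = new Box(10);
        Box j = new Box(20);
        // Four roots here (args, string, i, j): nothing is eligible yet.

        System.out.println(display(i, j, string));

        i = null;      // the first Box becomes eligible
        j = null;      // the second Box becomes eligible
        string = null; // the String created in main becomes eligible
    }
}
```

The comments mark eligibility only. Whether and when the JVM actually reclaims any of these objects is up to the collector.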
To summarize
- The root objects are not decided only by the local variables of the method currently executing. The JVM goes through the entire stack to find the root objects.
- Apart from the stack, the JVM also looks at “static” variables (class variables) and treats them as roots. So the objects referred to by static variables become eligible for garbage collection when the class is unloaded from the JVM.
- A few JVMs implement the method area in the Java heap. That is, the JVM allocates the memory that holds code in the heap. Those objects also become eligible for GC when the classes are unloaded.
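The point about static variables can be observed with a weak reference, which does not itself keep an object alive. This is a minimal sketch under our own assumptions: the class StaticRootSketch and its field "cache" are hypothetical, and System.gc() is only a request that the JVM is free to ignore.

```java
import java.lang.ref.WeakReference;

public class StaticRootSketch {

    // A class (static) variable: the object it refers to is reachable from a root.
    static Object cache = new Object();

    /** Returns true if the object behind "cache" is still there after a GC request. */
    static boolean survivesWhileStaticallyReferenced() {
        WeakReference<Object> probe = new WeakReference<Object>(cache);
        System.gc(); // a request only; the JVM may or may not collect now
        // While "cache" still refers to the object, the probe cannot be cleared.
        return probe.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("alive while static: " + survivesWhileStaticallyReferenced());

        WeakReference<Object> probe = new WeakReference<Object>(cache);
        cache = null;   // the static root no longer refers to the object
        System.gc();
        // Now the object is eligible, but collection is not guaranteed at this point,
        // so the value printed here can differ from run to run.
        System.out.println("cleared after release: " + (probe.get() == null));
    }
}
```

Only the first result is certain. The second line shows eligibility, not a promise of collection.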
Questions for Understanding
1. What is Garbage Collection?
2. How are objects recycled?
3. What is the standard garbage collection algorithm recommended by the Java Virtual Machine Specification?
4. Elaborate on the object lifecycle.
5. What are the candidates for root objects? How do they affect garbage collection?
Question for Thinking
In your JVM, you have a few objects that are eligible for garbage collection. The JVM also runs the garbage collector. Now the question is: will all the objects that are eligible for GC be garbage collected?
Answer in a single word, "yes" or "no".
If you are not sure what to answer, the next blog will cover some of the concepts involved, and you will be able to answer the question by the end of it.
Keep Watching and Have a Great Day