JVM Garbage Collectors - Serial, Parallel, CMS
Depending on the resources available and the performance metric of an application, different Garbage Collectors (GC) should be considered for the underlying Java Virtual Machine. This post explains the main idea behind the garbage collection process in JVM and summarizes the pros and cons of Serial GC, Parallel GC and Concurrent-Mark-Sweep GC.
Garbage-First GC (G1) is out-of-scope for this post as it works very differently from the other algorithms (and I still have not wrap my head around it). This post also assumes familiarity with heap memory.
Garbage Collection Algorithms
These symbols will be used to illustrate the memory allocation in heap.
1
2
3
o - unvisited
x - visited
<empty> - free
Mark-Sweep: The objects in the heap that can be reached from root nodes (such as stack references) are marked as visited. While sweeping the memory, the regions occupied by the unvisited objects are updated to be free. As there are likely to be less contiguous regions after a collection, external fragmentation is likely to occur.
1
2
Marked | x |o| x | o |x|
Sweeped | x | | x | |x|
Mark-Sweep-Compact: After marking, the visited objects are identified and compacted to the beginning of the memory region. This solves the external fragmentation issue, but more time is required as objects have to be moved and references have to be updated accordingly.
1
2
3
Marked | x |o| x | o |x|
Sweeped | x | | x | |x|
Compacted | x | x |x| |
Mark-Copy: After marking, the visited objects are relocated to another region. This accomplishes compaction of allocated memory at the same step. However, the disadvantage is that there is a need to maintain one more memory region.
1
2
Marked | x |o| x | o |x| |
Copied | | x | x |x| |
During parts of a garbage collection, all application threads may be suspended. This is called stop-the-world pause. Long pauses are especially undesirable in interactive applications.
Generational Garbage Collection in JVM
The Weak Generational Hypothesis states that most objects die young.
In JVM, heap memory is divided into two regions - Young Generation and Old Generation. Newly created objects are stored in the Young Generation and the older ones are promoted to the Old Generation. With this separation, GC can work more often in a smaller region where dead objects are more likely to be found.
1
2
3
4
5
6
<- Young Generation ->
+--------------------+--------------------+
| Eden Space | |
+----------+---------+ Old Generation |
| S0 | S1 | |
+----------+---------+--------------------+
Young Generation: The region is divided into Eden Space where new objects are created, and S0/S1 Space where the visited objects from each garbage collection can be copied to. Naturally, Mark-Copy algorithm is used.
Old Generation: As there is no delimited region for the visited objects to be copied to. Only Mark-Sweep and Mark-Sweep-Compact algorithms can be used.
Serial GC
JVM option: -XX:+UseSerialGC
This option uses Mark-Copy for the Young Generation and Mark-Sweep-Compact for the Old Generation. Both of the collectors are single-threaded. Without leveraging multiple cores present in modern processors, the stop-the-world pauses are longer. The advantage is that there is less resource overhead compared to other options.
Parallel GC
JVM option: -XX:+UseParallelGC -XX:+UseParallelOldGC
Similary to Serial GC, this option uses Mark-Copy for the Young Generation and Mark-Sweep-Compact for the Old Generation. Unlike Serial GC, multiple threads are run for the respective algorithms. As less time is spent on garbage collection, there is higher throughput for the application.
Concurrent-Mark-Sweep (CMS) GC
JVM option: -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
For the Young Generation, this option is the same as Parallel GC. For the Old Generation, this option runs most of the job in Mark-Sweep concurrently with the application. This means that the application threads continue running during some parts of the garbage collection. Hence this option is less affected by stop-the-world pauses compared to the other two, making it preferred for interactive applications.
As at least one thread is used for garbage collection all the time, the application has lower throughput. Without compaction, external fragmentation may also occur. When this happens, there is a fallback with Serial GC but it is very time-consuming.