Java Garbage Collection Explained: What You Need to Know

Introduction

In Java, memory management is an important aspect of application performance. One of the key features for managing memory automatically is Garbage Collection (GC). It eliminates the need for developers to manually manage memory allocation and deallocation, improving overall application stability.

In this blog, we'll take a deep dive into how GC works, its connection to heap memory, and the various strategies and algorithms used by the JVM to manage memory efficiently.

What is Garbage Collection?

Garbage Collection is the automatic process by which the JVM removes objects that are no longer reachable from any GC root (such as local variables on thread stacks and static fields), thus freeing up memory.

  • Goal: To reclaim memory occupied by unused objects, reducing the risk of memory leaks and optimizing memory utilization.

  • How it works: The JVM identifies objects no longer needed and removes them. This allows the system to reuse memory, thus ensuring that your program doesn't run out of space.

Example:

public class GarbageCollectionExample {
    public static void main(String[] args) {
        ExampleObject obj = new ExampleObject();  // Object creation
        obj = null;  // The object is now unreachable and eligible for collection
        // GC does not necessarily run here: the JVM decides when to collect,
        // typically when memory is needed rather than at fixed intervals.
    }
}
class ExampleObject {
    // Class implementation
}

In the example above, when obj is set to null, the object is no longer reachable, and the JVM may choose to collect it.
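To see the difference between "eligible for collection" and "collected", here is a minimal sketch using java.lang.ref.WeakReference. The class and method names are made up for illustration. Note that System.gc() is only a request, so the second result can differ between JVMs and runs.

```java
import java.lang.ref.WeakReference;

final class ReachabilityDemo {
    /** True if the object is still present after a GC request while a strong reference exists. */
    static boolean aliveWhileStronglyReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();                   // only a request; the JVM may ignore it
        return weak.get() == strong;   // 'strong' is still in use, so the object must survive
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + aliveWhileStronglyReferenced());

        Object obj = new Object();
        WeakReference<Object> weak = new WeakReference<>(obj);
        obj = null;                    // no strong references remain
        System.gc();
        // Frequently true on HotSpot, but the specification does not guarantee it.
        System.out.println("collected after request: " + (weak.get() == null));
    }
}
```

A weak reference does not keep its target alive, which makes it a handy probe: once the last strong reference is gone, the collector is free to clear it whenever it next runs.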

Heap and Garbage Collection

In Java, the heap is the memory region where all objects are allocated. Garbage Collection operates on this heap to clean up unused objects.

  • Heap Memory: The heap is divided into several regions, which play distinct roles in the garbage collection process.

    Key Points:

  • The Young Generation holds newly created objects.

  • The Old Generation holds objects that have lived longer.

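  • The Metaspace holds metadata about classes and methods. It lives in native memory outside the heap, but is covered here because the JVM manages it alongside the heap.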

Structure of the Heap

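The heap is divided into the Young Generation and the Old Generation. Metaspace is described alongside them, although it lives in native memory outside the heap.

Note: The reasons for these divisions are explained in the sections below.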

a. Young Generation

  • The Young Generation is where all new objects are allocated.

  • It's further divided into:

    • Eden Space: The initial space where objects are created.

    • Survivor Spaces (S0 and S1): Objects that survive a garbage collection in Eden are moved here. Only one survivor space holds objects at a time; each Minor GC copies the survivors into the other, empty space and increments their age. Once an object's age reaches the tenuring threshold, it is moved to the Old Generation.

(Note: Check the above image showing Eden, S0, and S1 spaces in the young generation of Heap)

b. Old Generation

  • The Old Generation holds objects that have existed for some time and survived several GC cycles.

  • Objects that survive multiple garbage collections in the Young Generation are eventually moved to the Old Generation.
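The aging and promotion described above can be sketched as a toy simulation. This is not how HotSpot is implemented: the class TenuringSketch, its fields, and the fixed threshold are all made up for illustration, and a real JVM copies objects between Eden, S0, and S1 and may adjust the threshold dynamically (the upper bound is set with -XX:MaxTenuringThreshold).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class TenuringSketch {
    static final class Obj {
        final String name;
        int age;                    // number of Minor GCs survived
        boolean reachable = true;   // stand-in for "still referenced by the program"
        Obj(String name) { this.name = name; }
    }

    final int threshold;                         // hypothetical tenuring threshold
    final List<Obj> young = new ArrayList<>();   // Eden plus the active survivor space
    final List<Obj> old = new ArrayList<>();

    TenuringSketch(int threshold) { this.threshold = threshold; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    /** One simulated Minor GC: drop garbage, age survivors, promote old enough ones. */
    void minorGc() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.reachable) { it.remove(); continue; }   // garbage is reclaimed
            o.age++;
            if (o.age >= threshold) { it.remove(); old.add(o); }   // promoted
        }
    }

    public static void main(String[] args) {
        TenuringSketch heap = new TenuringSketch(3);
        heap.allocate("cache");                 // stays reachable
        heap.allocate("temp").reachable = false; // becomes garbage immediately
        heap.minorGc();   // temp reclaimed, cache age 1
        heap.minorGc();   // cache age 2
        heap.minorGc();   // cache age 3, promoted
        System.out.println("young=" + heap.young.size() + ", old=" + heap.old.size());
        // prints "young=0, old=1"
    }
}
```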

c. Metaspace

  • In Java 8 and later, the Metaspace is used to store class metadata.

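    e.g., class structure, method bytecode, and constant pools. (Static variables themselves are stored on the heap.)

  • Metaspace is allocated from native memory rather than the heap, so it is not limited by the heap size. It grows dynamically as needed unless capped with -XX:MaxMetaspaceSize.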
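To check which regions your own JVM actually uses, you can list its memory pools through the standard java.lang.management API. A small sketch (the class name MemoryPoolLister is just for illustration); pool names depend on the JVM and the selected collector, so treat the output as informational.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class MemoryPoolLister {
    /** Names of all memory pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

On a typical HotSpot JVM the heap list shows the Eden, Survivor, and Old pools, while Metaspace appears in the non-heap list.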

Types of Garbage Collection Events

Garbage Collection events are categorized by the region of the heap being collected. These events are also the reason the heap is split into the Young and Old Generations.

a. Minor GC

  • Focuses on cleaning up the Young Generation, especially the Eden Space.

  • Trigger: This happens when the Eden space is full.

  • Efficiency: Faster, as it only involves the Young Generation.

  • Minor GCs run far more often than Major GCs, because most newly created objects die young and Eden fills up quickly.

b. Major GC

  • Cleans the Old Generation.

  • Trigger: Occurs when the Old Generation is full.

  • Impact: Slower and more resource-intensive than Minor GC.

  • The Old Generation is collected much less often, because the objects in it have already survived several Minor GCs and are therefore likely to stay reachable for a long time.

c. Full GC

  • A Full GC involves cleaning both the Young and Old Generations.

  • Impact: This can cause significant pauses in the application, and is generally more costly in terms of time and resources.

Garbage collection is an expensive process: every collection consumes CPU time, and stop-the-world phases pause the application threads.

Why Different Types of Garbage Collection?

The Young Generation and Old Generation differ in the types of objects they hold. Since short-lived objects are common in most applications, the Young Generation needs more frequent GC to reclaim memory. Long-lived objects, on the other hand, are fewer and need to be examined less often.

By separating these two types of objects, the JVM (Java Virtual Machine) can optimize memory management: Minor GC stays fast and efficient, while Major GC happens less often, which suits long-lived objects.

Garbage Collector Algorithms

Different Garbage Collection algorithms are employed based on the performance needs of the application. Here are some of the most widely used.

  1. Mark and sweep algorithm

    A fundamental algorithm in which every object reachable from the roots is marked. Objects left unmarked are unreachable and are then swept away.

  2. Mark and sweep with compaction

    It's a variation of the Mark and Sweep algorithm that adds a compaction phase after the sweep phase. After sweeping, the garbage collector moves all live objects to one end of the memory. This creates a contiguous block of free memory at the other end, eliminating memory fragmentation. However, compaction adds overhead to the garbage collection process.

  3. Copying

    The heap is divided into two sections: the "from-space" and the "to-space". Live objects are copied from the "from-space" to the "to-space", leaving the unreachable ones behind. The "from-space" is then reclaimed in one step and the two spaces swap roles.
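The mark and sweep phases from the first algorithm can be sketched on a toy object graph. This is an illustration only, with made-up names (MarkSweepSketch, Node) and a plain list standing in for the heap; a real collector works on raw memory and object headers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MarkSweepSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();   // outgoing references
        Node(String name) { this.name = name; }
    }

    /** Mark phase: collect every node reachable from the roots. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    /** Sweep phase: remove unmarked nodes from the heap; returns how many were reclaimed. */
    static int sweep(List<Node> heap, Set<Node> marked) {
        int before = heap.size();
        heap.removeIf(n -> !marked.contains(n));
        return before - heap.size();
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);   // a -> b, reachable from the root
        c.refs.add(d);   // c <-> d form a cycle that no root points to
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));

        int reclaimed = sweep(heap, mark(Arrays.asList(a)));
        System.out.println("reclaimed=" + reclaimed + ", live=" + heap.size());
        // prints "reclaimed=2, live=2"
    }
}
```

Because marking starts from the roots, the c/d cycle is reclaimed even though the two objects still reference each other.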

Types of Garbage Collectors

Some of the most widely used Java garbage collectors are given below.

  1. Serial GC

  • Uses a single thread for garbage collection, which suits small heaps and single-processor machines.

  • Simple and efficient but can cause longer pauses.

  2. Parallel GC

  • Uses multiple threads for garbage collection.

  • Shortens stop-the-world pauses on multi-core machines compared to Serial GC, though its main goal is throughput.

  • Improves performance on multi-core systems.

  3. CMS (Concurrent Mark-Sweep) GC

  • Designed for low-latency applications.

  • Does most of its work concurrently with the application threads, reducing pause times.

  • Deprecated in JDK 9 and removed in JDK 14.

  4. G1 (Garbage First) GC

  • Optimized for applications with large heaps.

  • It divides the heap into regions and prioritizes collecting the regions that contain the most garbage.

  5. ZGC (Z Garbage Collector)

  • A low-latency garbage collector that works well in applications requiring minimal pause times.

  6. Shenandoah GC

  • Similar to ZGC, it aims for low pause times and suits latency-sensitive applications.

Heap Tuning and JVM Options

You can adjust heap sizes and garbage collection behaviors using JVM options:

  • Initial Heap Size: -Xms (e.g., -Xms512m).

  • Maximum Heap Size: -Xmx (e.g., -Xmx1024m).

  • Young Generation Size: -Xmn to specify the size of the Young Generation.

  • Metaspace Size: -XX:MetaspaceSize and -XX:MaxMetaspaceSize to manage the Metaspace size.

Example:

java -Xms512m -Xmx2048m -XX:+UseG1GC MyApp

This will start the Java application with an initial heap size of 512 MB and a maximum of 2 GB, using the G1 GC.
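To confirm that the options took effect, the application can print what the JVM reports through java.lang.Runtime. A minimal sketch; the class name HeapSettingsCheck and the toMiB helper are made up for illustration, and the reported maximum can be slightly below the -Xmx value depending on the collector.

```java
final class HeapSettingsCheck {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (roughly -Xmx): " + toMiB(rt.maxMemory()) + " MiB");
        System.out.println("committed heap:          " + toMiB(rt.totalMemory()) + " MiB");
        System.out.println("free within committed:   " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

Running it as java -Xms512m -Xmx2048m HeapSettingsCheck should report a maximum close to 2048 MiB.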

Common Issues and Best Practices

a. Memory Leaks

Memory leaks occur when objects that are no longer needed remain reachable (for example, through a static collection or a forgotten listener), so the GC cannot reclaim them. You can detect leaks using heap dumps and profiling tools like JVisualVM.
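A common shape of such a leak is a static collection that only ever grows. The sketch below contrasts it with a bounded alternative; the class LeakSketch, its fields, and the limit of 100 entries are hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class LeakSketch {
    // Leaky: every payload stays reachable through this static list forever.
    static final List<byte[]> LEAKY_HISTORY = new ArrayList<>();

    // Bounded: the oldest entry is evicted, so it becomes eligible for collection.
    static final int MAX_ENTRIES = 100;
    static final Deque<byte[]> BOUNDED_HISTORY = new ArrayDeque<>();

    static void recordLeaky(byte[] payload) {
        LEAKY_HISTORY.add(payload);
    }

    static void recordBounded(byte[] payload) {
        if (BOUNDED_HISTORY.size() == MAX_ENTRIES) {
            BOUNDED_HISTORY.removeFirst();   // drop the oldest reference
        }
        BOUNDED_HISTORY.addLast(payload);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            recordLeaky(new byte[1024]);
            recordBounded(new byte[1024]);
        }
        System.out.println("leaky=" + LEAKY_HISTORY.size()
                + ", bounded=" + BOUNDED_HISTORY.size());
        // prints "leaky=1000, bounded=100"
    }
}
```

The GC works correctly in both cases; the difference is only whether the program still holds references to data it no longer needs.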

b. Reducing GC Pauses

Minimize long GC pauses by:

  • Tuning heap sizes.

  • Using low-pause collectors like G1, ZGC, or Shenandoah.

Monitoring and Debugging

To monitor and analyze garbage collection, you can use tools like:

  • JVisualVM: A tool for monitoring JVM performance.

  • GC Logs: Record each garbage collection event. Enable them with -Xlog:gc* (JDK 9 and later) or -XX:+PrintGCDetails (JDK 8).

  • Eclipse MAT: A memory analyzer for finding memory leaks.
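Besides external tools, basic GC statistics are available from inside the application through the standard GarbageCollectorMXBean interface. A small sketch (the class name GcStats is just for illustration); the collector names and counts depend on the JVM and the selected collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcStats {
    /** One line per collector: its name, number of collections, and accumulated time. */
    static String describe(GarbageCollectorMXBean gc) {
        return gc.getName() + ": collections=" + gc.getCollectionCount()
                + ", time=" + gc.getCollectionTime() + " ms";
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(describe(gc));
        }
    }
}
```

With G1, for example, you would usually see separate entries for young and old collections, which maps onto the Minor and Major GC events described earlier.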

Conclusion

Garbage Collection in Java plays a crucial role in memory management by automatically freeing up memory that is no longer in use. Understanding heap structure, GC events, and algorithms allows you to optimize memory management for better application performance.

By tuning the heap and choosing the right GC strategy, developers can ensure that their applications run efficiently without manual intervention.