Garbage Collector introduction - sonyynosification/bkit-kb Wiki
Java Garbage Collector basics
Java system architecture
Overview
Hotspot JVM
GC introduction
What is Garbage Collector (GC) in Java?
Garbage collection (GC) is an automated way to reclaim memory that is no longer in use so that it can be reused.
Why do we need it
- Frees developers from having to manually release memory.
- Allocates objects on the managed heap efficiently.
- Reclaims objects that are no longer being used, clears their memory, and keeps the memory available for future allocations.
- Provides memory safety by making sure that an object cannot use for itself the memory allocated for another object.
Do other languages have GC?
- .NET: Yes.
- C: No. Developers have to allocate and deallocate memory manually. Forgetting to deallocate results in a memory leak.
- Rust: No. It uses an ownership mechanism instead.
Problem with GC
- Affects application performance during execution.
- Can cause "stop the world" events, during which application threads are paused.
Stack and heap
Stack
- It grows and shrinks as new methods are called and returned, respectively.
- Variables inside the stack exist only as long as the method that created them is running.
- It's automatically allocated when a method is called and deallocated when the method finishes execution.
- If this memory is full, Java throws java.lang.StackOverflowError.
- Access to this memory is fast when compared to heap memory.
- This memory is threadsafe, as each thread operates in its own stack.
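The stack behaviour described above can be observed with a small experiment. This is a minimal sketch, assuming a hypothetical `StackDepthDemo` class: it recurses until the thread's stack is exhausted and reports how many frames fit. The number depends on the JVM, the frame size, and the `-Xss` setting, so no fixed output is shown.

```java
// Hypothetical demo class: counts how deep recursion gets before the thread stack is full.
class StackDepthDemo {
    private static int depth = 0;

    // Every call pushes a new frame onto the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns the number of frames pushed before the JVM threw StackOverflowError.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack has unwound to this frame; the deeper frames are gone.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```

Catching `StackOverflowError` is done here only for demonstration; production code should normally fix the runaway recursion instead.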
Heap
- It's accessed via complex memory management techniques that include the
- Young Generation,
- Old or Tenured Generation,
- and Permanent Generation (replaced by Metaspace in Java 8)
- If heap space is full, Java throws java.lang.OutOfMemoryError.
- Access to this memory is comparatively slower than stack memory
- This memory, in contrast to stack, isn't automatically deallocated. It needs the Garbage Collector to free unused objects so that memory is used efficiently.
- Unlike stack, a heap isn't threadsafe and needs to be guarded by properly synchronizing the code.
Allocation
- Stack:
  - Function's local primitive variables
  - Object references in functions
  - Function calls
- Heap:
  - Static variables
  - Objects
  - Class footprints
  - And more...
However, this is not always the case. It depends on the JVM implementation.
Challenge #1:
From the following program, which are stored on heap and which are stored on stack?
// Application.java
class Application {
    private int counter = 0;

    public static void main(String[] args) {
        Application app = new Application();
        app.increase(3);
    }

    public void increase(int i) {
        int total = counter + i;
        this.counter = total;
    }
}
Challenge #2
Where is a primitive array stored? E.g:
int[] a = new int[100];
How GC works
Marking
- An object becomes eligible for GC when there are no more references to it, for example when:
  - the reference variable goes out of scope, or
  - the reference variable is set to null.
- The marking phase marks referenced objects to decide which objects are still "living".
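Eligibility can be observed from the outside with a `WeakReference`, which does not keep its target alive. A minimal sketch, assuming a hypothetical `EligibilityDemo` class: the first check is reliable because the strong reference is still in use, while the second depends on whether the JVM actually runs a collection, since `System.gc()` is only a request.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; not part of any library.
class EligibilityDemo {
    // Returns {reachable while strongly referenced, still present after clearing the reference}.
    static boolean[] run() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        // 'strong' is still used here, so the object cannot have been collected yet.
        boolean before = weak.get() == strong;

        strong = null;   // the last strong reference is gone: the object is now eligible
        System.gc();     // a hint only; the JVM may or may not collect right away

        boolean after = weak.get() != null;
        return new boolean[] {before, after};
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("Reachable while referenced: " + result[0]);
        System.out.println("Still present after System.gc(): " + result[1]);
    }
}
```

On HotSpot the second line usually reports `false`, but that is an observation about typical behaviour, not a guarantee.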
Normal deletion
- Remove unreferenced objects, provide free space.
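Marking followed by normal deletion can be sketched on a toy object graph. This is an illustration under stated assumptions, not how HotSpot is implemented: `ToyMarkSweep`, `Node`, and the list-based "heap" are hypothetical names. Note that the unreachable cycle `c <-> d` is still collected, because reachability from the roots is what counts, not reference counts.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a heap; names are hypothetical and not part of any JVM API.
class ToyMarkSweep {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Mark phase: everything reachable from the roots is considered live.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {          // first visit: follow its outgoing references
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    // Sweep phase: remove every heap object that was not marked; returns the freed names.
    static List<String> sweep(List<Node> heap, Set<Node> live) {
        List<String> freed = new ArrayList<>();
        heap.removeIf(n -> {
            boolean dead = !live.contains(n);
            if (dead) freed.add(n.name);
            return dead;
        });
        return freed;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);                  // a -> b
        c.refs.add(d);                  // c -> d
        d.refs.add(c);                  // d -> c (a cycle nobody else points to)
        List<Node> heap = new ArrayList<>(List.of(a, b, c, d));
        List<Node> roots = List.of(a);  // only 'a' is referenced from a "stack"
        return sweep(heap, mark(roots));
    }

    public static void main(String[] args) {
        System.out.println("Freed: " + demo()); // prints "Freed: [c, d]"
    }
}
```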
Delete with compacting
- Compact memory for easier allocation.
- During compaction, the memory addresses of surviving objects change, so references to them must be adjusted accordingly. This requires a "Stop the world" event.
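Why compaction forces address updates can be seen in a toy "sliding" compactor over an array. This is a sketch under stated assumptions: `ToyCompaction` and its forwarding table are hypothetical and only mimic the idea. Every live object that moves gets a new index, and anything still holding the old index would be wrong until it is rewritten, which is why the application is paused while this happens.

```java
import java.util.Arrays;

// Toy sliding compaction over an array-backed "heap"; purely illustrative.
class ToyCompaction {
    // heap[i] holds an object name, or null for a free slot left behind by deletion.
    // Returns a forwarding table: forwarding[oldIndex] = newIndex, or -1 for free slots.
    static int[] compact(String[] heap) {
        int[] forwarding = new int[heap.length];
        Arrays.fill(forwarding, -1);
        int next = 0; // next free slot at the compacted front of the heap
        for (int old = 0; old < heap.length; old++) {
            if (heap[old] != null) {
                forwarding[old] = next;
                heap[next] = heap[old];
                if (next != old) {
                    heap[old] = null; // the old location becomes free space
                }
                next++;
            }
        }
        return forwarding;
    }

    public static void main(String[] args) {
        String[] heap = {"a", null, "b", null, null, "c"};
        int[] forwarding = compact(heap);
        System.out.println(Arrays.toString(heap));       // [a, b, c, null, null, null]
        System.out.println(Arrays.toString(forwarding)); // [0, -1, 1, -1, -1, 2]
    }
}
```

After compaction the free space is one contiguous block at the end, so the next allocation only needs to bump a pointer.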
Challenge #3
import java.util.ArrayList;
import java.util.HashMap;

public class GCDemo {
    public static ArrayList<Object> l = new ArrayList<>();

    public void doIt() {
        HashMap<String, Object> m = new HashMap<>();
        Object o1 = new Object(); // line n1
        Object o2 = new Object();
        m.put("o1", o1);
        o1 = o2; // line n2
        o1 = null; // line n3
        l.add(m);
        m = null; // line n4
        System.gc(); // line n5
    }
}

// In the calling code:
GCDemo demo = new GCDemo();
demo.doIt();
demo = null; // line n6
When does the object created at line n1 become eligible for garbage collection?
Generational Garbage Collection
- Having to mark and compact all the objects in a JVM is inefficient.
- More objects lead to longer GC time
- Analysis of applications has shown that most objects are short lived, so the heap is split into generations:
- Young gen: for newly allocated objects
- Tenured / old gen: for long lived objects
- Metaspace: for the metadata of loaded classes
1. New objects are allocated to Eden.
2. When Eden is full, memory allocation fails.
3. Minor GC: mark live objects in Eden and move them to S0.
4. Repeat steps 1 and 2.
5. Minor GC: mark live objects in Eden and S0, and move them to S1. Eden and S0 should now be empty.
6. Repeat steps 1 and 2.
7. Minor GC: mark live objects in Eden and S1, and move them to S0. Eden and S1 should now be empty.
8. Any object whose age reaches MaxTenuringThreshold (default=15) during a minor GC is "promoted" to the Tenured (old gen) zone.
9. When Tenured is full, a major GC is executed.
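The Eden, survivor, and promotion steps above can be traced with a toy model. This is a simplified sketch with hypothetical names (`ToyGenerations`, `Obj`, `from`, `to`); liveness is supplied by the caller instead of being computed by marking, and the exact age comparison HotSpot uses is not reproduced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of a generational heap; class and field names are hypothetical.
class ToyGenerations {
    static final class Obj {
        final String name;
        int age = 0; // number of minor GCs survived so far
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();   // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();     // survivor space currently empty
    final List<Obj> tenured = new ArrayList<>();

    ToyGenerations(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);                      // new objects always start in Eden
        return o;
    }

    // Minor GC: copy live objects out of Eden and the occupied survivor space.
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!isLive.test(o)) continue;        // dead objects are simply not copied
            o.age++;
            if (o.age >= tenuringThreshold) tenured.add(o);
            else to.add(o);
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from; from = to; to = tmp; // the two survivor spaces swap roles
    }

    public static void main(String[] args) {
        ToyGenerations heap = new ToyGenerations(2); // tiny threshold so promotion shows quickly
        heap.allocate("keep");
        heap.allocate("temp");
        Predicate<Obj> live = o -> o.name.equals("keep");
        heap.minorGc(live);
        System.out.println("survivor=" + heap.from.size() + " tenured=" + heap.tenured.size()); // survivor=1 tenured=0
        heap.minorGc(live);
        System.out.println("survivor=" + heap.from.size() + " tenured=" + heap.tenured.size()); // survivor=0 tenured=1
    }
}
```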
Some other factors may cause Major GC:
- The developer calls System.gc() or Runtime.getRuntime().gc().
- During a minor GC, if the JVM is not able to reclaim enough memory from the Eden or survivor spaces, a major GC may be triggered.
- If the MaxMetaspaceSize option is set for the JVM and there is not enough space to load new classes, the JVM triggers a major GC.
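To see whether a call such as `System.gc()` actually caused a collection, the standard `java.lang.management` API exposes per-collector counters. A minimal sketch, assuming a hypothetical `GcCounters` class; the collector names printed depend on the GC in use (with G1 they are typically "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper that prints the JVM's own GC counters.
class GcCounters {
    // One line per collector: name, number of collections, accumulated time in ms.
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Before System.gc():");
        System.out.print(report());
        System.gc(); // a request; compare the counters to see whether it was honoured
        System.out.println("After System.gc():");
        System.out.print(report());
    }
}
```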
Types of GC
GC Performance factors
- Total heap size
- Young to old ratio
- Hardware
Performance goals
- Pause time
- Throughput
Serial GC
- Use a single thread to perform GC
- No threads communication overhead
- Good when hardware is limited and there are no pause time requirements, or for small applications with small data sets.
Parallel GC
- Use multiple threads to perform GC
- Better for machines with multiple processors/threads
- Good when throughput matters most and there are no pause time requirements (batch applications, backend services...)
G1 GC
- Does part of its work concurrently with the application
- Uses a lot of hardware resources
- Best for applications that need quick response times (web applications, time-sensitive services...)
GC configurations
GC parameters
Option | Description | Example |
---|---|---|
-Xms | Initial heap size | -Xms1G, -Xms1m |
-Xmx | Maximum heap size | -Xmx1G, -Xmx1m |
-XX:+UseXXXGC | Use the specified GC | -XX:+UseSerialGC |
-XX:+HeapDumpOnOutOfMemoryError | Create a heap dump on OutOfMemoryError | |
-XX:NewRatio | Old:young generation size ratio | -XX:NewRatio=2 (old gen = 2 × young gen) |
-XX:SurvivorRatio | Eden:survivor size ratio, per survivor space | -XX:SurvivorRatio=8 (eden = 8 × one survivor space) |
-XX:NewSize | Initial size of the young generation | -XX:NewSize=10m |
-XX:MaxNewSize | Maximum size of the young generation | -XX:MaxNewSize=10m |
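A quick way to check that options such as `-Xms` and `-Xmx` took effect is to ask the running JVM. A minimal sketch, assuming a hypothetical `HeapSettings` class; it uses only `Runtime` and the standard `RuntimeMXBean`. The reported maximum is usually close to, but not always exactly, the `-Xmx` value.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper that prints the heap settings the JVM is actually running with.
class HeapSettings {
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Flags passed on the command line, e.g. -Xms512m -Xmx512m -XX:+UseSerialGC
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);

        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (roughly -Xmx): " + toMiB(rt.maxMemory()) + " MiB");
        System.out.println("heap currently reserved: " + toMiB(rt.totalMemory()) + " MiB");
        System.out.println("free within reserved:    " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

For example, running it as `java -Xms256m -Xmx256m HeapSettings.java` should list both flags in the first line.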
HotSpot JVM defaults:
Option | Default Value |
---|---|
-XX:NewRatio | 2 |
-XX:NewSize | 1310 MB |
-XX:MaxNewSize | unlimited |
-XX:SurvivorRatio | 8 |
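The two ratios translate into sizes with simple arithmetic: the young generation gets 1/(NewRatio + 1) of the heap, and each survivor space gets 1/(SurvivorRatio + 2) of the young generation. A worked sketch with a hypothetical 1200 MB heap and a hypothetical `GenerationSizing` class; a real JVM also applies alignment and may resize generations adaptively, so actual numbers can differ.

```java
// Hypothetical calculator for generation sizes derived from NewRatio and SurvivorRatio.
class GenerationSizing {
    // old : young = newRatio : 1, so young is one part out of (newRatio + 1).
    static long youngSize(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    // eden : one survivor = survivorRatio : 1, and there are two survivor spaces.
    static long survivorSize(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heapMb = 1200;                        // assumed heap size for illustration
        long young = youngSize(heapMb, 2);         // 400
        long old = heapMb - young;                 // 800
        long survivor = survivorSize(young, 8);    // 40 (each of S0 and S1)
        long eden = young - 2 * survivor;          // 320
        System.out.println("young=" + young + " old=" + old
                + " eden=" + eden + " survivor=" + survivor);
        // prints: young=400 old=800 eden=320 survivor=40
    }
}
```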
General memory allocation guidelines:
- Try granting as much memory as possible to the virtual machine.
- The default size is often too small.
- Leave some space for other programs.
- Double-check for pause issues.
- Setting -Xms and -Xmx to the same value:
  - Increases predictability by removing the most important sizing decision from the virtual machine.
  - However, the virtual machine is then unable to compensate if you make a poor choice.
- Increase the memory as you increase the number of processors, because allocation can be made parallel.
General tuning strategy:
- First decide the maximum amount of memory that can be allocated. Then test performance with different young generation sizes to find the best setting.
- Keep the old generation large enough to hold all application data plus some additional space (10% ~ 20%).
- Avoid OutOfMemoryError.
- Avoid major GCs where possible.
Tools
- VisualVM + Visual GC plugin
- Java Flight Recorder
Coding best practices
Use primitives when possible
- Don't
Integer sum = 0;
- Do
int sum = 0;
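The cost of the boxed version shows up most clearly in a loop. A minimal sketch, assuming a hypothetical `BoxingDemo` class: both methods return the same result, but the boxed one may create a new `Long` on almost every iteration, which becomes garbage immediately. The JIT can sometimes remove such allocations, so treat this as an illustration rather than a benchmark.

```java
// Hypothetical demo comparing a boxed accumulator with a primitive one.
class BoxingDemo {
    static long sumBoxed(long n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i; // unboxes sum, adds, then boxes the result again (may allocate a Long)
        }
        return sum;
    }

    static long sumPrimitive(long n) {
        long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i; // plain arithmetic, no objects involved
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000));     // 499500
        System.out.println(sumPrimitive(1000)); // 499500
    }
}
```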
Avoid creating unnecessary objects
- Don't
int s = square(new Rectangle(10, 20));
- Do
int s = square(10, 20);
- Watch out for the "wombo combo" of select (*) + findAll() + eager fetch, which loads many objects that may never be used.
Use predefined instances or cached instances
- Don't
public List<String> getItems() {
    if (someCondition) {
        return new ArrayList<>(); // allocates a new empty list on every call
    }
    // ...
}
- Do
return Collections.emptyList();
- Don't
Integer x = new Integer(i);
- Do
Integer x = Integer.valueOf(i); // Some commonly used Integers are cached.
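The caching behaviour can be checked with identity comparisons. A minimal sketch, assuming a hypothetical `CachedInstances` class: `Integer.valueOf` is documented to cache values from -128 to 127, so the first check is reliable. Whether `Collections.emptyList()` returns the same instance every time is an implementation detail (it does on OpenJDK as far as is known), so that line is printed without a stated result. The `==` comparisons are used here only to expose object identity; normal code should compare with `equals`.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical demo of shared instances; '==' is used deliberately to compare identity.
class CachedInstances {
    static boolean sameSmallInteger() {
        Integer a = Integer.valueOf(100);
        Integer b = Integer.valueOf(100);
        return a == b; // same cached object for values in -128..127
    }

    static boolean sameEmptyList() {
        List<String> first = Collections.emptyList();
        List<String> second = Collections.emptyList();
        return first == second; // implementation-dependent, not guaranteed by the API
    }

    public static void main(String[] args) {
        System.out.println("Integer.valueOf(100) shared: " + sameSmallInteger()); // true
        System.out.println("emptyList() shared: " + sameEmptyList());
    }
}
```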
Stream and Optional
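- Use the primitive Stream and Optional variants (IntStream, LongStream, OptionalInt...) whenever possible
- Consider whether a plain for-each loop or a Stream is the better fit, since a Stream pipeline creates extra objects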
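The difference between boxed and primitive streams can be sketched as follows, using a hypothetical `PrimitiveStreams` class. `Stream<Integer>` carries every element as an `Integer` object, while `IntStream` and `OptionalInt` work on `int` values directly.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo comparing boxed and primitive stream pipelines.
class PrimitiveStreams {
    // Boxed: every element flowing through the pipeline is an Integer object.
    static int sumBoxed(int n) {
        return Stream.iterate(1, i -> i + 1).limit(n).reduce(0, Integer::sum);
    }

    // Primitive: the same computation on int values, without boxing.
    static int sumPrimitive(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    // OptionalInt avoids wrapping the result in Optional<Integer>.
    static OptionalInt firstEven(int[] values) {
        return IntStream.of(values).filter(v -> v % 2 == 0).findFirst();
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(100));                              // 5050
        System.out.println(sumPrimitive(100));                          // 5050
        System.out.println(firstEven(new int[] {3, 5, 8, 10}).getAsInt()); // 8
    }
}
```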
Use third party libs:
- Chronicle software: https://github.com/OpenHFT
Summary
- Stack & heap
- Mark, Sweep. Stop the world event
- Generational GC: Young gen, tenured
- Minor / Major GC
- GC types/ config
Common questions
- Can we (developers) trigger GC manually?
- Can we avoid garbage collection?
More readings
- Inside Java - A good site for reading Java and GC related articles
- ZGC - A low-latency GC, production-ready since Java 15
- Soft, Weak and Phantom Reference - Some other considerations on GC
- Project Valhalla - Improving performance of memory management
- Compact Strings