Garbage Collector introduction
Garbage collection (or GC) is an automated way to reclaim memory that is no longer in use so that it can be reused.
- Frees developers from having to manually release memory.
- Allocates objects on the managed heap efficiently.
- Reclaims objects that are no longer being used, clears their memory, and keeps the memory available for future allocations.
- Provides memory safety by making sure that an object cannot use for itself the memory allocated for another object.
- .NET: Yes.
- C: No. Developers have to allocate and deallocate memory manually. Forgetting to do so results in a memory leak.
- Rust: No. It uses an ownership mechanism instead.
- Affects application performance during execution
- Can cause "stop the world" events
- It grows and shrinks as new methods are called and returned, respectively.
- Variables inside the stack exist only as long as the method that created them is running.
- It's automatically allocated when a method is called and deallocated when the method finishes execution.
- If this memory is full, Java throws java.lang.StackOverflowError.
- Access to this memory is fast when compared to heap memory.
- This memory is threadsafe, as each thread operates in its own stack.
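The stack behaviour described above can be observed with deliberately unbounded recursion. This is a minimal sketch: the class and method names are made up for illustration, and the number of frames reached depends on the JVM, the frame size, and the `-Xss` setting, so no particular value is implied.

```java
// Sketch: every method call pushes a frame; without a base case the stack fills up.
class StackDepthDemo {
    static int depth = 0;

    // No base case on purpose: recursion continues until the stack is exhausted.
    static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many frames were pushed before java.lang.StackOverflowError was thrown.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The frames are popped while the error propagates, so the thread can continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before overflow: " + measureDepth());
    }
}
```

Running it with a larger thread stack (for example `-Xss4m`) should report a larger depth.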
- It's accessed via complex memory management techniques that include the
  - Young Generation,
  - Old or Tenured Generation,
  - and Permanent Generation (replaced by Metaspace since Java 8)
- If heap space is full, Java throws java.lang.OutOfMemoryError.
- Access to this memory is comparatively slower than stack memory
- This memory, in contrast to the stack, isn't automatically deallocated. It needs the garbage collector to free unused objects so that memory is used efficiently.
- Unlike stack, a heap isn't threadsafe and needs to be guarded by properly synchronizing the code.
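The last point can be illustrated with an object on the heap that several threads share. This is a small sketch with hypothetical names (`SharedCounterDemo`, `runThreads`); it uses `synchronized` as one possible way to guard the shared state, not the only one.

```java
// Sketch: one heap object shared by several threads, guarded by synchronized methods.
class SharedCounterDemo {
    private int count = 0; // this field lives inside the object, which is on the heap

    synchronized void increment() { count++; }

    synchronized int get() { return count; }

    // Starts `threads` workers that each increment the same object `perThread` times.
    static int runThreads(int threads, int perThread) {
        SharedCounterDemo shared = new SharedCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10_000)); // prints 40000
    }
}
```

Without `synchronized`, `count++` is a read-modify-write on shared heap data and updates can be lost.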
- Stack:
  - A function's local primitive variables.
  - Object references held in a function's local variables.
  - Function calls (stack frames)
- Heap:
  - Static variables
  - Objects
  - Class footprints (class metadata)
  - And more...
However, this is not always correct; it depends on the JVM implementation.
From the following program, which are stored on heap and which are stored on stack?
// Application.java
class Application {
    private int counter = 0;

    public static void main(String[] args) {
        Application app = new Application();
        app.increase(3);
    }

    public void increase(int i) {
        int total = counter + i;
        this.counter = total;
    }
}
Where is a primitive array stored? E.g:
int[] a = new int[100];
- An object is eligible for GC when there are no more references to it, for example when:
  - The reference variable goes out of scope
  - The reference variable is set to `null`
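Eligibility can be observed with a `WeakReference`, which does not keep its target alive. The sketch below assumes Java 9+ (for `Reference.reachabilityFence`), and the class name is hypothetical. Note that `System.gc()` is only a request, so the second printed result is typical rather than guaranteed.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Sketch: a strongly referenced object is not collected; an unreferenced one may be.
class EligibilityDemo {
    // Returns true if the object is still present after a GC request while a strong reference exists.
    static boolean survivesWhileReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // a request only
        boolean alive = weak.get() != null;
        Reference.reachabilityFence(strong); // keeps `strong` reachable up to this point
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + survivesWhileReferenced());

        Object o = new Object();
        WeakReference<Object> weak = new WeakReference<>(o);
        o = null;    // no strong reference remains, so the object is now eligible
        System.gc(); // the JVM may or may not collect it at this point
        System.out.println("cleared after null: " + (weak.get() == null));
    }
}
```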
- Mark referenced objects, decide which objects are still "living".
- Remove unreferenced objects, provide free space.
- Compact memory for easier allocation.
- During GC, the memory addresses of moved objects are adjusted accordingly. This requires a `Stop the world` event, during which application threads are paused.
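The mark, remove and compact steps above can be sketched on a toy object graph. This is an illustration only, not how HotSpot is implemented: `Node`, `heap` and `roots` are hypothetical stand-ins for real objects, the managed heap and the GC roots, and building a fresh dense list stands in for compaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy mark-and-sweep over an in-memory graph (illustrative, not a real collector).
class ToyMarkSweep {
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Mark: visit everything reachable from a root.
    static void mark(Node node, Set<Node> marked) {
        if (node == null || !marked.add(node)) {
            return; // already visited
        }
        for (Node ref : node.refs) {
            mark(ref, marked);
        }
    }

    // Sweep + compact: keep only marked nodes, packed into a new list.
    static List<Node> collect(List<Node> heap, List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        for (Node root : roots) {
            mark(root, marked);
        }
        List<Node> survivors = new ArrayList<>();
        for (Node n : heap) {
            if (marked.contains(n)) {
                survivors.add(n);
            }
        }
        return survivors;
    }

    // Builds a -> b (reachable) and a c <-> d cycle (unreachable), then collects.
    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Node> survivors = collect(Arrays.asList(a, b, c, d), Arrays.asList(a));
        List<String> names = new ArrayList<>();
        for (Node n : survivors) {
            names.add(n.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, b]
    }
}
```

The `c`/`d` cycle is removed even though the two nodes reference each other, because neither is reachable from a root.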
import java.util.ArrayList;
import java.util.HashMap;

public class GCDemo {
    public static ArrayList<Object> l = new ArrayList<>();

    public void doIt() {
        HashMap<String, Object> m = new HashMap<>();
        Object o1 = new Object(); // line n1
        Object o2 = new Object();
        m.put("o1", o1);
        o1 = o2; // line n2
        o1 = null; // line n3
        l.add(m);
        m = null; // line n4
        System.gc(); // line n5
    }
}

// Caller code, e.g. inside a main method:
GCDemo demo = new GCDemo();
demo.doIt();
demo = null; // line n6
When does the object created at line n1 become eligible for garbage collection?
- Having to mark and compact all the objects in a JVM is inefficient.
- More objects lead to longer GC time
- Analysis of applications has shown that most objects are short lived.
- Young gen: for newly allocated objects
- Tenured / old gen: for long lived objects
- Metaspace: for the metadata of loaded classes
1. New objects are allocated in `Eden`.
2. When `Eden` is full, memory allocation fails.
3. Minor GC: mark live objects in `Eden` and move them to `S0`.
4. Repeat steps 1 and 2.
5. Minor GC: mark live objects in `Eden` and `S0`, and move them to `S1`. `Eden` and `S0` should now be empty.
6. Repeat steps 1 and 2.
7. Minor GC: mark live objects in `Eden` and `S1`, and move them to `S0`. `Eden` and `S1` should now be empty.
8. Any object that reaches `MaxTenuringThreshold` (default=15) during a minor GC is "promoted" to the `Tenured` (old gen) zone.
9. When `Tenured` is full, a major GC is executed.
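The cycle above can be mimicked with a toy simulation. This is a simplified sketch, not the HotSpot algorithm: the class, the `alive` flag (standing in for reachability) and the exact promotion rule (promote once the age reaches the threshold) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of Eden, two survivor spaces and the tenured generation.
class ToyGenerations {
    static final int MAX_TENURING_THRESHOLD = 15; // default mentioned in the text

    static class Obj {
        final String name;
        int age = 0;          // number of minor GCs survived
        boolean alive = true; // stand-in for "still reachable"
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();   // survivor space currently empty
    final List<Obj> tenured = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // Minor GC: copy live objects out of Eden and `from`, then swap the survivor roles.
    void minorGc() {
        copySurvivors(eden);
        copySurvivors(from);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> space) {
        for (Obj o : space) {
            if (!o.alive) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age >= MAX_TENURING_THRESHOLD) {
                tenured.add(o); // promoted to the old generation
            } else {
                to.add(o);
            }
        }
    }

    // Allocates one long-lived and one short-lived object, runs N minor GCs,
    // and returns how many objects ended up in the tenured generation.
    static int tenuredAfter(int minorGcs) {
        ToyGenerations heap = new ToyGenerations();
        heap.allocate("longLived");
        Obj shortLived = heap.allocate("shortLived");
        shortLived.alive = false; // becomes garbage before the first minor GC
        for (int i = 0; i < minorGcs; i++) {
            heap.minorGc();
        }
        return heap.tenured.size();
    }

    public static void main(String[] args) {
        System.out.println(tenuredAfter(14)); // prints 0
        System.out.println(tenuredAfter(15)); // prints 1
    }
}
```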
Some other factors may cause Major GC:
- The developer calls System.gc() or Runtime.getRuntime().gc() (this is only a request; the JVM is not guaranteed to act on it)
- During minor GC, if the JVM is not able to reclaim enough memory from the eden or survivor spaces, then a major GC may be triggered.
- If we set the `MaxMetaspaceSize` option for the JVM and there is not enough space to load new classes, then the JVM triggers a major GC.
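To see how often collections actually run, the standard `java.lang.management` API exposes one bean per collector. The sketch below uses only that API; the class name is hypothetical, and which bean names appear (and whether a bean corresponds to minor or major collections) depends on the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: print collection counts and accumulated collection time per collector.
class GcStatsDemo {
    // Sums the collection counts of all collectors; undefined counts (-1) are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    static void printStats(String label) {
        System.out.println(label);
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("  " + gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }

    public static void main(String[] args) {
        printStats("Before System.gc():");
        System.gc(); // a request; the counts usually, but not necessarily, increase
        printStats("After System.gc():");
    }
}
```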
- Total heap size
- Young to old ratio
- Hardware
- Pause time
- Throughput
- Use a single thread to perform GC
- No inter-thread communication overhead
- Good when we have limited hardware and no pause time requirements, or a small application with a small dataset.
- Use multiple threads to perform GC
- Better suited to multi-processor / multi-threaded hardware
- Good when we need best performance and no Pause time requirements (batch applications, backend services...)
- Work concurrently with application
- Use lots of hardware resources
- Best for applications that need quick response times (web applications, time sensitive services...)
Option | Description | Example |
---|---|---|
-Xms | Initial heap size | -Xms1G, -Xms1m |
-Xmx | Maximum heap size | -Xmx1G, -Xmx1m |
-XX:+UseXXXGC | Use the specified GC | -XX:+UseSerialGC |
-XX:+HeapDumpOnOutOfMemoryError | Create a heap dump on OutOfMemoryError | |
-XX:NewRatio | Old:young generation ratio | -XX:NewRatio=2 (old gen = 2x young gen) |
-XX:SurvivorRatio | Eden:survivor space ratio | -XX:SurvivorRatio=8 (eden = 8x each survivor space) |
-XX:NewSize | Initial size of the young generation | -XX:NewSize=10m |
-XX:MaxNewSize | Maximum size of the young generation | -XX:MaxNewSize=10m |
HotSpot JVM defaults:
Option | Default Value |
---|---|
-XX:NewRatio | 2 |
-XX:NewSize | 1310 MB |
-XX:MaxNewSize | unlimited |
-XX:SurvivorRatio | 8 |
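The effect of the heap sizing options can be checked from inside a program with `java.lang.Runtime`. This is a small sketch; the class name is hypothetical, and the printed numbers depend entirely on the machine and on the flags passed.

```java
// Sketch: print the heap limits the running JVM actually uses.
class HeapSettingsDemo {
    private static final long MB = 1024L * 1024L;

    // Upper bound the heap may grow to (roughly what -Xmx controls).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap   (MB): " + maxHeapBytes() / MB);
        System.out.println("total heap (MB): " + rt.totalMemory() / MB); // currently reserved
        System.out.println("free heap  (MB): " + rt.freeMemory() / MB);  // unused part of total
    }
}
```

For example, running it as `java -Xms256m -Xmx256m HeapSettingsDemo.java` (Java 11+ single-file launch) should report a maximum close to 256 MB.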
- Unless pauses are a problem, try granting as much memory as possible to the virtual machine.
- The default size is often too small.
- Leave some memory for other programs.
- Double-check for pause issues after increasing the heap.
- Setting -Xms and -Xmx to the same value
- Increases predictability by removing the most important sizing decision from the virtual machine.
- However, the virtual machine is then unable to compensate if you make a poor choice.
- Increase the memory as you increase the number of processors, because allocation can be parallelized.
- First decide the maximum amount of memory that can be allocated to the heap. Then test performance with different young generation sizes to find the best setting.
- Keep the old generation large enough to hold all live application data, plus some additional space (10% ~ 20%)
- Avoids OutOfMemoryError
- Reduces the need for major GCs
- VisualVM + Visual GC plugin
- Java Flight Recorder
- Don't
Integer sum = 0;
- Do
int sum = 0;
- Don't
int s = square(new Rectangle(10, 20));
- Do
int s = square(10, 20);
- Wombo combo: select (*) + findAll() + eager fetch
- Don't
public List<String> getItems() {
    if (someCondition) {
        return new ArrayList<>(); // allocates a new empty list on every call
    }
    // ...
}
- Do
return Collections.emptyList(); // shared immutable instance, no allocation
- Don't
Integer x = new Integer(i); // always allocates; deprecated since Java 9
- Do
Integer x = Integer.valueOf(i); // values from -128 to 127 are cached and reused
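The caching behind `Integer.valueOf` can be checked by comparing references. This is a small sketch with a hypothetical class name; the cached range -128 to 127 is guaranteed by the language specification, while the result for larger values depends on JVM settings and is therefore left without an expected output.

```java
// Sketch: Integer.valueOf reuses cached instances for small values.
class IntegerCacheDemo {
    // Reference comparison on purpose: true means both calls returned the same object.
    static boolean sameInstance(int value) {
        Integer first = Integer.valueOf(value);
        Integer second = Integer.valueOf(value);
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(127));     // prints true (inside the cached range)
        System.out.println(sameInstance(100_000)); // typically false: two separate allocations
    }
}
```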
- Use primitive streams and optionals (`IntStream`, `OptionalInt`, ...) whenever possible to avoid boxing
- Consider whether a plain for-each loop or a Stream is the better fit
- Chronicle software: https://github.com/OpenHFT
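The tip about primitive streams and optionals can be sketched as follows. The class and method names are hypothetical; the boxed variant is shown only for contrast, since every element in it is an `Integer` object that the GC may later have to reclaim.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Sketch: the same computation with boxed and with primitive streams.
class PrimitiveStreamDemo {
    // Boxed: each element is an Integer object.
    static int sumBoxed(int n) {
        return Stream.iterate(1, i -> i + 1).limit(n).reduce(0, Integer::sum);
    }

    // Primitive: works on int values directly, with no per-element boxing.
    static int sumPrimitive(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    // OptionalInt avoids wrapping the result in Optional<Integer>.
    static OptionalInt firstEven(int[] values) {
        return IntStream.of(values).filter(v -> v % 2 == 0).findFirst();
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(100));     // prints 5050
        System.out.println(sumPrimitive(100)); // prints 5050
        System.out.println(firstEven(new int[] {3, 5, 8, 10}).getAsInt()); // prints 8
    }
}
```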
- Stack & heap
- Mark, Sweep. Stop the world event
- Generational GC: Young gen, tenured
- Minor / Major GC
- GC types/ config
- Can we (developers) trigger GC manually?
- Avoid garbage collection?
- Inside Java - A good site for reading Java and GC related articles
- ZGC - A low-latency GC, production-ready since Java 15
- Soft, Weak and Phantom Reference - Some other considerations on GC
- Project Valhalla - Improving performance of memory management
- Compact Strings