Heap dump analysis: Redundant String objects - Poweruser/MinetickMod GitHub Wiki
While hunting down the cause of recent high memory usage on our server, I came across some interesting findings. One of them is an incredible redundancy of String objects when NBTTags are stored in memory at runtime.
Both heap dumps below were taken right after server start. The first one without and the second one with this change:
Caching of highly recurrent Strings (NBTTag names and NBTTagString data), while loading NBT files
(This fix is not perfect, though. It has a downside: it also caches NBTTagString data. Values like “MSCorridor”, “MSRoom”, “MSCrossing” and some others in the structure data occur very frequently and must be cached, but a lot of other String data occurs only a few times, and those strings bloat the HashMap unnecessarily.)
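To make the idea concrete, here is a minimal sketch of such a cache. It is not the actual MinetickMod change: the class name `StringCache` and the method `dedup` are made up for illustration, and it assumes a single loading thread. It also shows the downside mentioned above, since every distinct string that passes through stays in the map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only, not the code from the commit.
final class StringCache {
    // Maps each distinct string value to the one instance we keep around.
    private final Map<String, String> cache = new HashMap<>();

    // Returns the canonical instance for the given value,
    // registering the value on first sight.
    String dedup(String s) {
        if (s == null) {
            return null;
        }
        String cached = cache.get(s);
        if (cached == null) {
            cache.put(s, s);
            return s;
        }
        return cached;
    }

    // Number of distinct values held; rarely used strings also count here.
    int size() {
        return cache.size();
    }
}

final class StringCacheDemo {
    public static void main(String[] args) {
        StringCache cache = new StringCache();
        // new String(...) simulates two separately loaded copies of the same tag value.
        String first = cache.dedup(new String("MSCorridor"));
        String second = cache.dedup(new String("MSCorridor"));
        System.out.println(first == second); // true: both refer to one object
        cache.dedup(new String("RareValue"));
        System.out.println(cache.size());    // 2: the rare value is kept as well
    }
}
```

In a loader, every name and string value read from an NBT file would go through `dedup` before being stored in a tag, so frequent values end up sharing one object.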
The reason for this insane number of identical strings is that the method readUTF(DataInput) in java.io.DataInputStream creates a new String object at its end, on every call:
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/java/io/DataInputStream.java#661
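The effect is easy to reproduce outside the server. The following small sketch (the class name `ReadUtfDemo` and the helper `readTwice` are made up for this page) writes the same value twice into an in-memory buffer and reads it back. The two results are equal in content but are separate objects on the heap.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustration only: shows that readUTF() hands out a fresh String per call.
final class ReadUtfDemo {
    // Writes the value twice in modified UTF-8 and reads both copies back.
    static String[] readTwice(String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(value);
            out.writeUTF(value);
            out.flush();

            DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return new String[] { in.readUTF(), in.readUTF() };
        } catch (IOException e) {
            // Not expected for in-memory streams.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] copies = readTwice("MSCorridor");
        System.out.println(copies[0].equals(copies[1])); // true: same content
        System.out.println(copies[0] == copies[1]);      // false: two objects
    }
}
```

With thousands of tags carrying the same few names, each read adds another copy, which is what the heap dumps show.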
This is with just 1.2 MB of Mineshaft.dat data:
Before:
After:
The following screenshot is from a server instance that had already been running (without the fix) for almost a day, with about 9.8 MB of Mineshaft.dat data: