Java Object Size

Source code for objsize
Estimates the size of Java objects. In the past I often used it to estimate the size of simple caches, which were mostly filled with static data from a DBMS and reloaded approximately every hour. This is old code: I remember implementing it when Java still had PermGen rather than Metaspace. Since the introduction of Metaspace, class pointers seem to be compressed by default.
Lately I used it to explain compact strings and string deduplication to a colleague (see below).

The code assumes a standard 64-bit JVM with compressed oops and compressed class pointers enabled. If the heap is larger than 32 GB, compressed oops are disabled at the default 8-byte object alignment, and older JVMs then disable compressed class pointers as well. Everything is based on assumptions. If you need near-perfect values, use JOL.
It does not work with class objects or special things like lambda arguments…

Object Header for a 5 byte array

Mark Word: 8 bytes on 64-bit. Stores various information, e.g. the identity hash code, synchronization (lock) state and GC information.
Compressed Class Pointer: 4 bytes. Points to the class metadata, which describes the object’s type and methods.
Size: 4 bytes. The length of the array.
Bytes: 5 bytes. The actual data.
Padding: 3 bytes. Aligns the object to a multiple of 8 bytes.
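
The arithmetic behind this layout can be written down as a small sketch. It is not part of objsize; the class and method names are made up for illustration, and it hard-codes the assumptions from above (8-byte mark word, 4-byte compressed class pointer, 4-byte array length, 8-byte alignment).

```java
// Illustrative sketch, not the objsize implementation.
// Assumes a 64-bit JVM with compressed class pointers and 8-byte alignment.
final class ArraySizeSketch {
    static final int MARK_WORD = 8;
    static final int CLASS_POINTER = 4; // compressed
    static final int ARRAY_LENGTH = 4;
    static final int ALIGNMENT = 8;

    // Rounds size up to the next multiple of alignment.
    static long align(long size, int alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Header + length field + payload, padded to the object alignment.
    static long sizeOfArray(int length, int elementSize) {
        long raw = MARK_WORD + CLASS_POINTER + ARRAY_LENGTH + (long) length * elementSize;
        return align(raw, ALIGNMENT);
    }

    public static void main(String[] args) {
        // 8 + 4 + 4 + 5 = 21 bytes, padded to 24
        System.out.println(sizeOfArray(5, 1) + " bytes");
    }
}
```

For the 5-byte array this yields the same 24 bytes as the list above: 21 bytes of header and data plus 3 bytes of padding.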

(Figure: layout of the 5-byte array)

Example:

PrintWriter out = new PrintWriter(System.out);
System.out.println("Size: " + SizeOfObj.sizeOf(new byte[] { 0, 1, 2, 3, 4 }, out, out) + " bytes");

Output:

[byte[]: #1
  0) : 0  ~1B
  1) : 1  ~1B
  2) : 2  ~1B
  3) : 3  ~1B
  4) : 4  ~1B
]  ~24B
ObjSize: ~24 Bytes

Count       Sum         DESCRIPTION 
1           24          [B          
1           24          Sum total

It is possible to change the object alignment if you need a bigger heap and still want to use compressed oops (compressed Java object references).
With an alignment of 16 bytes:

java version 21 (linux)

java -XshowSettings:vm -Xms33G -version

VM settings:
    Min. Heap Size: 33.00G
    Max. Heap Size (Estimated): 33.00G
    Using VM: OpenJDK 64-Bit Server VM

openjdk version "21.0.3" 2024-04-16
OpenJDK Runtime Environment (build 21.0.3+9-Ubuntu-1ubuntu122.04.1)
OpenJDK 64-Bit Server VM (build 21.0.3+9-Ubuntu-1ubuntu122.04.1, mixed mode, sharing)

java -XX:ObjectAlignmentInBytes=16 -Xmx63G -XX:+PrintFlagsFinal 2>/dev/null | grep -i usecompr
     
bool UseCompressedClassPointers     = true         {product lp64_product} {default}
bool UseCompressedOops              = true         {product lp64_product} {ergonomic}
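
The same flags can also be read from inside a running JVM. The following is a minimal sketch that assumes a HotSpot-based JVM where `com.sun.management.HotSpotDiagnosticMXBean` is available; the class name `CompressedFlagsSketch` is made up for illustration. On other JVMs, or if a flag does not exist in a given version, the lookup may fail.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch: query HotSpot VM flags at runtime (HotSpot-only assumption).
final class CompressedFlagsSketch {
    // Returns the current value of a VM option as a string, e.g. "true".
    // Throws IllegalArgumentException if the option does not exist.
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        String[] flags = {"UseCompressedOops", "UseCompressedClassPointers", "ObjectAlignmentInBytes"};
        for (String flag : flags) {
            System.out.println(flag + " = " + vmOption(flag));
        }
    }
}
```

Run it with the same options as above (for example `-XX:ObjectAlignmentInBytes=16 -Xmx63G`) to see what the JVM actually chose.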

A fun fact I noticed while testing this old code: Java 21 added a hash field to the Enum class, which means 4 bytes more memory per enum constant. I think that change was made by the guy who invented JOL.

{E: #1
  ^.hash : 0  ~4B
  ^.ordinal : 0  ~4B
  ^.name : {String: #2
    coder : 0  ~1B
    hash : 0  ~4B
    hashIsZero : false  ~1B
    value : [byte[]: #3
      0) : 69  ~1B
...

java version 17 (windows)

"C:\Program Files\Zulu\zulu-17\bin\java.exe" -XX:ObjectAlignmentInBytes=16 -Xmx63G -XX:+PrintFlagsFinal 2>NUL | findstr /i usecompr
bool UseCompressedClassPointers     = true          {product lp64_product} {ergonomic}
bool UseCompressedOops              = true          {product lp64_product} {ergonomic}

Compressed Oops: Reduce memory usage by shrinking object references in the heap from 8 to 4 bytes. They are found in object fields and in array elements, wherever a reference to another object in the heap is stored.
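
Why the alignment matters for the heap size can be shown with a little arithmetic. This is a sketch under the assumption that a compressed oop is a 32-bit value scaled by the object alignment; the class name is made up, and the JVM's real cutoff may be somewhat below this theoretical limit.

```java
// Illustrative sketch: theoretical heap range reachable with 32-bit compressed oops.
final class CompressedOopsLimitSketch {
    // 2^32 distinct values, each addressing one alignment unit.
    static long maxAddressableBytes(int objectAlignmentInBytes) {
        return (1L << 32) * objectAlignmentInBytes;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        System.out.println(maxAddressableBytes(8) / gib + " GiB");  // 32 GiB
        System.out.println(maxAddressableBytes(16) / gib + " GiB"); // 64 GiB
    }
}
```

This matches the observations above: 8-byte alignment stops at about 32 GB, while 16-byte alignment still allows compressed oops with `-Xmx63G`, at the price of more padding per object.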

Compact Strings and Deduplication

This example shows the compact strings feature and the string deduplication of the G1 GC.
Compact Strings: A 1-byte-per-character representation is used when a string contains only characters in the Latin-1 (ISO-8859-1) range. Unfortunately Latin-9 (ISO-8859-15) is not supported, so a euro sign (€) makes the string fall back to UTF-16 encoding ;-(

	// requires: import java.util.concurrent.ConcurrentLinkedDeque;
	private static final class TestStringDeDupUTF16 {
		ConcurrentLinkedDeque<String> str = new ConcurrentLinkedDeque<>();

		TestStringDeDupUTF16() 
		{
			for (int ia = 0; ia < 100; ia++) {
				for (int i = 0; i < 10; i++) {
					// ISO-8859-1 (Latin-1) has no euro sign => the string is stored as UTF-16
					str.add(new String("String€Deduplication123" + i));
					// byte[] size: (8 + 4) header + 4 array length + 48 bytes (24 chars * 2) = 64
				}
			}
			try {
				// trigger deduplication to consolidate the underlying byte arrays
				// run with -XX:+UseG1GC -XX:+UseStringDeduplication
				System.gc(); Thread.sleep(300);
				System.gc(); Thread.sleep(300);
				System.gc(); Thread.sleep(300);
				/*
				 * With string deduplication:
				 * Count       Sum         DESCRIPTION
				 * 10          XXX         [B
				 * Without:
				 * Count       Sum         DESCRIPTION
				 * 1000        XXXXX       [B
				 */
				
			} catch (InterruptedException e) {
				e.printStackTrace();
			}
		}
	}
--
TestStringDeDupUTF16 dedupUtf = new TestStringDeDupUTF16();
long size = SizeOfObj.sizeOf(dedupUtf, out);
System.out.println("Size: " + size + " bytes");

Output (deduplication has done its job):

Count       Sum         DESCRIPTION 
10          640         [B          
1           16          de.codecoverage.utils.TestDriver$TestStringDeDupUTF16
1000        24000       java.lang.String
1           24          java.util.concurrent.ConcurrentLinkedDeque
1001        24024       java.util.concurrent.ConcurrentLinkedDeque$Node
2013        48704       Sum total   
Size: 48704 bytes

With the € sign replaced by an @ sign (Latin-1, one byte per character):
Count       Sum         DESCRIPTION 
10          400         [B 
...
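
The 640 and 400 byte totals for the ten byte arrays can be reproduced with a rough estimate. This is only a sketch with made-up names: it assumes a 16-byte array header, 8-byte alignment, and that a string is stored as Latin-1 exactly when every character is encodable in ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch: estimated size of a String's backing byte[] with compact strings.
final class CompactStringSketch {
    // True if every character fits into ISO-8859-1 (one byte per character).
    static boolean fitsLatin1(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }

    // 16-byte array header plus 1 or 2 bytes per char, padded to 8 bytes.
    static long backingArraySize(String s) {
        long raw = 16 + (long) s.length() * (fitsLatin1(s) ? 1 : 2);
        return (raw + 7) / 8 * 8;
    }

    public static void main(String[] args) {
        // \u20AC is the euro sign; written as an escape to be independent of file encoding
        System.out.println(backingArraySize("String\u20ACDeduplication1230")); // 16 + 48 = 64
        System.out.println(backingArraySize("String@Deduplication1230"));      // 16 + 24 = 40
    }
}
```

Ten distinct strings give 10 * 64 = 640 bytes in the UTF-16 case and 10 * 40 = 400 bytes in the Latin-1 case.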

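String Deduplication: A feature of the garbage collector that can “merge” the byte arrays of strings with equal content, so that they share one array. The String objects themselves remain, as the 1000 java.lang.String entries above show.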

If you have many identical strings, you should use intern() or, better yet, consider a simple ConcurrentHashMap. This also allows the duplicate String objects themselves to be removed from the heap, so that each occurrence costs only a simple oop (4 bytes when compressed).
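
A minimal sketch of the ConcurrentHashMap approach could look like this. The class and method names are made up for illustration. Note that such a pool keeps every canonical string strongly reachable, so it only fits a bounded set of values.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: canonicalize equal strings to a single shared instance.
final class StringPoolSketch {
    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the first instance seen for this value. Later equal strings
    // are not stored and can be garbage collected.
    String canonical(String s) {
        String previous = pool.putIfAbsent(s, s);
        return previous != null ? previous : s;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        StringPoolSketch pool = new StringPoolSketch();
        String a = pool.canonical(new String("String@Deduplication1230"));
        String b = pool.canonical(new String("String@Deduplication1230"));
        System.out.println(a == b); // true: both references point to the same String
    }
}
```

In the cache example above, adding `pool.canonical(...)` around each string before storing it would leave 10 String objects instead of 1000, independent of the garbage collector in use.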