Hacker News

jonasn, last Tuesday at 2:23 PM (4 replies)

Hi HN, I'm the author of this post and a JVM engineer working on OpenJDK.

I've spent the last few years researching GC for my PhD and realized that the ecosystem lacked standard tools to quantify GC CPU overhead—especially with modern concurrent collectors where pause times don't tell the whole story.

To fix this blind spot, I built a new telemetry framework into OpenJDK 26. This post walks through the CPU-memory trade-off and shows how to use the new API to measure exactly what your GC is costing you.

I'll be around and am happy to answer any questions about the post or the implementation!


Replies

yunnpp, today at 2:22 AM

Hey, noob question, but does OpenJDK look at variable scope and avoid allocating on the heap to begin with if a variable is known to not escape the function's stack frame?

Not strictly related to this post, but I figured it'd be helpful to get an authoritative answer from you on this.

exabrial, today at 2:20 AM

I just want to say this is an incredibly detailed, well-written, and beautifully illustrated article. Solid work.

spockz, yesterday at 9:54 PM

Thank you for this interface! It will definitely help in tracking down GC-related performance issues and in selecting optimal settings.

One thing I still struggle with is seeing how much of a penalty our application threads suffer from other work, such as GC. In the blog you mention that the impact of GC comes not only from the CPU work of traversing and moving (old/live) objects, but also from the cost of thread pauses and barriers.

How can we detect these? Is there a way to export the data, for example via OpenTelemetry?

Currently I do it by running a load against an application and retaining its memory resources until the point where CPU usage skyrockets because of the sharply increasing number of GC cycles, and then comparing the CPU utilisation and the ratio of CPU used to work done.

Edit: it would be interesting to have the GC time spent added to a span. Even though that time is shared across multiple units of work, at least you can use it as a datapoint that the work was (significantly?) delayed by the GC occurring, or waiting for the required memory to be freed.
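A minimal sketch of the "GC time on a span" idea, using only the long-standing `java.lang.management` beans rather than the new OpenJDK 26 API from the post. The class and method names (`GcSpanSketch`, `measure`, `Measured`) are hypothetical. The deltas are process-wide, so they only say that GC ran while the work was in flight, and `getCollectionTime()` reports approximate elapsed collection time, not CPU time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.function.Supplier;

// Hypothetical helper: wraps a unit of work and records how much GC
// activity the whole JVM accumulated while that work was running.
final class GcSpanSketch {

    // Result of one measured unit of work (illustrative container only).
    static final class Measured<T> {
        final T result;
        final long wallNanos; // wall-clock duration of the work
        final long gcMillis;  // process-wide GC elapsed time accrued meanwhile
        final long gcCount;   // process-wide number of collections meanwhile

        Measured(T result, long wallNanos, long gcMillis, long gcCount) {
            this.result = result;
            this.wallNanos = wallNanos;
            this.gcMillis = gcMillis;
            this.gcCount = gcCount;
        }
    }

    // Sum of accumulated collection time across all collector beans.
    // A bean returns -1 when the value is undefined; those are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) total += t;
        }
        return total;
    }

    // Sum of collection counts across all collector beans (-1 skipped).
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();
            if (c > 0) total += c;
        }
        return total;
    }

    // Runs the work and returns its result plus the GC deltas around it.
    static <T> Measured<T> measure(Supplier<T> work) {
        long gcMillisBefore = totalGcMillis();
        long gcCountBefore = totalGcCount();
        long start = System.nanoTime();
        T result = work.get();
        long wallNanos = System.nanoTime() - start;
        return new Measured<>(result, wallNanos,
                totalGcMillis() - gcMillisBefore,
                totalGcCount() - gcCountBefore);
    }

    public static void main(String[] args) {
        // Allocation-heavy dummy workload so that some GC is likely to occur.
        Measured<Integer> m = measure(() -> {
            int sum = 0;
            for (int i = 0; i < 200_000; i++) {
                sum += new byte[1024].length;
            }
            return sum;
        });
        System.out.println("wall ms: " + m.wallNanos / 1_000_000
                + ", GC ms during work: " + m.gcMillis
                + ", GC count during work: " + m.gcCount);
    }
}
```

The two deltas could then be attached to a tracing span as numeric attributes; which tracing API to use is left open here.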

latchkey, today at 2:15 AM

I built this 15 years ago and it got fairly popular, but is long dead now...

https://github.com/jmxtrans/jmxtrans

Kind of amazing how people are still building telemetry into Java. Great post and great work. Keep it up.