Three Cache Layers Between Select and Disk

17 points by dlt, last Monday at 1:48 PM | 2 comments

Comments

mjb, yesterday at 10:29 PM

Cool article!

> This is why free -h on a Linux box can look alarming. You see almost no “free” memory, but most of it is “available” - and the page cache is using it.

And other buffers and stuff too. This is a great thing on bare metal, because it's a bet that the marginal cost of using an empty memory page is zero. That is always true on bare metal, but in containers or multi-tenant infrastructure it isn't true anymore. That's where stuff like DAMON comes in: https://www.kernel.org/doc/html/v5.17/vm/damon/index.html
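To make the "free" versus "available" distinction concrete, here is a minimal sketch that parses `/proc/meminfo`-style text and reports each as a share of total memory. The field names (`MemTotal`, `MemFree`, `MemAvailable`, `Cached`) are real `/proc/meminfo` keys, but the numbers in `SAMPLE` are made up for illustration, and `parse_meminfo` and `summarize` are hypothetical helper names.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def summarize(info):
    """Express free, available, and page-cache memory as a percent of total."""
    total = info["MemTotal"]
    return {
        "free_pct": 100.0 * info["MemFree"] / total,
        "available_pct": 100.0 * info["MemAvailable"] / total,
        "page_cache_pct": 100.0 * info["Cached"] / total,
    }


# Illustrative values only (an assumption, not taken from a real machine).
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:          400000 kB
MemAvailable:   12000000 kB
Buffers:          200000 kB
Cached:         11000000 kB
"""

if __name__ == "__main__":
    for name, pct in summarize(parse_meminfo(SAMPLE)).items():
        print(f"{name}: {pct:.2f}%")
```

On a real Linux host you could pass `open("/proc/meminfo").read()` instead of `SAMPLE`. In the sample, very little memory is free, yet most of it is available because the page cache can be reclaimed.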

In Aurora Serverless this kind of page cache management is a critical part of what the control plane does. Essentially we need to size the page cache to be big enough for great performance, but small enough not to cost the customer unnecessarily. We go into quite a lot of detail on that in our VLDB'25 paper: https://assets.amazon.science/ee/a4/41ff11374f2f865e5e24de11...

> Linux fills free memory with page cache on purpose. It’s a bet: if someone reads this block again, I already have it.

This works because most database workloads have great temporal and spatial locality. And it works well. But it's also one of the biggest practical issues people run into with relational databases in production: performance is great until it isn't. The shared buffers and page cache keep disk reads near zero, but when the working set grows even a tiny bit bigger than the cache, the rate of reads can go up super quickly.
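The cliff is easy to reproduce with a toy model. The sketch below assumes a strict LRU cache and a repeated sequential scan, which is the worst case for LRU; real systems differ (PostgreSQL's shared buffers use a clock-sweep policy, and the Linux page cache is not strict LRU either), so treat it as an illustration of the shape of the problem rather than a model of any particular database. `lru_miss_rate` and `cyclic_scan` are hypothetical names.

```python
from collections import OrderedDict


def lru_miss_rate(cache_pages, accesses):
    """Replay page accesses against a strict LRU cache; return the miss fraction."""
    cache = OrderedDict()
    misses = 0
    total = 0
    for page in accesses:
        total += 1
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
        else:
            misses += 1
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return misses / total if total else 0.0


def cyclic_scan(working_set_pages, passes):
    """Yield page ids 0..working_set_pages-1, repeated `passes` times."""
    for _ in range(passes):
        for page in range(working_set_pages):
            yield page


if __name__ == "__main__":
    cache_pages = 100
    for working_set in (90, 100, 101, 110):
        rate = lru_miss_rate(cache_pages, cyclic_scan(working_set, passes=10))
        print(f"working set {working_set:>3} pages: miss rate {rate:.2f}")
```

With a 100-page cache, a 100-page working set only misses on the first pass. At 101 pages, every access evicts the page that will be needed next, so every access misses.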

This is why in both Aurora Serverless and Aurora DSQL we do buffer and cache sizing very dynamically, getting rid of this cliff for most workloads.

kevin_nisbet, yesterday at 11:41 PM

Thanks for sharing, this was a great read. And it brought up some old memories of fighting with telecom vendors about how they calculated their memory usage. At that time you only really got what the vendors gave you.

I think the most fun I had with the page cache was when one of the vendors "fixed" a bug where the memory calculation had previously excluded the page cache. These boxes were all network services; they didn't rely on the disk for anything more than holding the binaries, configuration, and logs. Where the fun comes in is the first time you grep through the logs, which fills the page cache and sets off the alarms that the cellular network is about to die. ;)
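A rough sketch of how that kind of alarm can flip, assuming a simple "used = total - free" formula with an optional page cache adjustment. The vendor's actual calculation isn't known, and the threshold and memory figures here are invented for illustration.

```python
ALARM_THRESHOLD_PCT = 90.0  # hypothetical alarm threshold


def used_pct(total_kb, free_kb, cache_kb, count_cache_as_used):
    """Percent of memory reported as used, with or without the page cache."""
    used = total_kb - free_kb
    if not count_cache_as_used:
        used -= cache_kb  # treat reclaimable page cache as not used
    return 100.0 * used / total_kb


if __name__ == "__main__":
    total = 8_000_000
    # Made-up snapshots: before and after grepping through large log files.
    snapshots = {
        "before grep": {"free_kb": 6_000_000, "cache_kb": 500_000},
        "after grep": {"free_kb": 500_000, "cache_kb": 6_000_000},
    }
    for label, snap in snapshots.items():
        for count_cache in (False, True):
            pct = used_pct(total, snap["free_kb"], snap["cache_kb"], count_cache)
            alarm = "ALARM" if pct >= ALARM_THRESHOLD_PCT else "ok"
            print(f"{label}, cache counted={count_cache}: {pct:.2f}% {alarm}")
```

The process memory is the same in both snapshots; only the accounting of the cache changes whether the alarm fires.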

Anyways, important concept to know and understand when it comes to how software performs when interacting with a host.