Nothing about PDF is easy. Much like what Tom Scott once said about time zones, every time I have to deal with PDFs I pray that PDF.js can be hacked into doing it instead; otherwise I just don’t bother.
It’s one of the few cases where converting the PDF into a picture and chucking it into a multimodal LLM is a more sensible solution than trying to parse it.