You can almost tell the "era" that a solution was built in these days since things are changing so fast.
Mid-2026, we have very large context windows, and much smarter models than we did in 2024 when this was built. If I were to tackle this today I'd ask a current frontier model to work through the source data and design a hierarchy that would give it the ability to sift through the content itself by drilling down as it sees fit, and I expect it would nail that.
i can almost tell then you have not done anything like this in production scale. context window size is irrelevant.
It would not, and you would know that if you actually evaluated the results.