Nice work. pg_search has been on my radar for a while, having BM25 natively in Postgres instead of bolting on Elasticsearch is a huge DX win. Curious about the index build time on larger datasets though. I'm working with ~2M row tables and the bottleneck for most Postgres extensions I've tried isn't query speed, it's the initial indexing. Any benchmarks on that?
had a bad experience with pg_search (paradedb) in the past
Yep, there are numbers in the blog post and repo. We are able to index MS-MARCO v2 (138M documents, around 50GB of raw data) in a bit under 18 minutes.