Hacker News

Gormo, last Thursday at 12:16 PM (2 replies)

That's not really random access, though. You're effectively just searching through the entire dataset for every targeted read you're after.

What might be interesting is to have a tool that processes full JSON data and creates a b-tree index on specified keys. Then you could run searches against the index that return byte offsets you can use for actual random access on the original JSON.
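A minimal sketch of that idea, under some loud assumptions: the data is a single top-level JSON array of objects, the indexed key holds mutually comparable values, and a sorted list searched with `bisect` stands in for a real B-tree. The names `build_index` and `lookup` are made up for illustration and do not come from any existing tool.

```python
import bisect
import io
import json


def build_index(data: bytes, key: str):
    """Scan a top-level JSON array once; return (keys, spans) sorted by key.

    spans[i] is the (byte_start, byte_end) of the object whose `key` is keys[i].
    Assumes well-formed input; a sorted list is a stand-in for a B-tree.
    """
    text = data.decode("utf-8")
    decoder = json.JSONDecoder()
    entries = []
    pos = text.index("[") + 1
    byte_pos = len(text[:pos].encode("utf-8"))  # byte offset matching `pos`
    while True:
        start = pos
        while text[pos] in " \t\r\n,":  # skip separators between elements
            pos += 1
        if text[pos] == "]":
            break
        byte_pos += len(text[start:pos].encode("utf-8"))
        obj, end = decoder.raw_decode(text, pos)
        byte_end = byte_pos + len(text[pos:end].encode("utf-8"))
        if isinstance(obj, dict) and key in obj:
            entries.append((obj[key], byte_pos, byte_end))
        pos, byte_pos = end, byte_end
    entries.sort()
    keys = [e[0] for e in entries]
    spans = [(e[1], e[2]) for e in entries]
    return keys, spans


def lookup(f, keys, spans, value):
    """Binary-search the index, then seek and parse only the matching bytes."""
    i = bisect.bisect_left(keys, value)
    if i == len(keys) or keys[i] != value:
        return None
    start, end = spans[i]
    f.seek(start)
    return json.loads(f.read(end - start))


raw = '[{"id": 3, "name": "é"}, {"id": 1, "name": "a"}]'.encode("utf-8")
keys, spans = build_index(raw, "id")
record = lookup(io.BytesIO(raw), keys, spans, 3)
```

Building the index still costs one full pass over the data. The gain only shows up on later reads, which touch just the bytes of the record they want.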

OTOH, this is basically just recreating a database, just using raw JSON as its storage format.


Replies

creationix, last Thursday at 5:39 PM

> What might be interesting is to have a tool that processes full JSON data and creates a b-tree index on specified keys. Then you could run searches against the index that return byte offsets you can use for actual random access on the original JSON.

I did build that once. But keeping track of the index is a pain. Sometimes I was able to generate the index on-demand and cache it in some ephemeral storage, but overall it didn't work out so well.
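To illustrate why tracking a separate index gets painful, here is a rough sketch of the on-demand-and-cache approach. It is not what I actually built. The sidecar file name, the `load_or_build_index` helper, and the size-plus-mtime staleness check are all assumptions made for the example, and `build` is any caller-supplied function that returns a JSON-serializable index.

```python
import json
import os


def load_or_build_index(data_path, build):
    """Return a cached index if it still matches the data file, else rebuild.

    The cache is a hypothetical sidecar file next to the data. It is treated
    as stale whenever the data file's size or modification time has changed.
    """
    st = os.stat(data_path)
    stamp = [st.st_size, st.st_mtime_ns]
    cache_path = data_path + ".idx.json"
    try:
        with open(cache_path, "r", encoding="utf-8") as f:
            cached = json.load(f)
        if isinstance(cached, dict) and cached.get("stamp") == stamp:
            return cached["index"]
    except (OSError, ValueError):
        pass  # missing or corrupt cache: fall through and rebuild
    index = build(data_path)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump({"stamp": stamp, "index": index}, f)
    return index
```

Even this small version has two files that can drift apart, a staleness heuristic that can be fooled, and a cache location that has to be writable.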

This system with RX will work better because I get the indexes built into the data file and can always convert it back to JSON if needed.

dietr1ch, last Thursday at 3:04 PM

Well, JSON never had random access to begin with, so maybe the real problem is needing JSON at all.

Maybe it would work to run the query over the random-access file and then convert just the result into JSON?