Reminds me of when I tried to use the Library of Babel as a data compression tool. It led me down a fun rabbit hole and was my first introduction to information theory.
The conclusion being that you basically need the same amount of data to represent the address of your data as the data itself, so it's not really effective at compression, just a fun thought experiment.
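A minimal sketch of why the address ends up as big as the data, assuming a toy 29-symbol alphabet and treating the "library" as every string of a given length in order. The names `ALPHABET`, `text_to_address` and `address_to_text` are made up for illustration and are not how any actual Library of Babel site works. The address is just the text read as a base-29 number, so it needs about as many bits as the text does.

```python
import math

# Toy alphabet, assumed for illustration: 26 letters, space, comma, period.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."


def text_to_address(text: str) -> int:
    """Index of `text` among all strings of the same length (a base-29 number)."""
    base = len(ALPHABET)
    address = 0
    for ch in text:
        address = address * base + ALPHABET.index(ch)
    return address


def address_to_text(address: int, length: int) -> str:
    """Inverse of text_to_address; the length is needed to restore leading 'a's."""
    base = len(ALPHABET)
    chars = []
    for _ in range(length):
        address, digit = divmod(address, base)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


text = "the address is as long as the text."
address = text_to_address(text)

# Information content of the text vs. size of its address, both in bits.
bits_for_text = len(text) * math.log2(len(ALPHABET))
bits_for_address = address.bit_length()
print(f"text: {bits_for_text:.1f} bits, address: {bits_for_address} bits")
```

The round trip `address_to_text(address, len(text))` gives the text back, but nothing was saved: the two bit counts come out essentially equal.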
The cool part in modern times is that LLMs are basically a form of lossy compression that achieves the gist of what these tools fail at, although it is lossy and requires a massive substrate. This is related to the idea of AI/LLMs being a form of language compression.
In some sense, science is the most extreme form of compression - Newtonian mechanics explains an incredible number of phenomena in a few lines of text.
That conclusion is similar to the concept of 'unconditional security' especially WRT one-time pads. The key must be at least as long as the message itself.
Other forms of encryption are based on assumptions and conditions being true (e.g. factoring is a hard problem, etc.) that may or may not be true. We don't know.
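For anyone who hasn't seen a one-time pad, here is a rough sketch of the "key at least as long as the message" requirement. The function names are mine, and this is only an illustration, not something to use for real secrets: the pad must be truly random, kept secret, and never reused.

```python
import secrets


def otp_encrypt(message: bytes, key: bytes) -> bytes:
    """XOR each message byte with one key byte; the key cannot be shorter."""
    if len(key) < len(message):
        raise ValueError("key must be at least as long as the message")
    return bytes(m ^ k for m, k in zip(message, key))


# XOR is its own inverse, so decryption is the same operation.
otp_decrypt = otp_encrypt

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # one fresh random byte per message byte
ciphertext = otp_encrypt(message, key)
recovered = otp_decrypt(ciphertext, key)
```

The parallel with the Library of Babel address is that the key carries as many bits as the message, so nothing is saved in either case.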
3Blue1Brown just released a video about this Intelligence-Compression connection.
The level of compression is pretty impressive when you think about it. I wrote a comment a while back which is still true (although bytes should be bits, so in that sense it’s still wrong): https://news.ycombinator.com/item?id=39559969
Back of the envelope calculation for storing valid 4-grams (sequences of four words) is around 10 billion x 14 bits per word = 17 GB for all 10 billion. There are LLMs 100x smaller which can write coherent prose.
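The arithmetic above, spelled out as a quick check. I'm assuming the 14 bits corresponds to a vocabulary of roughly 16k words (2**14) and that GB means 10**9 bytes; both are my reading, not something stated in the comment. The quoted figure matches 10 billion x 14 bits. If each of the four words in a 4-gram is stored separately with no sharing, the naive total is four times that.

```python
n_grams = 10_000_000_000   # 10 billion valid 4-grams, as in the comment
bits_per_word = 14         # assumed: 2**14 = 16,384-word vocabulary
words_per_ngram = 4

# The figure as stated: 10 billion x 14 bits, converted to bytes.
as_stated_bytes = n_grams * bits_per_word // 8

# Naive storage of all four words of every 4-gram, no prefix sharing.
all_words_bytes = n_grams * words_per_ngram * bits_per_word // 8

print(f"as stated: {as_stated_bytes / 1e9:.1f} GB")
print(f"four words per 4-gram: {all_words_bytes / 1e9:.1f} GB")
```

Either way the point stands: a plain lookup table of 4-grams is far larger than a small model that can write coherent prose.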
LLMs seem to be the weird interesting outcome of applying lossy (de)compression concepts to text instead of the audio/image/video domains where they have traditionally been used.
If you set temperature to 0.0 you almost have a key-value store, but finding the right key for your value might take some effort.
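A toy sketch of what temperature 0.0 does, using made-up logits rather than a real model. `sample_token` and `toy_logits` are hypothetical names, not any library's API. At temperature 0 the usual convention is to take the highest-scoring token (greedy decoding), so the same prompt maps to the same output, which is the key-value behaviour described above. Real inference stacks can still show small nondeterminism, which is perhaps where the "almost" comes from.

```python
import math
import random


def sample_token(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Pick the next token from toy logits at the given temperature."""
    if temperature == 0.0:
        # Greedy decoding: always the highest-scoring token, no randomness.
        return max(logits, key=logits.get)
    # Softmax over logits / temperature, shifted by the max for stability.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())
    weights = [math.exp(score - top) for score in scaled.values()]
    return rng.choices(list(scaled), weights=weights, k=1)[0]


# Made-up scores for the token following some fixed prompt (the "key").
toy_logits = {"Paris": 3.2, "Lyon": 1.1, "Berlin": 0.3}
rng = random.Random(0)

greedy_outputs = {sample_token(toy_logits, 0.0, rng) for _ in range(100)}
sampled_outputs = [sample_token(toy_logits, 1.0, rng) for _ in range(100)]
```

`greedy_outputs` collapses to a single token, while `sampled_outputs` can vary from draw to draw.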
> you basically need the same amount of data to represent the address of your data as the data itself
Almost like the other Borges work where “the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire”.
You'll find this an interesting watch:
Reinventing Entropy Compression is Intelligence Part 1
3blue1brown https://youtu.be/l6DKRf-fAAM?is=ne73FCJ7ErXhzZ-v