For anyone who is curious about the first paragraph here, this is actually a great video overview of how LLMs work and the tokenization part.
Tangentially related: this part always seemed fuzzy to me, especially when dealing with data scientists and the way they describe how 'ML' looks at problems. I ran into this while working at a SIEM vendor, where they kept insisting that use case development had to be designed a certain way to catch things. It was all very frustrating.
> this is actually a great video overview of how LLMs work and the tokenization part
Did you mean to link to the video? I would be interested.