I don’t understand how lines of code matter at all for the scary core capabilities of LLMs. Does the transformer architecture get better with more lines of code?
My impression was that LLM training codebases were 99% resource management and only a few lines actually implement the core training algorithm, which is where 100% of the intelligence comes from. Data, not lines of code, are the constraint.
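To make the "few lines" point concrete, here is a toy sketch of what I mean by the core training algorithm: plain gradient descent on a tiny linear model. It is not taken from any real LLM codebase; the `train` function, the data, and the learning rate are all made up for illustration. A real training run wraps this same kind of loop in a huge amount of sharding, checkpointing, and data-loading code.

```python
# Toy sketch: the "core training algorithm" as plain gradient descent.
# The linear model, the data, and the hyperparameters are illustrative
# assumptions, not anything from a real LLM training codebase.

def train(data, lr=0.05, steps=500):
    w, b = 0.0, 0.0  # model parameters
    n = len(data)
    for _ in range(steps):
        # Gradient of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # Parameter update: this is the whole "learning" step.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Five points on the line y = 3x + 1.
data = [(x, 3.0 * x + 1.0) for x in range(-2, 3)]
w, b = train(data)
```

On this data the loop should settle very close to w = 3 and b = 1. Adding more code around it changes how fast or how reliably it runs, not what it can learn from the data.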
After training you can adapt the intelligence in various ways, and that takes a bunch of lines of code too. But you can't raise the intelligence ceiling again without another training run. So where is the scary recursive part?