tavavex • last Thursday at 5:41 PM • 2 replies

Most LLMs are trained on a lot of the source code for many open-source projects. This 'project' has the whole song-and-dance about never seeing the source code and separating the system to skirt around legal trouble. Why hasn't anyone done that yet?

Replies

imiric • last Thursday at 5:55 PM

Because that's impossible. Any "robot" that can generate code must be trained on massive amounts of code, most of which is open source.

preisschild • last Thursday at 6:47 PM

Not a lot of code is in the public domain, and thus not a lot of training data is available.
