I also thought of these after writing my comment. The main problems that I see with these solutions are:
- Training seems to need a lot of data available at the same time, which is difficult to handle on commodity hardware.
- Manual curation can be a mind-numbing task; it might need to be gamified somehow.
There is a chance that the curation could end up higher quality than the current corporate stuff. I'm pretty sure that writing like a TED talk is not an intrinsic property of LLMs.