Hacker News

redman25 · yesterday at 3:04 PM · 0 replies

Many older models are still better at "creative" tasks because newer models have been optimized against code and reasoning benchmarks. Pre-training is what gives a model its creativity, and layering SFT and RL on top tends to strip some of it away in exchange for instruction following.
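
One concrete place you can see that trade-off is the KL penalty used in common RLHF setups: the fine-tuned policy is explicitly regularized toward the frozen pre-trained (reference) model, and the penalty coefficient is the dial between keeping the pre-trained distribution's breadth and pushing hard on the reward. Here is a minimal PyTorch sketch of that per-token shaped-reward formulation, assuming the widely used log-ratio approximation of the KL term (function and variable names are illustrative, not from any particular library):

    import torch
    import torch.nn.functional as F

    # Illustrative sketch of the KL-regularized reward used in many RLHF
    # recipes: the task reward is offset by how far the trained policy
    # drifts from the frozen pre-trained reference model.
    def rlhf_token_rewards(task_reward, policy_logits, ref_logits,
                           token_ids, beta=0.1):
        """Per-token shaped reward: -beta * KL(policy || ref), plus the
        task reward added at the final token.

        task_reward:   scalar from the reward model (assumed given)
        policy_logits: (seq, vocab) logits from the model being trained
        ref_logits:    (seq, vocab) logits from the frozen reference model
        token_ids:     (seq,) tokens that were actually sampled
        beta:          KL coefficient; higher values keep the policy
                       closer to the pre-trained distribution
        """
        logp = F.log_softmax(policy_logits, dim=-1)
        ref_logp = F.log_softmax(ref_logits, dim=-1)
        # Log-ratio on the sampled tokens approximates the per-token KL.
        idx = token_ids.unsqueeze(-1)
        kl = (logp.gather(-1, idx) - ref_logp.gather(-1, idx)).squeeze(-1)
        shaped = -beta * kl
        shaped[-1] += task_reward  # task reward applied at sequence end
        return shaped

With beta near zero the optimizer is free to collapse onto whatever narrow outputs the reward model likes; a larger beta preserves more of the pre-trained distribution's mass, which is roughly the "creativity" being traded away.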