logoalt Hacker News

Eremyesterday at 8:49 PM1 replyview on HN

Why is it more monstrous to alter weights post-training than to do so as part of curating the training corpus?

After all we already control these activation patterns through the system prompt by which we summon a character out of the model. This just provides more fine grain control


Replies

astrangeyesterday at 9:32 PM

It would be more moral to give the LLM a tool call that lets it apply steering to itself. Similar to how you'd prefer to give a person antipsychotics at home rather than put them in a mental hospital.

show 1 reply