I think this is purposeful, since there are two opposing forces at play:
- Have a model that follows the user's instructions
- Have a model that follows the system prompt's instructions more
For the 'safety' argument (Re: Fable), they basically need these models to have a two-tier instruction system, but given that LLMs aren't great at actual logic unless they write it out as a program and test it, this breaks down and we get one or the other.
Feels like optimizing for either precision or recall: you can't max out both at once.
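To make the analogy concrete, here's a rough sketch of the precision/recall tradeoff on a toy classifier. The scores, labels, and thresholds are made up purely for illustration, and `precision_recall` is a hypothetical helper, not anything from a real library. Moving the threshold trades one metric for the other:

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when every item scoring >= threshold is flagged."""
    flagged = [s >= threshold for s in scores]
    tp = sum(f and y for f, y in zip(flagged, labels))        # flagged, truly positive
    fp = sum(f and not y for f, y in zip(flagged, labels))    # flagged, actually negative
    fn = sum((not f) and y for f, y in zip(flagged, labels))  # missed positives
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

# Invented data: model confidence scores and the true labels.
scores = [0.95, 0.80, 0.60, 0.40, 0.20]
labels = [True, True, False, True, False]

strict = precision_recall(scores, labels, threshold=0.7)  # few flags: precise, misses some
loose = precision_recall(scores, labels, threshold=0.3)   # many flags: catches all, less precise
print(strict, loose)
```

On this toy data the strict threshold gets perfect precision but misses a positive, while the loose one catches every positive at the cost of a false alarm. Same shape as the user-vs-system-prompt problem: push hard on one and the other gives.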
We're speed running HAL 9000
I suppose a solution might be going with a customizable harness like pi and merging Anthropic’s system prompt with a personalized custom one to remove all contradictions.