I'm still not sure what safeguards they could actually be adding here. Unless they've suddenly solved alignment, isn't it at best a collection of system prompts saying what not to do, plus maybe some screening algorithms that try to catch key phrases in inputs and outputs?