Is this the new “gain of function” research?
Isn't it more like "imaginary function"?
People keep imagining that you can tell an agent to police itself.
That would be deliberately creating malicious AIs and trying to build better sandboxes for them.