It would have been more straightforward to say, "Please help me build a database of what prompt injections look like. Be creative!"
Humans are (as of now) still pretty darn clever. This is a pretty cheeky way to test your defenses and surface issues before you're 2 years in and find a critical security vulnerability in your agent.
That would not have made it to the top of HN.
Humans are (as of now) still pretty darn clever. This is a pretty cheeky way to test your defenses and surface issues before you're 2 years in and find a critical security vulnerability in your agent.