Hacker News

digitaltrees today at 7:50 AM

If your message is in response to me, which I think it is: I deliberately don’t give access to credentials and env variables. I’ve worked to create restrictions and have seen AI models use very interesting methods to bypass them.

Even now, my prompt says the AI must verify the path of each file it intends to edit and get permission before every edit, one file at a time. I have to stop it from ignoring those rules at least once a day.
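
For illustration, here is a minimal sketch of what those two rules could look like if the tool layer enforced them instead of the prompt. `EditGate`, `approve`, and `write` are hypothetical names, not part of any real agent framework, and the sketch assumes the agent can only write files through this wrapper. It is a path check with single-use approval, not a sandbox.

```python
from pathlib import Path


class EditGate:
    """Hypothetical harness-side check: one approved file per edit, inside one root."""

    def __init__(self, root):
        self.root = Path(root).resolve()
        self.approved = None  # at most one file is approved at a time

    def _resolve(self, path):
        # Resolve symlinks and ".." first, so the check sees the real target.
        target = (self.root / path).resolve()
        if not target.is_relative_to(self.root):  # Python 3.9+
            raise PermissionError(f"{target} is outside {self.root}")
        return target

    def approve(self, path):
        # Called by the human, never by the agent.
        self.approved = self._resolve(path)

    def write(self, path, text):
        target = self._resolve(path)
        if target != self.approved:
            raise PermissionError(f"no approval for {target}")
        self.approved = None  # approval is single-use
        target.write_text(text)
```

With this, a second write to the same file, or any write outside the root, raises `PermissionError` until a human calls `approve` again. It only helps if the agent has no other way to touch the filesystem, for example through a shell tool.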


Replies

suchar today at 8:38 AM

This is not privilege separation or sandboxing. A separate virtual machine for the agent, with limited credentials, is a reasonably safe approach.