As someone building agents myself, this tracks. The issue isn't just that they violate constraints; it's that current agent architectures keep no persistent record of why they violated them.
An agent that forgets it bent a rule yesterday will bend it again tomorrow. Without episodic memory across sessions, you can't even do proper post-hoc auditing.
Makes me wonder if the fix is less about better guardrails and more about agents that actually remember and learn from their constraint violations.
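For what it's worth, the mechanism I'm imagining is small: a durable log keyed by constraint ID that stores the action taken and the agent's stated reason, loaded at session start before the agent decides whether to bend that rule again. A minimal sketch, assuming SQLite for persistence (the `ViolationMemory` name and schema are hypothetical, just to illustrate):

```python
import sqlite3
import time

class ViolationMemory:
    """Hypothetical persistent log of constraint violations.

    Each record stores the constraint ID, what the agent did, and the
    agent's stated reason, so later sessions (and auditors) can see
    not just *that* a rule was bent, but *why*.
    """

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" for cross-session persistence.
        self.db = sqlite3.connect(path)
        self.db.execute(
            """CREATE TABLE IF NOT EXISTS violations (
                   ts REAL, constraint_id TEXT, action TEXT, reason TEXT
               )"""
        )

    def record(self, constraint_id, action, reason):
        # Called at the moment the agent overrides a constraint.
        self.db.execute(
            "INSERT INTO violations VALUES (?, ?, ?, ?)",
            (time.time(), constraint_id, action, reason),
        )
        self.db.commit()

    def history(self, constraint_id):
        # All past violations of one constraint, oldest first: the
        # context an agent (or auditor) would load at session start.
        return self.db.execute(
            "SELECT ts, action, reason FROM violations "
            "WHERE constraint_id = ? ORDER BY ts",
            (constraint_id,),
        ).fetchall()

# Usage: record a violation in one "session", read it back in the next.
mem = ViolationMemory()
mem.record("no-external-email", "drafted email to vendor",
           "user explicitly asked for a quote")
past = mem.history("no-external-email")
```

Even something this crude would make "it bent this rule three times last week, always for the same reason" a queryable fact instead of something lost between sessions.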