That doesn't sound right. For example, there's plenty of software with the correct observable behavior which leaks credentials. So what needs to be captured goes beyond observable behavior.
Certainly you could write specification for a piece of software, and the software could meet the specification while also leaking credentials. Obviously, that would be a problem. But at some point, this starts to feel artificial and silly. The same software could reformat your hard disk, right?
At some point, we aren’t discussing whether or not AI is doing a bad job writing software. We’re discussing whether or not it’s actively malicious.
Leaking credentials is observable behavior.
Certainly you could write specification for a piece of software, and the software could meet the specification while also leaking credentials. Obviously, that would be a problem. But at some point, this starts to feel artificial and silly. The same software could reformat your hard disk, right?
At some point, we aren’t discussing whether or not AI is doing a bad job writing software. We’re discussing whether or not it’s actively malicious.