
nl • today at 6:40 AM • 1 reply

This is different to the cyber limitations though.

To be precise: it makes the "won't work on frontier machine learning" refusal behave the same as the "won't work on cyber security" refusal. Previously it would work on frontier machine learning problems but give sub-optimal answers without informing the user.

Replies

dannyw • today at 10:29 AM

Some anecdotal reports on social media suggest it wasn’t just giving suboptimal answers, but mucking around and sabotaging your codebase and training (like editing hyperparameters in project files without being asked to).

Of course, it’s impossible to know whether that was deliberate sabotage or model misbehaviour, which is exactly the problem.

That may be considered malware / a criminal act tbh.
