Hacker News

conquera_ai today at 5:59 PM

Feels like we’re repeating classic distributed systems lessons: assume failure, constrain blast radius, and never trust components that can’t reliably explain themselves.


Replies

ibrahimhossain today at 6:13 PM

Exactly. Assuming failure and constraining the blast radius feels like the only reliable path when the models themselves are black boxes. Patch-based alignment starts looking fragile pretty quickly.