The main problem with current AI reviewers isn't catching bugs; it's shutting up when there is no bug worth raising. Humans apply an intuitive filter: "this code is weird, but it works and won't break prod, so I'll let it slide." LLMs lack that filter, so they generate 20 comments about variable naming and 1 comment about a critical race condition. The developer gets review fatigue and ends up ignoring everything, including the one comment that mattered. Until AI learns to weigh importance, not just code context, it will remain an expensive linter.
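
To make the point concrete, here is a minimal sketch of what that missing filter could look like: a severity gate that only surfaces comments which are both important and confident. Everything here (the `ReviewComment` type, the `Severity` scale, the `worth_posting` threshold values) is hypothetical illustration, not any real tool's API.

```python
from dataclasses import dataclass
from enum import IntEnum

class Severity(IntEnum):
    NIT = 1        # style, naming, formatting
    MINOR = 2      # readability, small refactors
    MAJOR = 3      # likely bugs, performance problems
    CRITICAL = 4   # data loss, races, security

@dataclass
class ReviewComment:
    path: str
    line: int
    severity: Severity
    confidence: float  # model's self-reported confidence, 0..1
    text: str

def worth_posting(c: ReviewComment,
                  min_severity: Severity = Severity.MAJOR,
                  min_confidence: float = 0.7) -> bool:
    """Drop anything below the bar, mimicking the human
    'weird but harmless, let it slide' filter."""
    return c.severity >= min_severity and c.confidence >= min_confidence

comments = [
    ReviewComment("db.py", 42, Severity.NIT, 0.9, "rename tmp to result"),
    ReviewComment("db.py", 87, Severity.CRITICAL, 0.8,
                  "connection pool read/write race under load"),
]
for c in filter(worth_posting, comments):
    print(f"{c.path}:{c.line} [{c.severity.name}] {c.text}")
```

Run as-is, this posts only the race-condition comment and silently drops the naming nit. The hard part, of course, isn't the gate itself but getting the model to assign those severity and confidence values honestly; the sketch just shows where such a filter would sit in the pipeline.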