> The compiler may also flatten a loop.
http://c2.com/cgi/wiki?SufficientlySmartCompiler
In practice, C compilers are still notoriously bad at loop optimizations.
Polyhedral optimization offered some hope, but no compiler has managed to adopt it in production.
Maybe, but that is also irrelevant to the discussion: whether you write mat[b * A + a] by hand or write mat[b][a] and let the compiler frontend expand it makes no difference to the optimizer.
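
To make that concrete, here is a minimal sketch of the two forms. The dimensions A and B and the function names sum_flat and sum_2d are made up for illustration. For a true 2D array the language defines mat[b][a] as the element at offset b * A + a from the start of the array, so both loops should reach the optimizer as the same address arithmetic.

```c
#include <stddef.h>

/* Hypothetical dimensions for illustration: B rows of A columns. */
enum { A = 4, B = 3 };

/* Index arithmetic written out by hand on a flat buffer. */
int sum_flat(int *mat)
{
    int total = 0;
    for (size_t b = 0; b < B; b++)
        for (size_t a = 0; a < A; a++)
            total += mat[b * A + a];
    return total;
}

/* Same traversal with 2D indexing; the front end computes the
   element offset as b * A + a on our behalf. */
int sum_2d(int mat[B][A])
{
    int total = 0;
    for (size_t b = 0; b < B; b++)
        for (size_t a = 0; a < A; a++)
            total += mat[b][a];
    return total;
}
```

If you want to check this on your own toolchain, compile with something like gcc -O2 -S and compare the two function bodies. I would expect them to come out the same or nearly so, but I have not verified that for every compiler and flag combination.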