The first form is easier to send to 32 beefy cores or 1024 small CPUs or a Beowulf cluster or a GPU or people sitting in a room.
It's been 15 years since I last touched OpenMP, but the second form is trivially parallelizable as well. Besides, this parallelization can only ever properly work with arrays/vectors or, at the very worst, std::deque as it's usually implemented (a vector of fixed-length arrays), not with e.g. linked lists or red-black trees, so why even bother with generic spans and algorithms?
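A rough sketch of what I mean, assuming the two forms are roughly "algorithm call" versus "index loop" over a contiguous vector. The names `scale`, `scaled_algorithm` and `scaled_loop` are made up for illustration and don't come from the code under discussion:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical element-wise operation; any pure function would do.
inline double scale(double x) { return 2.0 * x; }

// Algorithm form. Since C++17, std::transform also has overloads that take
// an execution policy such as std::execution::par (from <execution>) as the
// first argument; this sketch uses the plain serial overload.
std::vector<double> scaled_algorithm(const std::vector<double>& in) {
    std::vector<double> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), scale);
    return out;
}

// Loop form. The pragma takes effect when the compiler has OpenMP enabled
// (e.g. -fopenmp on GCC/Clang); otherwise it is ignored and the loop runs
// serially. Each iteration writes only out[i], so iterations are independent.
std::vector<double> scaled_loop(const std::vector<double>& in) {
    std::vector<double> out(in.size());
    const long n = static_cast<long>(in.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        out[i] = scale(in[i]);
    }
    return out;
}
```

Either way the work is split by index range, which is why it wants random-access storage in the first place.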
For compilation?
Both of them have to be completely rewritten to make use of multiprocessing, so what exactly is the advantage?