I don't think it would help. It's not just a software issue that can be fixed in the kernel; the hardware fundamentally isn't part of the CPU's cache coherency system.
This is correct. Look at IBM's CAPI (Coherent Accelerator Processor Interface) for an example of the hardware needed.