What exactly is this?
Is this a set of fast, GPU-like instructions that anyone can use on any operating system to run any open-source LLM on the CPU with all of your RAM, rather than on a discrete GPU with its own limited amount of VRAM?
Or is this a proprietary thing that only works on Windows for some specific use cases and is irrelevant for Linux users?
It's not a set of CPU instructions; it's a distinct piece of hardware based on AMD's XDNA architecture which, as it happens, can tap into your RAM pool much like a CPU does. There are XDNA drivers (`amdxdna`) for Linux.
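
If you want to check whether that driver is actually loaded on your Linux box, here is a minimal sketch. It assumes the driver is built as a loadable module, so that it shows up in `/proc/modules`; a driver compiled directly into the kernel would not appear there. `module_loaded` is just a hypothetical helper name, not part of any AMD tooling.

```python
from pathlib import Path


def module_loaded(proc_modules_text: str, name: str = "amdxdna") -> bool:
    """Return True if `name` is the first field of any line in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # Each /proc/modules line starts with the module name.
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    path = Path("/proc/modules")  # only exists on Linux
    text = path.read_text() if path.exists() else ""
    print("amdxdna loaded:", module_loaded(text))
```

A `False` result only tells you the module is not currently loaded; it says nothing about whether your hardware has an XDNA unit in the first place.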