kimsey0 • today at 10:10 AM

We do. The Cerebras line of Wafer Scale Engines is exactly that: an entire wafer of cores running in parallel, with fast memory next to each one. It's intended for very high-throughput LLM inference. https://www.cerebras.ai/chip