
zozbot234 · yesterday at 7:21 PM

If you have multiple chats going at the same time in your LLM web interface, that's already a parallelizable workload with respect to batched inference. And that broadly describes the more sophisticated users of LLMs (those using them for more than casual chit-chat), especially on the largest "pro" models. Parallelism also applies naturally to agentic workloads, where a single task can fan out into many concurrent model calls.
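
To make the batching point concrete: a minimal sketch, assuming Hugging Face transformers, of several concurrent "chats" served as one batched generate call. The model name ("gpt2") and prompts are just small stand-ins for illustration, not any particular serving stack:

    # A minimal sketch, assuming Hugging Face transformers is installed;
    # "gpt2" is just a small stand-in model for illustration.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token   # gpt2 defines no pad token
    tokenizer.padding_side = "left"             # left-pad for decoder-only generation
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Three "chats" arriving at once: batching them means one forward pass
    # per decoding step for all three, instead of three separate runs.
    prompts = [
        "Summarize Hamlet in one sentence:",
        "Write a haiku about winter:",
        "Explain batched inference briefly:",
    ]
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=40,
            pad_token_id=tokenizer.eos_token_id,
        )
    for text in tokenizer.batch_decode(outputs, skip_special_tokens=True):
        print(text, "\n---")

Production servers such as vLLM push this further with continuous batching, folding requests that arrive mid-stream into the batch that is already running.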