Can you ELI5 why this is so slow for local inference but so fast with hosted models?

habosa • today at 3:57 AM