arjie • yesterday at 6:20 PM • 2 replies

Wait, this is incredible. I have a spare 5090 lying around and run a claw-like on my M4 Mini. Just mounting it in some sort of 3D-printed frame for stability and plugging it into the TB port might get me a pretty viable tool for local inference. Would need something neat to make sure the power etc. is well fed.

The problem is that `max-num-seqs` and `max-model-len` fight each other, and unless you're in pure single-client mode you'll need multiple slots, so to speak.
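To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch. It assumes the two names refer to vLLM-style serving options and that both draw on one fixed KV-cache budget. The model shape (32 layers, 8 KV heads, head dimension 128, fp16 cache) and the 16 GiB budget are hypothetical, picked only to make the arithmetic round. It also takes the worst case where every slot is filled to the full context length, which a real scheduler may not require.

```python
def kv_bytes_per_token(num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    """Bytes of KV cache consumed by one token of one sequence."""
    # Factor 2: a key and a value vector are stored per layer, per KV head.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


def max_full_length_seqs(kv_budget_bytes, max_model_len, bytes_per_token):
    """How many sequences fit if each one uses the full context length."""
    total_tokens = kv_budget_bytes // bytes_per_token
    return total_tokens // max_model_len


# Hypothetical model shape and KV-cache budget (assumptions, not measurements).
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)
budget = 16 * 1024**3  # 16 GiB left over for KV cache after weights

for ctx in (8192, 32768, 131072):
    slots = max_full_length_seqs(budget, ctx, per_token)
    print(f"context {ctx:>6}: {slots} full-length slots")
```

Under these assumptions, every 4x increase in context length cuts the number of guaranteed full-length slots by 4x, which is the fight described above: a long context or many clients, but not both on one card.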

Replies

pat_space • yesterday at 8:18 PM

If you get too busy to take advantage, I'll take that spare 5090 off your hands, free of charge :)

originalvichy • yesterday at 10:18 PM

While you can just print something, look into eGPU enclosures. Modern cards are xboxhueg, but maybe someone has one lying around, and it might help with sound and airflow.
