I opted to buy a normal 32GB laptop for this very reason. I know how loud and hot the GPUs in my desktop run when running even smallish models like Qwen 27B or Gemma 4 31B (which is a better model for most than Qwen 3.6, despite the benchmarks). I also have a Strix Halo which doesn't get loud, because it has a single huge fan, but it does get hot. So, there's no way a laptop could work as hard as models make them work, and not be unbearable. Tiny fans trying to remove all that heat? They gotta be screaming. No reason to spend all that money on a laptop that I couldn't realistically make use of. I do run a lot of VMs on my desktop, but I can get to those on a VPN.
It's a nice idea to run a model on a laptop so you can work anywhere...but, that's a job for models in the cloud. Not much data has to traverse the network, so it's not a big deal. Or one could also setup a VPN so you can reach a self-hosted model on a big box at home for things that require data privacy.
All that said, there are models that work great on very small devices for some tasks and won't work it to death. Gemma 4 12B QAT 4-bit runs on a 16GB device, maybe even smaller, including a tablet. It's the best self-hostable vision model I've tested for my purposes (categorization, identification, labeling, type stuff), beating much larger models. It's also a decent conversationalist with good prose but it doesn't know much of anything (not a lot of the world fits in 7GB), so it needs search if you want to use it for research. It's a pretty good tool user. I definitely wouldn't want to use it for code, though, beyond very simple stuff.
You can limit TDP on Strix Halo so it runs between 32 and 45W which seems to be the sweet spot for heat vs speed.
Gemma is better than Qwen at everything except coding, in all my evaluations. Which is a shame because that is what I use them for!