Hacker News

bjconlan, yesterday at 10:45 PM

This is great! I feel the same way about the DeepSeek V4 architecture for commodity hardware.

I've also enjoyed playing with https://huggingface.co/HuggingFaceTB/nanowhale-100m-base (though it's still early days for me in understanding this space).


Replies

kamranjon, yesterday at 11:42 PM

Very cool! I had no idea that HF was doing this - I really love their small model experiments.