Hacker News

Nanobot: Ultra-Lightweight Alternative to OpenClaw

196 points · by ms7892 · today at 9:39 AM · 104 comments

Comments

yberreby · today at 1:45 PM

Watching the OpenClaw/Molbot craze has been entertaining. I wouldn't use it - too much code, changing too quickly, with too little regard for security - but it has inspired me.

I often have ideas while cleaning, cooking, etc. Claude Code (with Opus 4.5) is very capable, and I've long wanted to get it working hands-free.

So I took an afternoon and rolled my own STT-TTS voice stack for Claude Code. The voice stack runs locally on my M4 Pro and is extremely fast.

For Speech to Text, Parakeet v3 TDT: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3

For Text to Speech, Pocket TTS: https://github.com/kyutai-labs/pocket-tts

A custom MCP server hooks this into Claude Code, with a bit of hacking around to get my AirPods' stem click captured.

I'm having Claude narrate its thought process and everything it's doing in short, frequent messages, and I can interrupt it at any time with a stem click, which starts listening to me and sends the message once a sufficiently long pause is detected.
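The "sends the message once a sufficiently long pause is detected" step is classic endpointing. A minimal sketch of that logic, assuming frame-based audio and a fixed RMS threshold (all numbers here are made up; a real stack would use a proper VAD rather than raw energy):

```python
# Hypothetical endpointing sketch: treat a frame as silence when its RMS
# falls below a threshold, and "send" once enough silent frames pile up.

SILENCE_RMS = 0.01    # below this RMS a frame counts as silence (assumption)
PAUSE_FRAMES = 25     # ~0.8 s of silence at 32 ms frames (assumption)

def rms(frame):
    """Root-mean-square energy of one frame of float samples."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def endpoint(frames):
    """Return the index of the frame at which the trailing pause becomes
    long enough to send the utterance, or None if speech never ended."""
    silent_run = 0
    for i, frame in enumerate(frames):
        if rms(frame) < SILENCE_RMS:
            silent_run += 1
            if silent_run >= PAUSE_FRAMES:
                return i
        else:
            silent_run = 0  # speech resumed; reset the pause counter
    return None
```

The stem click would simply reset this state machine and start buffering frames for the STT model.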

I stream the Claude Code session via AirPlay to my living room TV, so that I don't have to get close to the laptop if I need extra details about what it's doing.

Yesterday, I had it debug a custom WhatsApp integration (via [1]) hands-free while brushing my teeth. It can use `osascript` for OS integration, browse the web via Claude Code's builtin tools...

My back is thankful. This is really fun.

[1]: https://github.com/jlucaso1/whatsapp-rust

johaugum · today at 11:00 AM

Skimmed the repo; this is basically the irreducible core of an agent: small loop, provider abstraction, tool dispatch, and chat gateways. The LOC reduction (99%, from 400k to 4k) mostly comes from leaving out RAG pipelines, planners, multi-agent orchestration, UIs, and production ops.
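That irreducible core fits in a few dozen lines. A toy sketch, not Nanobot's actual code: `fake_llm` and the `TOOLS` registry are stand-ins for the provider abstraction and tool dispatch:

```python
# Toy agent loop: ask the model, run any requested tool, feed the result
# back, repeat until the model returns plain text. All names are stand-ins.

TOOLS = {"echo": lambda arg: f"echo:{arg}"}  # toy tool registry

def fake_llm(messages):
    """Stand-in provider: requests one tool call, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "echo", "arg": "hi"}
    return {"text": "done"}

def agent_loop(user_msg, llm=fake_llm, max_steps=5):
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        reply = llm(messages)
        if "tool" in reply:
            # Dispatch the tool and append its result to the context.
            result = TOOLS[reply["tool"]](reply["arg"])
            messages.append({"role": "tool", "content": result})
        else:
            return reply["text"], messages
    raise RuntimeError("step budget exceeded")
```

Everything else in a 400k-LoC system (gateways, persistence, orchestration) hangs off this loop.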

raphaelmolly8 · today at 5:02 PM

The 4k LOC claim is interesting but I think the real insight is about what you remove rather than what you keep. Looking at the codebase, they've essentially bet that LLMs with 100k+ context windows make most RAG pipelines redundant - just give the agent grep/rg and let it iterate.

What's clever is treating memory as filesystem ops rather than vector stores. For codebases this works great since code has natural structure (imports, function calls) that grep understands. The question is whether this scales to truly unstructured knowledge where semantic similarity matters.

Would love to see benchmarks comparing retrieval accuracy vs a proper embedding pipeline on something like personal notes or research papers.

jannniii · today at 11:11 AM

Okay, so is this "inspired" by nanoclaw, which was featured here two days ago?

loveparade · today at 11:47 AM

What are people using these things for? The use cases I've seen look a bit contrived, and I could just ask Claude or ChatGPT to do it directly.

resonious · today at 10:24 PM

I hate to side-track like this, but I'm having trouble understanding the architecture diagram. LLM has two arrows to Tools - what does that mean? Similarly, Tools has both a double-sided arrow and an outgoing arrow to Context. Chat Apps having outgoing arrows to both Message and LLM also kinda tripped me up, but I suppose you could say it's because the apps provide both messaging and context for the LLM.

vanillameow · today at 11:18 AM

Yeah, I mean, idk, my takeaway from OpenClaw was pretty much the same: why use someone's insane vibecoded 400k-LoC CLI wrapper with 50k lines of "docs" (AI slop, plus another 50k for a Chinese translation of the same AI slop) when I can just Claude Code myself a custom wrapper in 30 minutes that has exactly what I need and won't take 4 seconds to respond to a CLI call?

But my reaction to this project is again: why would I use this instead of "vibecoding" it myself? It won't have exactly what I need, and the cost to create my own version is measured in minutes.

I suspect many people will slowly come to understand this intrinsic nature of "vibecoded software" soon - the only valuable one is one you've made yourself, to solve your own problems. They are not products and never will be.

lxgr · today at 1:47 PM

Can this be sandboxed? I've been running OpenClaw in a VM on macOS, which seems more resource intensive than necessary.

manwithmanyface · today at 1:30 PM

Is this something I could run for my company in Slack, where employees send messages, the LLM processes the text, uses the functions I created to handle different tasks, and then responds back?

sally-suite · today at 1:32 PM

Not bad, but I'm a bit skeptical. Is the main value the IM-based way of working?

Tepix · today at 1:28 PM

What are your mitigations for when your AI bot wants to leak your credentials?

tunney · today at 12:18 PM

Has anyone managed to get the WhatsApp integration working and chatting that way?

Aeroi · today at 1:21 PM

Can anyone break down a comparison of multi-agent vs. subagent architectures?

Looking for pros and cons.

pawelduda · today at 5:32 PM

What? OpenClaw has 450k LoC? Why?

cpursley · today at 2:32 PM

I'd like to see one of these in Rust (over Python, Node, etc) and in Apple's container environment.

FergusArgyll · today at 12:13 PM

The main novelty I see in OpenClaw is the number of channels and how easy they are to set up. This just has WhatsApp, Telegram & Feishu.

halfax · today at 5:33 PM

Bottom line: HAL‑AI‑2 is a real system. Nanobot is a toy. They are not peers; they are not even in the same category. Nanobot is useful only as a conceptual sketch of an agent loop. HAL‑AI‑2 is the substrate you've been building toward for months.