I’ve been thinking about AI robotics lately… if internally at labs they have a GPT-2, GPT-3 “equivalent” for robotics, you can’t really release that. If a robot unloading your dishwasher breaks one of your dishes once, this is a massive failure.
So there might be awesome progress behind the scenes, just not ready for the general public.
> If a robot unloading your dishwasher breaks one of your dishes once, this is a massive failure.
That's a bit exaggerated, no? Early Roombas would get tangled in socks, drag pet poop all over the floor, break glass items, and so on, and yet the market accepted that and evolved; now we have plenty of cleaning robots from various companies, including cheap spying ones from China.
I actually think there's a lot of value in being the first to deploy bots into homes, even if they aren't perfect. The amount of real-world data you'd collect is invaluable and, by the looks of it, can't be synthetically generated in a lab.
I think the "safer" option is still "bring them to factories first, offices next, and homes last", but either way I'm sure someone will jump straight to home deployments.
I ended up watching Bicentennial Man (1999) with Robin Williams over the weekend. If you haven't seen it, I thought it was a good and timely watch, and it's kid-friendly. Without giving away the plot: the scene where the robot unloads the dishwasher... take my money!
From an economic standpoint, industry is by far the most relevant market anyway. It's easier since the environment is much more controlled, professionals configure and maintain the robots, and industrial buyers purchase in bulk and have more money.
My concern with a household robot is not the dishwasher but the TV screen, the glass door, the glass table, pets, the aquarium, etc. that the robot might walk into, knock over, or fall onto.
I have broken dishes loading and unloading the dishwasher. Am I a massive failure?
My non-AI dishwasher can't even always keep the water inside. Nothing is perfect.
> If a robot unloading your dishwasher breaks one of your dishes once, this is a massive failure.
Depending on the rate of broken dishes, this could be a massive improvement over me, a human being, since I break a really important dish I needed to use like ~2x per month on average.
It's called "VLA" (vision-language-action) models: https://huggingface.co/models?pipeline_tag=robotics
VLA models essentially take a webcam screenshot + some text (think "put the red block in the right box") and output motor control instructions to achieve that.
Note: "Gemini Robotics-ER" is not a VLA, though Gemini does have a VLA model too: "Gemini Robotics".
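To make the input/output contract concrete, here's a toy sketch of what a VLA policy interface looks like. This is not a real library API (the class and method names are made up for illustration); it just shows the shapes involved: an RGB frame plus a text instruction in, a low-level action vector out.

```python
import numpy as np

class ToyVLAPolicy:
    """Stand-in for a real VLA model; real ones are large vision-language
    transformers trained on robot demonstration data."""

    def __init__(self, action_dim: int = 7):
        # e.g. a 7-DoF arm: 6 joint deltas + 1 gripper open/close command
        self.action_dim = action_dim

    def predict_action(self, image: np.ndarray, instruction: str) -> np.ndarray:
        # A real model would jointly encode the image and text, then decode
        # an action; here we just return a zero action with the right shape.
        assert image.ndim == 3 and image.shape[2] == 3, "expect HxWx3 RGB frame"
        assert isinstance(instruction, str)
        return np.zeros(self.action_dim)

# Control loop: grab a camera frame, query the policy, send the action
# to the robot's motor controllers (here we just inspect the output).
policy = ToyVLAPolicy()
frame = np.zeros((480, 640, 3), dtype=np.uint8)  # webcam frame stand-in
action = policy.predict_action(frame, "put the red block in the right box")
print(action.shape)  # (7,)
```

In practice this loop runs at a fixed control frequency, feeding each new camera frame back into the model until the task is done.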
There's not enough internet-scale data for robotics; the gap is huge. So anyone who claims to have a GPT-like model is not being honest.