What is the "ELI5" summary of the practical limits & scaling laws that govern robotics?
The current "futurist" vision is one of humanoid robots taking over many or most jobs done by humans today, but, as someone who routinely hires human welders & assemblers, the dexterity required for most ad-hoc tasks seems many, many decades (if not more?) away from what I see robots do today. Yes, even the fancy Chinese jumping ones.
This has led me to think one of two things:
1. The robotics revolution will not come. It's predicated on the idea that advances in robotics will follow a curve of the same shape as advances in compute/ai, which will not happen. OR...
2. There has been some paradigm-shift or some breakthrough that has put robotics improvement on a new curve.
To an outsider, what I see in robots is not categorically different from, like, the Sony AIBO dog in 1999. It's significantly better, of course, but is it really that different? (Whereas what we can do in compute-land today is categorically different because of the transformer model breakthrough.)
So:
1. Have there been any breakthroughs that would lead us to believe that a robot will be able to like, look under a table to adjust a screw?
2. What are the scaling laws & practical limits to present-day robotic dexterity? Is it materials? Energy density? What?
3. What is the real rate of improvement along these key dimensions? Are robots improving linearly? Geometrically? Exponentially?
4. Or should I keep discounting robotics until we get our first robots that are made of meat? That, I'd believe, would result in exponential change!
I'm fairly ignorant on this, but robots that are teleoperated seem completely capable of doing basically any household task and using tools like screwdrivers. It may be slow, sure, but autonomy and speed seem like a solvable software problem.
There's also an endless supply of welding and assembly robots, and there has been for a long time. Sure, they're huge and weigh 3 tons or whatever, but it's not like we're building humanoid robots to do work like that anyway.
Consider two welding systems: a hungover human on a three-legged ladder with a scratched-up welding helmet, doing an overhead TIG weld while holding the filler rod a foot away from the weld pool; and a 6-DOF Kuka bot doing a weld in the same position on a completely rigid work piece, clamped down to a precision-machined fixture table, which is clamped down to a precision-machined floor that the robot is also mounted to.
The human system weighs 250 lb and can be placed anywhere. Let's ask what it takes to walk the factory robot in that direction. First, let the work piece move, say on a conveyor belt. The old robotics way of thinking would be to introduce this variable into the programming of the bot/station: add simple sensors on either the work piece or the conveyor itself to tell the control loop where the part is with as little error as possible, and preserve accuracy by maintaining as much precision as possible through rigidity (which costs mass and space). Now the whole system is functionally 7 DOF, and you inherit the error and failure modes of that 7th DOF (the conveyor system) and accumulate some error.

Now imagine that instead of a conveyor the part sits on a rolling table with random Z height, and so does the robot arm, and you can see this falls apart: you can't win this battle with deterministic programming, machining precision, and rigidity. Obviously, if you extended this system to a humanoid robot on a three-legged ladder, which would put 30+ DOF between the weld and the ground, it couldn't possibly work.
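A toy sketch of why stacking DOF hurts (the 0.1 mm per-joint figure and both accumulation models are invented for illustration, not real Kuka specs): a deterministic controller has to budget for the worst case, where per-joint errors add linearly, while even the optimistic independent-errors case still grows with the square root of the joint count.

```python
import math

# Hypothetical per-joint positional error, in mm, reflected at the tool tip.
PER_JOINT_ERROR_MM = 0.1

def worst_case_error(per_joint_mm: float, n_joints: int) -> float:
    """Deterministic worst case: every joint's error adds up linearly."""
    return per_joint_mm * n_joints

def independent_error(per_joint_mm: float, n_joints: int) -> float:
    """Optimistic case: independent random errors combine as root-sum-square."""
    return per_joint_mm * math.sqrt(n_joints)

# Rigid 6-DOF arm, arm + conveyor (7), humanoid on a ladder (30+)
for n in (6, 7, 30):
    print(n, worst_case_error(PER_JOINT_ERROR_MM, n),
          round(independent_error(PER_JOINT_ERROR_MM, n), 3))
```

At a made-up 0.1 mm per joint, the worst case for a 30-DOF chain is 3 mm, which no amount of fixture rigidity buys back; the way out is closing the loop with sensing instead of fighting for open-loop precision.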
But back to the hungover human: why does this system work so well? The human has very good eyes and a very good internal IMU. They are looking at the end of the filler rod and the weld pool, and even though the information coming through the scratched welding helmet isn't that good, they can compensate for all that error and run an internal function that holds the torch and filler rod in the optimum position for a good TIG weld, ignoring or automatically adjusting for tons of other variables. Now, to address your original question, in our system:

1. Are current cameras good enough to get an equivalent amount of information about the weld to what the hungover welder has? Yes; in fact they can get more information than a human can.

2. Are IMUs as good as a hungover human's? Hard to really know, but it seems so, though if you need many IMUs attached to different limbs of a robot, it's probably not as good as a human's yet.

3. Is the power density of actuators and power storage good enough to approximate this 250 lb human-on-a-ladder system, with some combination of DOF that reaches a sufficient range of motion to emulate the human's hands (whether the robot looks like a human or not)? Yeah, and in this case the welding machine is plugged in for the human anyway, so that system is already attached to mains power.
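The welder's trick, in control terms, is closed-loop visual servoing: keep measuring the torch-to-pool error and correct a fraction of it each cycle, so the system converges even when each individual measurement is noisy (the scratched helmet). A minimal 1-D sketch, with the gain, noise level, and step count all invented for illustration:

```python
import random

def visual_servo(target: float, start: float, gain: float = 0.5,
                 noise_std: float = 0.3, steps: int = 50, seed: int = 0) -> float:
    """Minimal 1-D proportional visual servo.

    Each cycle we 'see' the torch-to-target error through a noisy
    camera and correct a fraction (gain) of what we measured.
    All parameters are illustrative, not tuned for any real robot.
    """
    rng = random.Random(seed)
    pos = start
    for _ in range(steps):
        measured = (target - pos) + rng.gauss(0.0, noise_std)  # scratched-helmet noise
        pos += gain * measured
    return pos

final = visual_servo(target=10.0, start=0.0)
# final lands near 10.0 despite a noisy measurement every cycle
```

The point is that accuracy comes from the sense-and-correct loop, not from open-loop precision, which is why the stack of floppy DOF between the welder and the ground doesn't matter much.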
So given all this, it seems like the limiter is just software, which is the bull case for the prospective robotics revolution.
On your earlier points -- both 1) and 2) are true. True human-level dexterity is very far off (a few decades, surely); it would require further advancements in hardware, learning approaches, etc. Recent approaches provide a glimmer of hope, and maybe we can have some intermediate robots -- to be honest, even Waymos and Teslas are robots, and we will see many more robots like that, with vision, working alongside humans, etc. in narrow settings. The Chinese dancing robots are examples of this.