You raise a great point. And the Amazon picking staff are onshore in wealthy countries. I guess the minimum wage paid by Amazon is around 15 USD per hour.
I wonder: Is the task of automating this work primarily difficult in vision or in dexterity (motion)? Or maybe they are equally difficult for different reasons.
If we're talking about picking objects at random from one bin and putting them in another, I don't need my eyes for that. Proprioception (shape and location) and touch (texture) are enough.
Probably both vision and dexterity. The first mistake we make as roboticists/engineers might be to treat the two as separate problems, or to assume a solution exists in which they live separate lives.
https://rodneybrooks.com/why-todays-humanoids-wont-learn-dex...