Nvidia researchers have built a small neural network that controls humanoid robots more effectively than specialized systems, even though it uses far fewer resources. The system works with multiple input methods, from VR headsets to motion capture.
1.5 million parameters is indeed small compared to an LLM. But why would that be more efficient than old-school inverse kinematics? That's either simple trigonometry or some moderately complicated matrix maths… something you'd be able to do at an undergraduate level, and not that many CPU cycles.
Capturing the human movements with just a regular RGB camera and having the robot mimic them is impressive, though.
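For reference, the "simple trigonometry" case really is closed-form: here's a sketch of analytic inverse kinematics for a 2-link planar arm (illustrative only, with made-up link lengths; a humanoid's full-body IK is far higher-dimensional and redundant, which is part of why learned controllers get interesting):

```python
import math

def two_link_ik(x, y, l1=1.0, l2=1.0):
    """Analytic IK for a 2-link planar arm: find joint angles
    (theta1, theta2) so the end effector reaches (x, y).
    Returns None if the target is out of reach."""
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle directly
    cos_t2 = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_t2 <= 1.0:
        return None  # target outside the reachable workspace
    theta2 = math.acos(cos_t2)  # elbow-down solution
    # Shoulder angle: target direction minus the offset the elbow induces
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2
```

A few trig calls per solve, which is the commenter's point about CPU cycles. The catch is that a humanoid has dozens of coupled joints, contact constraints, and balance to maintain, so the analytic approach stops scaling cleanly.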