Michael Truell and Joshua Gruenstein
Horace Mann School, USA
Scientific Tracks Abstracts: Adv Robot Autom
We present a deep reinforcement learning system for mobile robots that converges on common robotic tasks using one quarter of the feedback required by pre-existing solutions. The system achieves this gain in efficiency through context-aware action selection and aggressive online hyperparameter optimization, while still maintaining performance on embedded hardware. The core algorithm, deep wire-fitted Q-learning, is supplemented with active measurement of robot uncertainty, defined as the derivative of the error between expected and received reward. This uncertainty value directly scales both the temperature of the Boltzmann probabilistic exploration policy and the learning rate of stochastic gradient descent. Furthermore, to provide generality across robots and tasks, the neural network topology is efficiently evolved throughout training and evaluation. Finally, experience replay is extended to changing environments and integrated with our uncertainty value. Human operators successfully trained the system on multiple robots in a matter of minutes to perform tasks such as driving to a point with a differential drive system, following a line using holonomic Swedish wheels, and playing ping pong with a robot arm. All tasks were learned without any manual hyperparameter adjustment, both in simulation and on hardware.
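To make the role of the uncertainty value concrete, the following is a minimal sketch, not the authors' implementation. It assumes a finite-difference estimate of the error derivative, takes its absolute value so that it can serve as a non-negative scale factor, and uses a small discrete action set in place of the continuous actions that wire-fitted Q-learning supports. All function names, parameters, and default constants are hypothetical.

```python
import math
import random


def uncertainty(prev_error, curr_error, dt=1.0):
    """Finite-difference estimate of the derivative of the reward-prediction
    error. Taking the absolute value is an assumption of this sketch."""
    return abs(curr_error - prev_error) / dt


def boltzmann_probs(q_values, temperature):
    """Boltzmann (softmax) action probabilities at the given temperature."""
    t = max(temperature, 1e-6)  # floor avoids division by zero
    m = max(q_values)           # subtract the max for numerical stability
    weights = [math.exp((q - m) / t) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, u, base_temperature=0.5, rng=random):
    """Sample an action index; higher uncertainty u means a higher
    temperature and therefore more exploration."""
    probs = boltzmann_probs(q_values, base_temperature * u)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def scaled_learning_rate(u, base_rate=0.1, max_rate=1.0):
    """Learning rate for gradient descent, scaled by uncertainty and capped
    at max_rate (the cap is an assumption of this sketch)."""
    return min(base_rate * u, max_rate)
```

Under these assumptions, an uncertainty near zero makes action selection nearly greedy and shrinks the learning rate, while a sudden change in prediction error, for example after the environment changes, raises both exploration and the step size.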
Michael Truell is presently at the Horace Mann School in the Bronx, NY.
Email: mntruell@gmail.com