Elmar Rueckert is the research group leader of the neurorobotics division at the Intelligent Autonomous Systems (IAS) lab. He has strong expertise in recurrent neural networks, learning movement primitives, probabilistic planning, and motor control in tasks with contacts. He is the team leader of the European project GOAL-Robots and was responsible for learning approaches in the highly successful project CoDyCo. Before joining IAS, he was with the Institute for Theoretical Computer Science at Graz University of Technology, where he received his Ph.D. under the supervision of Wolfgang Maass.
Experience Replay in Model-Based Reinforcement Learning for Open-Ended Learning
Learning control policies in robotic tasks requires a large number of interactions due to small learning rates, bounds on the updates, or unknown constraints. In contrast, humans can infer protective and safe solutions after a single failure or unexpected observation. In this talk, a neural Bayesian optimization algorithm is presented that replicates this cognitive inference and memorization process to avoid failures in motor control tasks. A neural model implements the sampling of the acquisition function, which enables rapid learning with large learning rates, while a mental replay phase ensures that policy regions that led to failures are inhibited during the sampling process. The features of the neural Bayesian optimization method are evaluated in a humanoid postural balancing task and in modeling human adaptation in postural control tasks. The presented cognitive inference and memorization approach is an efficient reinforcement learning algorithm that can also be applied to other deep networks, such as LSTM networks or convolutional networks.
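The core loop described in the abstract — sample candidate policies via an acquisition function, then inhibit regions around remembered failures during later sampling — can be illustrated with a minimal sketch. This is not the authors' neural model: the acquisition function here is a crude kernel-weighted surrogate with an exploration bonus standing in for the neural sampling, and the toy objective, failure region, and all parameter names (`reward`, `is_failure`, `inhibition`, bandwidths) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(theta):
    """Toy objective with a single peak at theta = 0.7 (assumed for illustration)."""
    return float(np.exp(-20.0 * (theta - 0.7) ** 2))

def is_failure(theta):
    """Parameters below 0.2 count as failures (e.g. the robot falls over)."""
    return theta < 0.2

def inhibition(theta, failures, width=0.1):
    """Penalty that suppresses sampling near remembered failures (mental replay)."""
    return sum(np.exp(-((theta - f) / width) ** 2) for f in failures)

def surrogate(theta, observations, bandwidth=0.15):
    """Kernel-weighted reward estimate plus an exploration bonus where data
    is sparse — a crude stand-in for the neural acquisition model."""
    if not observations:
        return 1.0  # optimistic prior before any data
    thetas, rewards = zip(*observations)
    w = np.exp(-((theta - np.array(thetas)) / bandwidth) ** 2)
    mean = float(w @ np.array(rewards) / (w.sum() + 1e-9))
    bonus = float(np.exp(-w.sum()))  # high where few nearby samples exist
    return mean + bonus

failures = []          # replay memory of failed policy parameters
observations = []      # (theta, reward) pairs from successful rollouts
best_theta, best_r = None, -np.inf

for episode in range(50):
    # Sample candidate policies and pick the acquisition maximizer,
    # with failure regions inhibited by the replay memory.
    candidates = rng.uniform(0.0, 1.0, size=20)
    scores = [surrogate(c, observations) - 2.0 * inhibition(c, failures)
              for c in candidates]
    theta = candidates[int(np.argmax(scores))]

    if is_failure(theta):
        # Mental replay: memorize the failure so this region stays
        # inhibited in all subsequent sampling steps.
        failures.append(theta)
        continue

    r = reward(theta)
    observations.append((theta, r))
    if r > best_r:
        best_theta, best_r = theta, r
```

After a handful of early failures populate the replay memory, the inhibition term keeps later samples away from the unsafe region while the exploration bonus drives the search toward the reward peak.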