Publications Freek Stulp

Variable Impedance Control - A Reinforcement Learning Approach
	Jonas Buchli, Evangelos Theodorou, Freek Stulp, and Stefan Schaal. Variable Impedance Control - A Reinforcement Learning Approach. In Robotics: Science and Systems Conference (RSS), 2010.
	Abstract
	One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high-DOF robotic tasks. In this contribution, we accomplish such gain scheduling with a reinforcement learning algorithm, PI$^2$ (Policy Improvement with Path Integrals). PI$^2$ is a model-free, sampling-based learning method derived from first principles of optimal control. The PI$^2$ algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on cost function design to specify the task. From the viewpoint of robotics, a particularly useful property of PI$^2$ is that it can scale to problems of many DOFs, so that RL on real robotic systems becomes feasible. We sketch the PI$^2$ algorithm and its theoretical properties, and how it is applied to gain scheduling. We evaluate our approach by presenting results on two different simulated robotic systems, a 3-DOF Phantom Premium Robot and a 6-DOF Kuka Lightweight Robot. We investigate tasks where the optimal strategy requires both tuning of the impedance of the end-effector and tuning of a reference trajectory. The results show that we can use path-integral-based RL not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.
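	Illustration (not part of the paper)
	The abstract describes PI$^2$ as a model-free, sampling-based method whose only tuned parameter is the exploration noise. The sketch below is a heavily simplified, episode-level variant of cost-weighted averaging in that spirit, not the authors' algorithm: it perturbs a parameter vector with Gaussian noise, evaluates a cost for each sample, and averages the perturbations with weights that favor low cost. The names `pi2_style_update` and `toy_cost`, the constant `h`, and the quadratic toy cost are all assumptions made for illustration; the paper's time-indexed updates, gain schedules, and robot models are not reproduced here.

```python
import numpy as np


def pi2_style_update(theta, cost_fn, sigma=0.1, n_samples=20, h=10.0, rng=None):
    """One cost-weighted averaging step (simplified, hypothetical sketch).

    sigma is the exploration noise; h is an assumed fixed constant that
    controls how strongly low-cost samples are favored.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample parameter perturbations (the exploration noise).
    eps = rng.normal(0.0, sigma, size=(n_samples, theta.size))
    # Evaluate the cost of each perturbed parameter vector.
    costs = np.array([cost_fn(theta + e) for e in eps])
    span = costs.max() - costs.min()
    if span < 1e-12:
        # All samples cost the same: fall back to uniform weights.
        weights = np.full(n_samples, 1.0 / n_samples)
    else:
        # Normalize costs to [0, 1], then map low cost to high weight.
        weights = np.exp(-h * (costs - costs.min()) / span)
        weights /= weights.sum()
    # Move the parameters by the weighted average of the perturbations.
    return theta + weights @ eps


def toy_cost(theta):
    """Assumed toy cost: squared distance to the all-ones vector."""
    return float(np.sum((theta - 1.0) ** 2))


rng = np.random.default_rng(0)
theta = np.zeros(3)
initial_cost = toy_cost(theta)
for _ in range(100):
    theta = pi2_style_update(theta, toy_cost, rng=rng)
final_cost = toy_cost(theta)
```

	In this toy setting the only quantity a user would adjust is `sigma`, which mirrors the abstract's remark that the designer can concentrate on specifying the cost function.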
	BibTeX

@InProceedings{buchli10variable,
  title                    = {Variable Impedance Control - A Reinforcement Learning Approach},
  author                   = {Jonas Buchli and Evangelos Theodorou and Freek Stulp and Stefan Schaal},
  booktitle                = {Robotics: Science and Systems Conference (RSS)},
  year                     = {2010},
  abstract                 = {One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high-DOF robotic tasks. In this contribution, we accomplish such gain scheduling with a reinforcement learning algorithm, PI$^2$ ({\bf P}olicy {\bf I}mprovement with {\bf P}ath {\bf I}ntegrals). PI$^2$ is a model-free, sampling-based learning method derived from first principles of optimal control. The PI$^2$ algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on cost function design to specify the task. From the viewpoint of robotics, a particularly useful property of PI$^2$ is that it can scale to problems of many DOFs, so that RL on real robotic systems becomes feasible. We sketch the PI$^2$ algorithm and its theoretical properties, and how it is applied to gain scheduling. We evaluate our approach by presenting results on two different simulated robotic systems, a 3-DOF Phantom Premium Robot and a 6-DOF Kuka Lightweight Robot. We investigate tasks where the optimal strategy requires both tuning of the impedance of the end-effector and tuning of a reference trajectory. The results show that we can use path-integral-based RL not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.},
  bib2html_pubtype         = {Refereed Conference Paper},
  bib2html_rescat          = {Reinforcement Learning of Variable Impedance Control}
}

