
Jonas Buchli, Freek Stulp, Evangelos Theodorou, and Stefan Schaal. Learning Variable Impedance Control. International Journal
of Robotics Research, 30(7):820–833, 2011.






One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the
impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this
principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments.
It is, however, not trivial to derive variable impedance controllers for practical high degree-of-freedom (DOF) robotic tasks.
In this contribution, we accomplish such variable impedance control with the reinforcement learning (RL) algorithm PI2 (Policy
Improvement with Path Integrals). PI2 is a model-free, sampling-based learning method derived from first
principles of stochastic optimal control. The PI2 algorithm requires no tuning of algorithmic parameters besides the exploration
noise. The designer can thus fully focus on cost function design to specify the task. From the viewpoint of robotics, a particularly
useful property of PI2 is that it can scale to problems of many DOFs, so that reinforcement learning on real robotic systems
becomes feasible. We sketch the PI2 algorithm and its theoretical properties, and how it is applied to gain scheduling for
variable impedance control. We evaluate our approach by presenting results on several simulated and real robots. We consider
tasks involving accurate tracking through via-points, and manipulation tasks requiring physical contact with the environment.
In these tasks, the optimal strategy requires both tuning of a reference trajectory and the impedance of the end-effector.
The results show that we can use path integral based reinforcement learning not only for planning but also to derive variable
gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide
variety of robotic systems and practical applications.



@Article{buchli11learning,
title = {Learning Variable Impedance Control},
author = {Jonas Buchli and Freek Stulp and Evangelos Theodorou and Stefan Schaal},
journal = {International Journal of Robotics Research},
year = {2011},
number = {7},
pages = {820--833},
volume = {30},
abstract = {One of the hallmarks of the performance, versatility, and robustness of biological motor control is the ability to adapt the impedance of the overall biomechanical system to different task requirements and stochastic disturbances. A transfer of this principle to robotics is desirable, for instance to enable robots to work robustly and safely in everyday human environments. It is, however, not trivial to derive variable impedance controllers for practical high degree-of-freedom (DOF) robotic tasks. In this contribution, we accomplish such variable impedance control with the reinforcement learning (RL) algorithm PI2 ({\bf P}olicy {\bf I}mprovement with {\bf P}ath {\bf I}ntegrals). PI2 is a model-free, sampling-based learning method derived from first principles of stochastic optimal control. The PI2 algorithm requires no tuning of algorithmic parameters besides the exploration noise. The designer can thus fully focus on cost function design to specify the task. From the viewpoint of robotics, a particularly useful property of PI2 is that it can scale to problems of many DOFs, so that reinforcement learning on real robotic systems becomes feasible. We sketch the PI2 algorithm and its theoretical properties, and how it is applied to gain scheduling for variable impedance control. We evaluate our approach by presenting results on several simulated and real robots. We consider tasks involving accurate tracking through via-points, and manipulation tasks requiring physical contact with the environment. In these tasks, the optimal strategy requires both tuning of a reference trajectory \emph{and} the impedance of the end-effector. The results show that we can use path integral based reinforcement learning not only for planning but also to derive variable gain feedback controllers in realistic scenarios. Thus, the power of variable impedance control is made available to a wide variety of robotic systems and practical applications.},
bib2html_pubtype = {Journal},
bib2html_rescat = {Reinforcement Learning of Variable Impedance Control},
url = {http://ijr.sagepub.com/content/early/2011/03/31/0278364911402527}
}
