Freek Stulp and Pierre-Yves Oudeyer. Emergent Proximo-Distal Maturation through Adaptive Exploration. In International
Conference on Development and Learning (ICDL), 2012. Paper of Excellence Award
Life-long robot learning in the high-dimensional real world requires guided and structured exploration mechanisms. In this
developmental context, we investigate here the use of the recently proposed PI2-CMAES episodic reinforcement learning algorithm,
which is able to learn high-dimensional motor tasks through adaptive control of exploration. By studying PI2-CMAES in a reaching
task on a simulated arm, we observe two developmental properties. First, we show how PI2-CMAES autonomously and continuously
tunes the global exploration/exploitation trade-off, allowing it to re-adapt to changing tasks. Second, we show how PI2-CMAES
spontaneously self-organizes a maturational structure whilst exploring the degrees-of-freedom (DOFs) of the motor space. In
particular, it automatically demonstrates the so-called proximo-distal maturation observed in humans: after first freezing
distal DOFs while exploring predominantly the most proximal DOF, it progressively frees exploration in DOFs along the proximo-distal
body axis. These emergent properties suggest the use of PI2-CMAES as a general tool for studying reinforcement learning of
skills in life-long developmental learning contexts.
@InProceedings{stulp12emergent,
title = {Emergent Proximo-Distal Maturation through Adaptive Exploration},
author = {Freek Stulp and Pierre-Yves Oudeyer},
booktitle = {International Conference on Development and Learning (ICDL)},
year = {2012},
note = {\textbf{Paper of Excellence Award}},
abstract = {Life-long robot learning in the high-dimensional real world requires guided and structured exploration mechanisms. In this developmental context, we investigate here the use of the recently proposed PI2-CMAES episodic reinforcement learning algorithm, which is able to learn high-dimensional motor tasks through adaptive control of exploration. By studying PI2-CMAES in a reaching task on a simulated arm, we observe two developmental properties. First, we show how PI2-CMAES autonomously and continuously tunes the global exploration/exploitation trade-off, allowing it to re-adapt to changing tasks. Second, we show how PI2-CMAES spontaneously self-organizes a maturational structure whilst exploring the degrees-of-freedom (DOFs) of the motor space. In particular, it automatically demonstrates the so-called \emph{proximo-distal maturation} observed in humans: after first freezing distal DOFs while exploring predominantly the most proximal DOF, it progressively frees exploration in DOFs along the proximo-distal body axis. These emergent properties suggest the use of PI2-CMAES as a general tool for studying reinforcement learning of skills in life-long developmental learning contexts.},
bib2html_pubtype = {Refereed Conference Paper,Awards},
bib2html_rescat = {Reinforcement Learning of Robot Skills}
}
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein
are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the
terms and constraints.