UT Austin Villa Publications


Adversarial Imitation Learning from Video using a State Observer

Haresh Karnan, Garrett Warnell, Faraz Torabi, and Peter Stone. Adversarial Imitation Learning from Video using a State Observer. In International Conference on Robotics and Automation, May 2022.
Video: https://www.youtube.com/watch?v=q21OCKJPXNo&ab_channel=Hareshkarnan


Abstract

The imitation learning research community has recently made significant progress towards the goal of enabling artificial agents to imitate behaviors from video demonstrations alone. However, current state-of-the-art approaches developed for this problem exhibit high sample complexity due, in part, to the high-dimensional nature of video observations. Towards addressing this issue, we introduce here a new algorithm called Visual Generative Adversarial Imitation from Observation using a State Observer (VGAIfO-SO). At its core, VGAIfO-SO seeks to address sample inefficiency using a novel, self-supervised state observer, which provides estimates of lower-dimensional proprioceptive state representations from high-dimensional images. We show experimentally in several continuous control environments that VGAIfO-SO is more sample efficient than other IfO algorithms at learning from video-only demonstrations and can sometimes even achieve performance close to the Generative Adversarial Imitation from Observation (GAIfO) algorithm that has privileged access to the demonstrator's proprioceptive state information.

BibTeX

@InProceedings{ICRA22-karnan,
  author = {Haresh Karnan and Garrett Warnell and Faraz Torabi and Peter Stone},
  title = {Adversarial Imitation Learning from Video using a State Observer},
  booktitle = {International Conference on Robotics and Automation},
  location = {Online},
  month = {May},
  year = {2022},
  abstract = {The imitation learning research community has recently made significant progress towards the goal of enabling artificial agents to imitate behaviors from video demonstrations alone. However, current state-of-the-art approaches developed for this problem exhibit high sample complexity due, in part, to the high-dimensional nature of video observations. Towards addressing this issue, we introduce here a new algorithm called Visual Generative Adversarial Imitation from Observation using a State Observer (VGAIfO-SO). At its core, VGAIfO-SO seeks to address sample inefficiency using a novel, self-supervised state observer, which provides estimates of lower-dimensional proprioceptive state representations from high-dimensional images. We show experimentally in several continuous control environments that VGAIfO-SO is more sample efficient than other IfO algorithms at learning from video-only demonstrations and can sometimes even achieve performance close to the Generative Adversarial Imitation from Observation (GAIfO) algorithm that has privileged access to the demonstrator's proprioceptive state information.},
  wwwnote={<a href="https://www.youtube.com/watch?v=q21OCKJPXNo&ab_channel=Hareshkarnan">Video</a>}
}
