Peter Stone's Selected Publications

DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation

Faraz Torabi, Garrett Warnell, and Peter Stone.
In Proceedings of The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2021.
Video presentation

Abstract

In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator. Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms. This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk. In this work, we hypothesize that we can incorporate ideas from model-based reinforcement learning with adversarial methods for IfO in order to increase the data efficiency of these methods without sacrificing performance. Specifically, we consider time-varying linear Gaussian policies, and propose a method that integrates the linear-quadratic regulator with path integral policy improvement into an existing adversarial IfO framework. The result is a more data-efficient IfO algorithm with better performance, which we show empirically in four simulation domains: using far fewer interactions with the environment, the proposed method exhibits similar or better performance than the existing technique.
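
As a rough illustration of the policy class named in the abstract, a time-varying linear Gaussian policy draws the action at time step t from a Gaussian whose mean is linear in the state, with separate parameters for each step. The sketch below is not the authors' implementation: the names `LinearGaussianStep`, `mean_action`, and `sample_action` are hypothetical, and the diagonal covariance is a simplifying assumption made here for brevity. It does not cover the linear-quadratic regulator, path integral policy improvement, or the adversarial component.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class LinearGaussianStep:
    """Parameters for one time step (hypothetical names, for illustration)."""
    K: List[List[float]]  # feedback gain, shape (action_dim, state_dim)
    k: List[float]        # bias term, length action_dim
    std: List[float]      # per-dimension std dev (diagonal covariance: an assumption)


def mean_action(step: LinearGaussianStep, state: List[float]) -> List[float]:
    """Mean of the action distribution: K s + k."""
    return [
        sum(gain * s for gain, s in zip(row, state)) + bias
        for row, bias in zip(step.K, step.k)
    ]


def sample_action(step: LinearGaussianStep, state: List[float],
                  rng: random.Random) -> List[float]:
    """Draw one action from the Gaussian centred on the linear mean."""
    return [rng.gauss(m, s) for m, s in zip(mean_action(step, state), step.std)]


# A two-step policy over a 2-D state and 1-D action; each step has its own gains.
policy = [
    LinearGaussianStep(K=[[1.0, 0.0]], k=[0.5], std=[0.0]),
    LinearGaussianStep(K=[[0.0, -2.0]], k=[0.0], std=[0.1]),
]

rng = random.Random(0)
state = [2.0, 3.0]
actions = [sample_action(step, state, rng) for step in policy]
```

Because the gains differ per time step, the same state can map to different action distributions at different points in a trajectory, which is what "time-varying" refers to here.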

BibTeX Entry

@InProceedings{IROS2021-torabi,
  author = {Faraz Torabi and Garrett Warnell and Peter Stone},
  title = {DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation},
  booktitle = {Proceedings of The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  location = {Prague, Czech Republic},
  month = {September},
  year = {2021},
  abstract = {
In imitation learning from observation (IfO), a learning agent seeks to imitate
a demonstrating agent using only observations of the demonstrated behavior 
without access to the control signals generated by the demonstrator. Recent 
methods based on adversarial imitation learning have led to state-of-the-art 
performance on IfO problems, but they typically suffer from high sample 
complexity due to a reliance on data-inefficient, model-free reinforcement 
learning algorithms. This issue makes them impractical to deploy in real-world 
settings, where gathering samples can incur high costs in terms of time, 
energy, and risk. In this work, we hypothesize that we can incorporate ideas 
from model-based reinforcement learning with adversarial methods for IfO in 
order to increase the data efficiency of these methods without sacrificing 
performance. Specifically, we consider time-varying linear Gaussian policies, 
and propose a method that integrates the linear-quadratic regulator with path 
integral policy improvement into an existing adversarial IfO framework. The 
result is a more data-efficient IfO algorithm with better performance, which 
we show empirically in four simulation domains: using far fewer interactions 
with the environment, the proposed method exhibits similar or better 
performance than the existing technique.
  },
  wwwnote = {<a href="https://www.youtube.com/watch?v=o3t0mo_o7W8">Video presentation</a>}
}
