Peter Stone's Selected Publications


VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors

VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors.
Yifeng Zhu, Abhishek Joshi, Peter Stone, and Yuke Zhu.
In Proceedings of the 6th Conference on Robot Learning (CoRL 2022), December 2022.
Project page: https://ut-austin-rpl.github.io/VIOLA
Code: https://github.com/UT-Austin-RPL/VIOLA

Abstract

We introduce VIOLA, an object-centric imitation learning approach to learning closed-loop visuomotor policies for robot manipulation. Our approach constructs object-centric representations based on general object proposals from a pre-trained vision model. VIOLA uses a transformer-based policy to reason over these representations and attend to the task-relevant visual factors for action prediction. Such object-based structural priors improve the robustness of deep imitation learning algorithms against object variations and environmental perturbations. We quantitatively evaluate VIOLA in simulation and on real robots. VIOLA outperforms state-of-the-art imitation learning methods by 45.8 percent in success rate. It has also been deployed successfully on a physical robot to solve challenging long-horizon tasks, such as dining table arrangement and coffee making.

BibTeX Entry

@inproceedings{corl2022-zhu,
  title={VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors},
  author={Yifeng Zhu and Abhishek Joshi and Peter Stone and Yuke Zhu},
  booktitle={Proceedings of the 6th Conference on Robot Learning (CoRL 2022)},
  location={Auckland, New Zealand},
  month={December},
  year={2022},
  doi={},
  abstract={We introduce VIOLA, an object-centric imitation learning approach to learning closed-loop visuomotor policies for robot manipulation. Our approach constructs object-centric representations based on general object proposals from a pre-trained vision model. VIOLA uses a transformer-based policy to reason over these representations and attend to the task-relevant visual factors for action prediction. Such object-based structural priors improve the robustness of deep imitation learning algorithms against object variations and environmental perturbations. We quantitatively evaluate VIOLA in simulation and on real robots. VIOLA outperforms state-of-the-art imitation learning methods by 45.8 percent in success rate. It has also been deployed successfully on a physical robot to solve challenging long-horizon tasks, such as dining table arrangement and coffee making.},
  wwwnote={<a href="https://ut-austin-rpl.github.io/VIOLA" target="_blank">Project page</a><br><a href="https://github.com/UT-Austin-RPL/VIOLA">Code</a>}
}

Generated by bib2html.pl (written by Patrick Riley) on Tue Nov 19, 2024 10:24:41