Peter Stone's Selected Publications



RTMBA: A Real-Time Model-Based Reinforcement Learning Architecture for Robot Control

RTMBA: A Real-Time Model-Based Reinforcement Learning Architecture for Robot Control.
Todd Hester, Michael Quinlan, and Peter Stone.
In IEEE International Conference on Robotics and Automation (ICRA), May 2012.

Download

[PDF] 359.7 kB  [PostScript] 1.9 MB

Abstract

Reinforcement Learning (RL) is a paradigm for learning decision-making tasks that could enable robots to learn and adapt to their situation on-line. For an RL algorithm to be practical for robotic control tasks, it must learn in very few samples, while continually taking actions in real-time. Existing model-based RL methods learn in relatively few samples, but typically take too much time between each action for practical on-line learning. In this paper, we present a novel parallel architecture for model-based RL that runs in real-time by 1) taking advantage of sample-based approximate planning methods and 2) parallelizing the acting, model learning, and planning processes in a novel way such that the acting process is sufficiently fast for typical robot control cycles. We demonstrate that algorithms using this architecture perform nearly as well as methods using the typical sequential architecture when both are given unlimited time, and greatly out-perform these methods on tasks that require real-time actions such as controlling an autonomous vehicle.
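
Below is a minimal, self-contained Python sketch (not the authors' implementation) of the parallel pattern the abstract describes: an acting thread keeps an approximately fixed control period while model learning and sample-based planning run concurrently, sharing state through a lock. The toy environment, the count-based model, and the random-rollout planner are illustrative assumptions only.

import random
import threading
import time
from collections import defaultdict

N_STATES = 10
ACTIONS = (0, 1)

class Shared:
    """State shared between the acting, model-learning, and planning threads."""
    def __init__(self):
        self.lock = threading.Lock()
        self.experience = []   # (s, a, r, s') samples produced by acting
        self.model = {}        # (s, a) -> (mean reward, most likely next state)
        self.policy = {}       # s -> best action found so far by the planner
        self.running = True

def step_env(s, a):
    """Toy stand-in environment: action 1 advances; reward on reaching state 0."""
    s2 = (s + a) % N_STATES
    return (1.0 if s2 == 0 else 0.0), s2

def act(shared, period=0.02):
    """Acting thread: one action per (approximately) 50 Hz control cycle.
    It never blocks on learning or planning; if the planner has not yet
    produced an action for this state, it falls back to a random one."""
    s = 0
    while shared.running:
        with shared.lock:
            a = shared.policy.get(s, random.choice(ACTIONS))
        r, s2 = step_env(s, a)
        with shared.lock:
            shared.experience.append((s, a, r, s2))
        s = s2
        time.sleep(period)

def learn_model(shared):
    """Model-learning thread: periodically refit a simple count-based model
    from whatever experience has accumulated so far."""
    while shared.running:
        with shared.lock:
            samples = list(shared.experience)
        counts = defaultdict(lambda: defaultdict(int))
        rewards = defaultdict(list)
        for s, a, r, s2 in samples:
            counts[(s, a)][s2] += 1
            rewards[(s, a)].append(r)
        model = {sa: (sum(rs) / len(rs), max(counts[sa], key=counts[sa].get))
                 for sa, rs in rewards.items()}
        with shared.lock:
            shared.model = model
        time.sleep(0.1)

def plan(shared, horizon=5):
    """Planning thread: cheap random rollouts on the learned model (a stand-in
    for the sample-based approximate planners the paper builds on)."""
    while shared.running:
        with shared.lock:
            model = dict(shared.model)
        for s in range(N_STATES):
            def rollout(first_a):
                total, cur, a = 0.0, s, first_a
                for _ in range(horizon):
                    if (cur, a) not in model:
                        break
                    r, cur = model[(cur, a)]
                    total += r
                    a = random.choice(ACTIONS)
                return total
            best = max(ACTIONS, key=rollout)
            with shared.lock:
                shared.policy[s] = best
        time.sleep(0.05)

if __name__ == "__main__":
    sh = Shared()
    for f in (act, learn_model, plan):
        threading.Thread(target=f, args=(sh,), daemon=True).start()
    time.sleep(3)
    sh.running = False
    print("policy after 3 seconds:", sh.policy)

The design point mirrored here is the one the abstract emphasizes: the acting thread only reads the most recently published policy, so action selection stays within the control cycle regardless of how long model fitting or planning takes.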

BibTeX Entry

@InProceedings{ICRA12-hester,
  author    = "Todd Hester and Michael Quinlan and Peter Stone",
  title     = "{RTMBA}: A Real-Time Model-Based Reinforcement Learning Architecture for Robot Control",
  booktitle = "{IEEE} International Conference on Robotics and Automation (ICRA)",
  location  = "St. Paul, MN, USA",
  month     = "May",
  year      = "2012",
  abstract  = "Reinforcement Learning (RL) is a paradigm for
learning decision-making tasks that could enable robots to learn
and adapt to their situation on-line. For an RL algorithm to
be practical for robotic control tasks, it must learn in very few
samples, while continually taking actions in real-time. Existing
model-based RL methods learn in relatively few samples, but
typically take too much time between each action for practical
on-line learning. In this paper, we present a novel parallel
architecture for model-based RL that runs in real-time by
1) taking advantage of sample-based approximate planning
methods and 2) parallelizing the acting, model learning, and
planning processes in a novel way such that the acting process is
sufficiently fast for typical robot control cycles. We demonstrate
that algorithms using this architecture perform nearly as well as
methods using the typical sequential architecture when both are
given unlimited time, and greatly out-perform these methods
on tasks that require real-time actions such as controlling an
autonomous vehicle.",
}
