Peter Stone's Selected Publications


Relaxed Exploration Constrained Reinforcement Learning

Relaxed Exploration Constrained Reinforcement Learning.
Shahaf S. Shperberg, Bo Liu, and Peter Stone.
In Conference on Autonomous Agents and Multiagent Systems, May 2024.




This research introduces a novel setting for reinforcement learning with constraints, termed Relaxed Exploration Constrained Reinforcement Learning (RECRL). Similar to standard constrained reinforcement learning (CRL), the objective in RECRL is to discover a policy that maximizes the environmental return while adhering to a predefined set of constraints. However, in some real-world settings, it is possible to train the agent in a setting that does not require strict adherence to the constraints, as long as the agent adheres to them once deployed. To model such settings, we introduce RECRL, which explicitly incorporates an initial training phase where the constraints are relaxed, enabling the agent to explore the environment more freely. Subsequently, during deployment, the agent is obligated to fully satisfy all constraints. To address RECRL problems, we introduce a curriculum-based approach called CLiC, designed to enhance the exploration of existing CRL algorithms during the training phase and facilitate convergence towards a policy that satisfies the full set of constraints by the end of training. Empirical evaluations demonstrate that CLiC yields policies with significantly higher returns during deployment compared to training solely under the strict set of constraints.
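The idea of relaxing constraints during training and enforcing them fully by deployment can be illustrated with a small sketch. The code below is not the CLiC algorithm from the paper, whose details are not given in the abstract. It is a hypothetical linear schedule that tightens a single cost threshold from a relaxed value to the strict value over training. The names `relaxed_threshold` and `satisfies_constraints`, and the linear form of the schedule, are assumptions made only for illustration.

```python
def relaxed_threshold(step, total_steps, relaxed, strict):
    """Hypothetical curriculum: linearly tighten a cost threshold.

    Returns `relaxed` at step 0 and `strict` once step >= total_steps.
    This is an illustrative assumption, not the schedule used by CLiC.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Fraction of training completed, clamped to [0, 1].
    frac = min(max(step / total_steps, 0.0), 1.0)
    return relaxed + frac * (strict - relaxed)


def satisfies_constraints(costs, thresholds):
    """Check that every expected cost is within its threshold."""
    return all(c <= d for c, d in zip(costs, thresholds))


if __name__ == "__main__":
    # Early in training the relaxed threshold admits a costly policy;
    # at the end only the strict threshold applies.
    policy_cost = 5.0
    for step in (0, 50, 100):
        d = relaxed_threshold(step, 100, relaxed=10.0, strict=2.0)
        print(step, d, satisfies_constraints([policy_cost], [d]))
```

In such a sketch, any existing CRL algorithm would be trained against the current threshold at each step, so that the final policy is evaluated only against the strict constraint.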

BibTeX Entry

  author   = {Shahaf S. Shperberg and Bo Liu and Peter Stone},
  title    = {Relaxed Exploration Constrained Reinforcement Learning},
  booktitle = {Conference on Autonomous Agents and Multiagent Systems},
  year     = {2024},
  month    = {May},
  location = {Auckland, New Zealand},
  abstract = {This research introduces a novel setting for reinforcement learning with
constraints, termed Relaxed Exploration Constrained Reinforcement Learning
(RECRL). Similar to standard constrained reinforcement learning (CRL), the
objective in RECRL is to discover a policy that maximizes the environmental
return while adhering to a predefined set of constraints. However, in some
real-world settings, it is possible to train the agent in a setting that does not
require strict adherence to the constraints, as long as the agent adheres to them
once deployed. To model such settings, we introduce RECRL, which explicitly
incorporates an initial training phase where the constraints are relaxed,
enabling the agent to explore the environment more freely. Subsequently, during
deployment, the agent is obligated to fully satisfy all constraints. To address
RECRL problems, we introduce a curriculum-based approach called CLiC, designed to
enhance the exploration of existing CRL algorithms during the training phase and
facilitate convergence towards a policy that satisfies the full set of
constraints by the end of training. Empirical evaluations demonstrate that CLiC
yields policies with significantly higher returns during deployment compared to
training solely under the strict set of constraints.}
