On-Policy vs. Off-Policy Updates for Deep Reinforcement Learning.
Matthew Hausknecht and Peter Stone.
In Deep Reinforcement Learning: Frontiers and Challenges, IJCAI Workshop, July 2016.
Temporal-difference-based deep-reinforcement learning methods have typically been driven by off-policy, bootstrap Q-Learning updates. In this paper, we investigate the effects of using on-policy, Monte Carlo updates. Our empirical results show that for the DDPG algorithm in a continuous action space, mixing on-policy and off-policy update targets exhibits superior performance and stability compared to using exclusively one or the other. The same technique applied to DQN in a discrete action space drastically slows down learning. Our findings raise questions about the nature of on-policy and off-policy bootstrap and Monte Carlo updates and their relationship to deep reinforcement learning methods.
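As a rough illustration of what mixing update targets could look like, the sketch below blends a Monte Carlo return with a one-step bootstrap target. The abstract does not say how the two targets are combined, so the linear interpolation, the weight `beta`, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float) -> List[float]:
    """Discounted return from each step to the end of the episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Accumulate backwards so each step reuses the return of the next one.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def bootstrap_targets(
    rewards: List[float], next_q_values: List[float], gamma: float
) -> List[float]:
    """One-step targets r_t + gamma * Q(s_{t+1}, .).

    next_q_values[t] is the critic's estimate for the next state and
    should be 0.0 where the next state is terminal.
    """
    return [r + gamma * q for r, q in zip(rewards, next_q_values)]


def mixed_targets(
    rewards: List[float],
    next_q_values: List[float],
    gamma: float,
    beta: float,
) -> List[float]:
    """Hypothetical blend: beta * Monte Carlo + (1 - beta) * bootstrap."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    mc = monte_carlo_returns(rewards, gamma)
    td = bootstrap_targets(rewards, next_q_values, gamma)
    return [beta * m + (1.0 - beta) * b for m, b in zip(mc, td)]
```

Under these assumptions, `beta = 1.0` recovers the pure Monte Carlo target, `beta = 0.0` recovers the pure bootstrap target, and intermediate values give the mixed regime that the abstract reports as helpful for DDPG.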
@InProceedings{DeepRL16-hausknecht,
  author = {Matthew Hausknecht and Peter Stone},
  title = {On-Policy vs. Off-Policy Updates for Deep Reinforcement Learning},
  booktitle = {Deep Reinforcement Learning: Frontiers and Challenges, IJCAI Workshop},
  location = {New York},
  month = {July},
  year = {2016},
  abstract = {Temporal-difference-based deep-reinforcement learning methods
    have typically been driven by off-policy, bootstrap Q-Learning updates.
    In this paper, we investigate the effects of using on-policy, Monte Carlo
    updates. Our empirical results show that for the DDPG algorithm in a
    continuous action space, mixing on-policy and off-policy update targets
    exhibits superior performance and stability compared to using exclusively
    one or the other. The same technique applied to DQN in a discrete action
    space drastically slows down learning. Our findings raise questions about
    the nature of on-policy and off-policy bootstrap and Monte Carlo updates
    and their relationship to deep reinforcement learning methods.},
}