Deep Imitation Learning for Parameterized Action Spaces.
Matthew Hausknecht, Yilun Chen, and Peter Stone.
In AAMAS Adaptive Learning Agents (ALA) Workshop, May 2016.
Recent results have demonstrated the ability of deep neural networks to serve as effective controllers (or function approximators of the value function) for complex sequential decision-making tasks, including those with raw visual inputs. However, to the best of our knowledge, such demonstrations have been limited to tasks either fully discrete or fully continuous actions. This paper introduces an imitation learning method to train a deep neural network to mimic a stochastic policy in a parameterized action space. The network uses a novel dual classification/regression loss mechanism to decide which discrete action to select as well as the continuous parameters to accompany that action. This method is fully implemented and tested in a subtask of simulated RoboCup soccer. To the best of our knowledge, the resulting networks represent the first demonstration of successful imitation learning in a task with parameterized continuous actions.
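The dual classification/regression loss mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a cross-entropy term over the discrete action choice and a mean-squared-error term on the parameters of the action the teacher selected, combined with a hypothetical weight. The names `dual_loss`, `select_action`, and `weight`, and the layout of `predicted_params`, are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dual_loss(logits, predicted_params, teacher_action, teacher_params, weight=1.0):
    """Hypothetical combined loss for a single demonstration step.

    logits           -- one score per discrete action
    predicted_params -- dict: action index -> list of predicted continuous parameters
    teacher_action   -- index of the discrete action the teacher chose
    teacher_params   -- continuous parameters the teacher used for that action
    weight           -- assumed trade-off between the two terms
    """
    # Classification term: cross-entropy against the teacher's discrete choice.
    probs = softmax(logits)
    classification = -math.log(probs[teacher_action])

    # Regression term: mean squared error, computed only on the parameters
    # belonging to the teacher's action (other actions have no target here).
    predicted = predicted_params[teacher_action]
    if teacher_params:
        squared = [(p - t) ** 2 for p, t in zip(predicted, teacher_params)]
        regression = sum(squared) / len(teacher_params)
    else:
        regression = 0.0

    return classification + weight * regression


def select_action(logits, predicted_params):
    """Pick the highest-scoring discrete action and its accompanying parameters."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return best, predicted_params[best]


if __name__ == "__main__":
    # Two made-up discrete actions: action 0 takes two parameters, action 1 takes one.
    logits = [0.2, 1.5]
    predicted_params = {0: [10.0, 45.0], 1: [30.0]}
    print(dual_loss(logits, predicted_params, teacher_action=1, teacher_params=[25.0]))
    print(select_action(logits, predicted_params))
```

In this sketch, only the parameters of the demonstrated action contribute to the regression term; whether the paper masks the other actions' parameters in the same way is not stated in the abstract.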
@InProceedings{ALA16-hausknecht2,
  author = {Matthew Hausknecht and Yilun Chen and Peter Stone},
  title = {Deep Imitation Learning for Parameterized Action Spaces},
  booktitle = {AAMAS Adaptive Learning Agents (ALA) Workshop},
  location = {Singapore},
  month = {May},
  year = {2016},
  abstract = {Recent results have demonstrated the ability of deep neural networks to serve as effective controllers (or function approximators of the value function) for complex sequential decision-making tasks, including those with raw visual inputs. However, to the best of our knowledge, such demonstrations have been limited to tasks either fully discrete or fully continuous actions. This paper introduces an imitation learning method to train a deep neural network to mimic a stochastic policy in a parameterized action space. The network uses a novel dual classification/regression loss mechanism to decide which discrete action to select as well as the continuous parameters to accompany that action. This method is fully implemented and tested in a subtask of simulated RoboCup soccer. To the best of our knowledge, the resulting networks represent the first demonstration of successful imitation learning in a task with parameterized continuous actions.},
}