Building Self-Play Curricula Online by Playing with Expert Agents in Adversarial Games.
Felipe Leno Da Silva, Anna Helena Reali Costa, and Peter Stone.
In Proceedings of the 8th Brazilian Conference on Intelligent Systems (BRACIS), October 2019.
Multiagent reinforcement learning algorithms are designed to enable an autonomous agent to adapt to an opponent's strategy based on experience. However, most such algorithms require a relatively large amount of experience to perform well. This requirement is problematic when opponent interactions are expensive, for example, when the agent has limited access to the opponent during training. In order to make good use of the opponent as a resource to support learning, we propose SElf-PLay by Expert Modeling (SEPLEM), an algorithm that models the opponent policy in a few episodes, and uses it to train in a simulated environment where it is cheaper to perform learning steps than in the real environment. Our empirical evaluation indicates that SEPLEM, by iteratively building a Curriculum of simulated tasks, achieves better performance than both only playing against the expert and using pure Self-Play techniques. SEPLEM is a promising technique to accelerate learning in multiagent adversarial tasks.
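The loop the abstract describes (observe the expert for a few episodes, fit a model of its policy, then train cheaply against that model in simulation, and repeat) can be illustrated with a toy sketch. This is not the authors' SEPLEM implementation. The rock-paper-scissors game, the rock-biased expert, the count-based opponent model, and the bandit-style value update are all assumptions chosen only to keep the example small and runnable.

```python
import random
from collections import Counter

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """Return +1 for a win, -1 for a loss, 0 for a draw."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expert_action(rng):
    """Hypothetical expert (an assumption): a fixed policy biased toward rock."""
    return rng.choices(ACTIONS, weights=[0.6, 0.2, 0.2])[0]


def fit_opponent_model(observed):
    """Estimate the opponent policy from observed actions (Laplace-smoothed counts)."""
    counts = Counter(observed)
    total = len(observed) + len(ACTIONS)
    return {a: (counts[a] + 1) / total for a in ACTIONS}


def train_in_simulation(q, model, rng, steps=2000, alpha=0.01, epsilon=0.1):
    """Run cheap learning steps against the modeled opponent, not the real one."""
    weights = [model[a] for a in ACTIONS]
    for _ in range(steps):
        # Epsilon-greedy choice over the agent's current value estimates.
        if rng.random() < epsilon:
            mine = rng.choice(ACTIONS)
        else:
            mine = max(ACTIONS, key=q.get)
        theirs = rng.choices(ACTIONS, weights=weights)[0]
        q[mine] += alpha * (payoff(mine, theirs) - q[mine])
    return q


def run(rounds=3, real_episodes=20, seed=0):
    """Alternate a few real episodes with many simulated training steps."""
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS}
    observed = []
    model = fit_opponent_model(observed)
    for _ in range(rounds):
        # Expensive phase: a small number of episodes against the real expert.
        for _ in range(real_episodes):
            observed.append(expert_action(rng))
        # Cheap phase: refit the opponent model and train against it.
        model = fit_opponent_model(observed)
        q = train_in_simulation(q, model, rng)
    return q, model


if __name__ == "__main__":
    q_values, opponent_model = run()
    print("opponent model:", opponent_model)
    print("greedy action:", max(q_values, key=q_values.get))
```

In this sketch the real expert is queried only `rounds * real_episodes` times, while most learning steps happen against the fitted model, which mirrors the cost trade-off the abstract motivates. The curriculum-building aspect of SEPLEM is not modeled here.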
@InProceedings{BRACIS2019-Leno,
  author = {Felipe Leno Da Silva and Anna Helena Reali Costa and Peter Stone},
  title = {Building Self-Play Curricula Online by Playing with Expert Agents in Adversarial Games},
  booktitle = {Proceedings of the 8th Brazilian Conference on Intelligent Systems (BRACIS)},
  location = {Salvador, Bahia, Brazil},
  month = {October},
  year = {2019},
  abstract = {Multiagent reinforcement learning algorithms are designed to enable an autonomous agent to adapt to an opponent's strategy based on experience. However, most such algorithms require a relatively large amount of experience to perform well. This requirement is problematic when opponent interactions are expensive, for example, when the agent has limited access to the opponent during training. In order to make good use of the opponent as a resource to support learning, we propose SElf-PLay by Expert Modeling (SEPLEM), an algorithm that models the opponent policy in a few episodes, and uses it to train in a simulated environment where it is cheaper to perform learning steps than in the real environment. Our empirical evaluation indicates that SEPLEM, by iteratively building a Curriculum of simulated tasks, achieves better performance than both only playing against the expert and using pure Self-Play techniques. SEPLEM is a promising technique to accelerate learning in multiagent adversarial tasks.}
}