Team-Partitioned, Opaque-Transition Reinforcement Learning.
Peter Stone and Manuela Veloso.
In Minoru Asada and Hiroaki Kitano, editors, RoboCup-98: Robot Soccer World Cup II, Lecture Notes in Artificial Intelligence, pp. 261–72, Springer Verlag, Berlin, 1999. Also in Proceedings of the Third International Conference on Autonomous Agents, 1999.
We present a novel multi-agent learning paradigm called team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL introduces the use of action-dependent features to generalize the state space. In our work, we use a learned action-dependent feature space to aid higher-level reinforcement learning. TPOT-RL is an effective technique to allow a team of agents to learn to cooperate towards the achievement of a specific goal. It is an adaptation of traditional RL methods that is applicable in complex, non-Markovian, multi-agent domains with large state spaces and limited training opportunities. TPOT-RL is fully implemented and has been tested in the robotic soccer domain, a complex, multi-agent framework. This paper presents the algorithmic details of TPOT-RL as well as empirical results demonstrating the effectiveness of the developed multi-agent learning approach with learned features.
@InCollection(LNAI98-tpot-rl,
  Author="Peter Stone and Manuela Veloso",
  Title="Team-Partitioned, Opaque-Transition Reinforcement Learning",
  booktitle="{R}obo{C}up-98: Robot Soccer World Cup {II}",
  Editor="Minoru Asada and Hiroaki Kitano",
  series="Lecture Notes in Artificial Intelligence",
  volume="1604",
  pages="261--72",
  Publisher="Springer Verlag",
  address="Berlin",
  year="1999",
  note="Also in {\it Proceedings of the Third International Conference on Autonomous Agents}, 1999",
  abstract={We present a novel multi-agent learning paradigm called
    team-partitioned, opaque-transition reinforcement learning (TPOT-RL).
    TPOT-RL introduces the use of action-dependent features to generalize
    the state space. In our work, we use a {\it learned} action-dependent
    feature space to aid higher-level reinforcement learning. TPOT-RL is an
    effective technique to allow a team of agents to learn to cooperate
    towards the achievement of a specific goal. It is an adaptation of
    traditional RL methods that is applicable in complex, non-Markovian,
    multi-agent domains with large state spaces and limited training
    opportunities. TPOT-RL is fully implemented and has been tested in the
    robotic soccer domain, a complex, multi-agent framework. This paper
    presents the algorithmic details of TPOT-RL as well as empirical
    results demonstrating the effectiveness of the developed multi-agent
    learning approach with learned features.},
)