Peter Stone's Selected Publications

A Social Reinforcement Learning Agent

Charles Lee Isbell, Jr., Christian R. Shelton, Michael Kearns, Satinder Singh, and Peter Stone. A Social Reinforcement Learning Agent. In Proceedings of the Fifth International Conference on Autonomous Agents, pp. 377–384, ACM Press, Montreal, Canada, 2001.
BEST PAPER AWARD at Agents-2001

Abstract

We report on our reinforcement learning work on Cobot, a software agent that resides in the well-known online chat community LambdaMOO. Our initial work on Cobot provided him with the ability to collect social statistics and report them to users in a reactive manner. Here we describe our application of reinforcement learning to allow Cobot to proactively take actions in this complex social environment, and adapt his behavior from multiple sources of human reward. After 5 months of training, Cobot has received 3171 reward and punishment events from 254 different LambdaMOO users, and has learned nontrivial preferences for a number of users. Cobot modifies his behavior based on his current state in an attempt to maximize reward. Here we describe LambdaMOO and the state and action spaces of Cobot, and report the statistical results of the learning experiment.

BibTeX Entry

@InProceedings(AA01-cobot,
    Author="Isbell, Jr., Charles Lee and Christian R. Shelton and Michael Kearns and Satinder Singh and Peter Stone",
    Title="A Social Reinforcement Learning Agent",
    BookTitle="Proceedings of the Fifth International Conference on Autonomous Agents",
    Year="2001",
    publisher = "ACM Press",
    address = "Montreal, Canada",
    editor = {J{\"o}rg P. M{\"u}ller and Elisabeth Andre and Sandip
              Sen and Claude Frasson},
    pages = "377--384",
    abstract={
              We report on our reinforcement learning work on Cobot, a
              software agent that resides in the well-known online
              chat community LambdaMOO.  Our initial work on Cobot
              provided him with the ability to collect social
              statistics and report them to users in a reactive
              manner. Here we describe our application of
              reinforcement learning to allow Cobot to proactively
              take actions in this complex social environment, and
              adapt his behavior from multiple sources of human
              reward.  After 5 months of training, Cobot has received
              3171 reward and punishment events from 254 different
              LambdaMOO users, and has learned nontrivial preferences
              for a number of users.  Cobot modifies his behavior
              based on his current state in an attempt to maximize
              reward. Here we describe LambdaMOO and the state and
              action spaces of Cobot, and report the statistical
              results of the learning experiment.
    },
    wwwnote={<b>BEST PAPER AWARD at</b> <a href="http://www.csc.liv.ac.uk/~agents2001/">Agents-2001</a>},
)

Generated by bib2html.pl (written by Patrick Riley) on Wed Apr 19, 2006 17:23:13