Peter Stone's Selected Publications



Leveraging Commonsense Reasoning and Multimodal Perception for Robot Spoken Dialog Systems

Leveraging Commonsense Reasoning and Multimodal Perception for Robot Spoken Dialog Systems.
Dongcai Lu, Shiqi Zhang, Peter Stone, and Xiaoping Chen.
In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017.

Abstract

Probabilistic graphical models, such as partially observable Markov decision processes (POMDPs), have been used in stochastic spoken dialog systems to handle the inherent uncertainty in speech recognition and language understanding. Such dialog systems suffer from the fact that only a relatively small number of domain variables are allowed in the model, so as to ensure the generation of good-quality dialog policies. At the same time, the non-language perception modalities on robots, such as vision-based facial expression recognition and Lidar-based distance detection, can hardly be integrated into this process. In this paper, we use a probabilistic commonsense reasoner to “guide” our POMDP-based dialog manager, and present a principled, multimodal dialog management (MDM) framework that allows the robot’s dialog belief state to be seamlessly updated by both observations of human spoken language, and exogenous events such as the change of human facial expressions. The MDM approach has been implemented and evaluated both in simulation and on a real mobile robot using guidance tasks.

BibTeX Entry

@InProceedings{IROS17-Lu,
  author = {Dongcai Lu and Shiqi Zhang and Peter Stone and Xiaoping Chen},
  title = {Leveraging Commonsense Reasoning and Multimodal Perception for Robot
    Spoken Dialog Systems},
  booktitle = {Proceedings of the IEEE/RSJ International Conference on
    Intelligent Robots and Systems (IROS)},
  location = {Vancouver, Canada},
  month = {September},
  year = {2017},
  abstract = {
    Probabilistic graphical models, such as partially observable Markov decision
    processes (POMDPs), have been used in stochastic spoken dialog systems to handle
    the inherent uncertainty in speech recognition and language understanding. Such
    dialog systems suffer from the fact that only a relatively small number of
    domain variables are allowed in the model, so as to ensure the generation of
    good-quality dialog policies. At the same time, the non-language perception
    modalities on robots, such as vision-based facial expression recognition and
    Lidar-based distance detection, can hardly be integrated into this process. In
    this paper, we use a probabilistic commonsense reasoner to “guide” our
    POMDP-based dialog manager, and present a principled, multimodal dialog
    management (MDM) framework that allows the robot’s dialog belief state to be
    seamlessly updated by both observations of human spoken language, and exogenous
    events such as the change of human facial expressions. The MDM approach has been
    implemented and evaluated both in simulation and on a real mobile robot using
    guidance tasks.
  },
}
