Learning to Look: Seeking Information for Decision Making via Policy Factorization.
Shivin Dass, Jiaheng Hu, Ben Abbatematteo, Peter Stone, and Roberto Martín-Martín.
In Conference on Robot Learning (CoRL), November 2024.
Many robot manipulation tasks require active or interactive exploration behavior in order to be performed successfully. Such tasks are ubiquitous in embodied domains, where agents must actively search for the information necessary for each stage of a task, e.g., moving the head of the robot to find information relevant to manipulation, or in multi-robot domains, where one scout robot may search for the information that another robot needs to make informed decisions. We identify these tasks with a new type of problem, factorized Contextual Markov Decision Processes, and propose DISaM, a dual-policy solution composed of an information-seeking policy that explores the environment to find the relevant contextual information and an information-receiving policy that exploits the context to achieve the manipulation goal. This factorization allows us to train both policies separately, using the information-receiving one to provide reward to train the information-seeking policy. At test time, the dual agent balances exploration and exploitation based on the uncertainty the manipulation policy has on what the next best action is. We demonstrate the capabilities of our dual-policy solution in five manipulation tasks that require information-seeking behaviors, both in simulation and in the real world, where DISaM significantly outperforms existing methods. More information at https://sites.google.com/view/disam24/.
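As a rough illustration of the test-time switching the abstract describes, the Python sketch below alternates between the two policies based on how uncertain the manipulation (information-receiving) policy is about its next action. All names here are hypothetical, and the ensemble-disagreement uncertainty proxy is an assumption for illustration, not the paper's actual implementation.

import numpy as np

# Hypothetical sketch of DISaM-style test-time control. Policies are
# duck-typed objects with an act(obs) method returning an action array.
# UNCERTAINTY_THRESHOLD is an assumed tuning parameter.

UNCERTAINTY_THRESHOLD = 0.5

def ensemble_uncertainty(receive_ensemble, obs):
    """Mean per-dimension std. of the ensemble's proposed actions,
    used as a stand-in for the manipulation policy's uncertainty."""
    actions = np.stack([p.act(obs) for p in receive_ensemble])
    return actions.std(axis=0).mean()

def disam_step(seek_policy, receive_ensemble, obs):
    """Explore when the manipulation ensemble disagrees (context is
    missing); otherwise exploit the ensemble's mean action."""
    if ensemble_uncertainty(receive_ensemble, obs) > UNCERTAINTY_THRESHOLD:
        return seek_policy.act(obs)  # gather more contextual information
    return np.mean([p.act(obs) for p in receive_ensemble], axis=0)

In the paper's formulation the uncertainty signal comes from the manipulation policy itself; an ensemble is simply one common way to obtain such a signal.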
@InProceedings{dass_corl2024,
  author    = {Shivin Dass and Jiaheng Hu and Ben Abbatematteo and Peter Stone and Roberto Martín-Martín},
  title     = {Learning to Look: Seeking Information for Decision Making via Policy Factorization},
  booktitle = {Conference on Robot Learning (CoRL)},
  year      = {2024},
  month     = {November},
  location  = {Munich},
  abstract  = {Many robot manipulation tasks require active or interactive exploration behavior in order to be performed successfully. Such tasks are ubiquitous in embodied domains, where agents must actively search for the information necessary for each stage of a task, e.g., moving the head of the robot to find information relevant to manipulation, or in multi-robot domains, where one scout robot may search for the information that another robot needs to make informed decisions. We identify these tasks with a new type of problem, factorized Contextual Markov Decision Processes, and propose DISaM, a dual-policy solution composed of an information-seeking policy that explores the environment to find the relevant contextual information and an information-receiving policy that exploits the context to achieve the manipulation goal. This factorization allows us to train both policies separately, using the information-receiving one to provide reward to train the information-seeking policy. At test time, the dual agent balances exploration and exploitation based on the uncertainty the manipulation policy has on what the next best action is. We demonstrate the capabilities of our dual policy solution in five manipulation tasks that require information-seeking behaviors, both in simulation and in the real-world, where DISaM significantly outperforms existing methods. More information at https://sites.google.com/view/disam24/.},
}