Description of the project
The ability to understand natural-language instructions is critical to building intelligent agents that interact with humans. In this project we build a system that learns to transform natural-language navigation instructions into executable formal plans. Given no prior linguistic knowledge, the system learns solely by observing how humans follow navigation instructions. The system is trained and evaluated on the instructor and follower data collected by MacMahon et al. (2006). There are three virtual indoor environments in total. Each environment consists of interconnecting hallways with objects placed at various intersections. Several different floor patterns and wall paintings were used in conjunction with the objects for giving directions.

This project is part of our larger effort to develop learning techniques for grounded language acquisition. Compared to our earlier project on Learning to Sportscast, this project poses a more complex ambiguous supervision problem. Instead of considering only a handful of possible events referred to by a sportscasting comment, we have to consider an exponential number of navigation plans for each instruction. The interactive nature of the navigation task also allows for more interesting learning scenarios in which a human participant is involved.
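To give a rough sense of why the number of candidate plans grows exponentially, here is a minimal sketch. It assumes, purely for illustration, that the candidate meanings of one instruction are the non-empty subsets of the components observed while a follower carried it out. The component strings and the `candidate_plans` helper are hypothetical and are not taken from the released code or data.

```python
from itertools import combinations


def candidate_plans(components):
    """Yield every non-empty subset of the observed plan components.

    Assumption for illustration only: each subset is one candidate plan
    that the instruction might be referring to.
    """
    for size in range(1, len(components) + 1):
        for subset in combinations(components, size):
            yield subset


# Hypothetical components observed for a single instruction.
observed = [
    "turn(left)",
    "travel(steps=2)",
    "verify(at=sofa)",
    "verify(front=hallway)",
]

plans = list(candidate_plans(observed))
print(len(plans))  # 2**4 - 1 = 15 candidates for only four components
```

Under this assumption, each additional observed component roughly doubles the number of candidates, which is why enumerating them all quickly becomes impractical.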
Demo
Below is an example of a successful parse by our system trained on refined landmarks plans. In addition to the simulation, the parse for each instruction is also shown. Notice that even though the system does not parse everything correctly, it captures enough of the meaning to form a sufficient plan.
Publication and Talks
Data and Code
Overview
The MARCO code and data used in all our experiments were originally produced by Matt MacMahon, as described in his AAAI 2006 paper. Three environments are used (named grid, l, and jelly), with instructions collected from 6 different subjects. The included map files contain information about the layout of the environments and the locations of the objects. The included MARCO code is a modified version of Matt's original code that makes the MARCO parser and executor easier to use.

Citations
Please use the following citation when referencing the original MARCO code and data:

@InProceedings{macmahon:aaai06,
  title = "Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions",
  author = "Matt MacMahon and Brian Stankiewicz and Benjamin Kuipers",
  booktitle = "Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-2006)",
  address = "Boston, MA, USA",
  month = "July",
  year = 2006
}

Please use the following citation when referencing our modified version of the MARCO code and data:

@InProceedings{chen:aaai11,
  title = "Learning to Interpret Natural Language Navigation Instructions from Observations",
  author = "David L. Chen and Raymond J. Mooney",
  booktitle = "Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-2011)",
  address = "San Francisco, CA, USA",
  month = "August",
  year = 2011
}

The Mandarin Chinese translation of the data was first mentioned in the following paper:

@InProceedings{chen:acl12,
  title = "Fast Online Lexicon Learning for Grounded Language Acquisition",
  author = "David L. Chen",
  booktitle = "Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL-2012)",
  address = "Jeju, Republic of Korea",
  month = "July",
  year = 2012
}

Downloads
Compressed tarball of data and code: LearningNavigationInstructions.tgz
You can also browse the data and code here
Contact Information
If you have any questions or comments, please contact David Chen.
If you are interested in reading more literature in this area, check out our reading group, CLAMP.