Department of Computer Science

Machine Learning Research Group

University of Texas at Austin Artificial Intelligence Lab

Publications: Connecting Language and Perception

To truly understand language, an intelligent system must be able to connect words, phrases, and sentences to its perception of objects and events in the world. Ideally, an AI system would be able to learn language like a human child, by being exposed to utterances in a rich perceptual environment. The perceptual context would provide the necessary supervisory information, and learning the connection between language and perception would ground the system's semantic representations in its perception of the world. As a step in this direction, our research is developing systems that learn semantic parsers and language generators from sentences paired only with their perceptual context. This work is part of our research on natural language learning. Our research on this topic is supported by the National Science Foundation through grants IIS-0712097 and IIS-1016312.
  • Grounded Language Learning [Video Lecture]
    Raymond J. Mooney, Invited Talk, AAAI, 2013.
  • Learning Language from its Perceptual Context [Video Lecture]
    Raymond J. Mooney, Invited Talk, ECML-PKDD, 2008.

Sub-areas:
  1. Temporally Streaming Audio-Visual Synchronization for Real-World Videos
    [Details] [PDF]
    Jordan Voas, Wei-Cheng Tseng, Layne Berry, Xixi Hu, Puyuan Peng, James Stuedemann, and David Harwath
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), February 2025.
  2. Measuring Sound Symbolism in Audio-visual Models
    [Details] [PDF] [Poster]
    Wei-Cheng Tseng, Yi-Jen Shih, David Harwath, Raymond Mooney
    In IEEE Spoken Language Technology (SLT) Workshop, December 2024.
  3. Multimodal Contextualized Semantic Parsing from Speech
    [Details] [PDF] [Slides (PDF)] [Poster] [Video]
    Jordan Voas, Raymond Mooney, David Harwath
    In Association for Computational Linguistics (ACL), August 2024.
  4. What is the Best Automated Metric for Text to Motion Generation?
    [Details] [PDF]
    Jordan Voas
    Master's Thesis, Department of Computer Science, UT Austin, Austin, TX, May 2023.
  5. What is the Best Automated Metric for Text to Motion Generation?
    [Details] [PDF] [Slides (PPT)] [Video]
    Jordan Voas, Yili Wang, Qixing Huang, Raymond Mooney
    In ACM SIGGRAPH Asia, December 2023.
  6. Directly Optimizing Evaluation Metrics to Improve Text to Motion
    [Details] [PDF]
    Yili Wang
    Master's Thesis, Department of Computer Science, UT Austin, May 2023.
  7. Systematic Generalization on gSCAN with Language Conditioned Embedding
    [Details] [PDF] [Video]
    Tong Gao, Qi Huang and Raymond J. Mooney
    In The 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, December 2020.
  8. Dialog as a Vehicle for Lifelong Learning
    [Details] [PDF] [Slides (PDF)] [Video]
    Aishwarya Padmakumar, Raymond J. Mooney
    In Position Paper Track at the SIGDIAL Special Session on Physically Situated Dialogue (RoboDial 2.0), July 2020.
  9. Generating Animated Videos of Human Activities from Natural Language Descriptions
    [Details] [PDF] [Poster]
    Angela S. Lin, Lemeng Wu, Rodolfo Corona, Kevin Tai, Qixing Huang, Raymond J. Mooney
    In Proceedings of the Visually Grounded Interaction and Language Workshop at NeurIPS 2018, December 2018.
  10. Learning a Policy for Opportunistic Active Learning
    [Details] [PDF]
    Aishwarya Padmakumar, Peter Stone, Raymond J. Mooney
    In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP-18), Brussels, Belgium, November 2018.
  11. Learning to Connect Language and Perception
    [Details] [PDF]
    Raymond J. Mooney
    In Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI), 1598-1601, Chicago, IL, July 2008. Senior Member Paper.
  12. Learning Language Semantics from Ambiguous Supervision
    [Details] [PDF]
    Rohit J. Kate and Raymond J. Mooney
    In Proceedings of the 22nd Conference on Artificial Intelligence (AAAI-07), 895-900, Vancouver, Canada, July 2007.
  13. Learning Language from Perceptual Context: A Challenge Problem for AI
    [Details] [PDF]
    Raymond J. Mooney
    In Proceedings of the 2006 AAAI Fellows Symposium, Boston, MA, July 2006.