Department of Computer Science

Machine Learning Research Group

University of Texas at Austin Artificial Intelligence Lab

Publications: Speech

Spoken Language Technology; language-audio processing
  1. Temporally Streaming Audio-Visual Synchronization for Real-World Videos
    Jordan Voas, Wei-Cheng Tseng, Layne Berry, Xixi Hu, Puyuan Peng, James Stuedemann, and David Harwath
    In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), February 2025.
  2. Measuring Sound Symbolism in Audio-visual Models
    Wei-Cheng Tseng, Yi-Jen Shih, David Harwath, and Raymond Mooney
    In IEEE Spoken Language Technology (SLT) Workshop, December 2024.
  3. Multimodal Contextualized Semantic Parsing from Speech
    Jordan Voas, Raymond Mooney, and David Harwath
    In Association for Computational Linguistics (ACL), August 2024.