Publications: 2025
- MET-Bench: Multimodal Entity Tracking for Evaluating the Limitations of Vision-Language and Reasoning Models
Vanya Cohen and Raymond Mooney
Preprint, January 2025.
- Temporally Streaming Audio-Visual Synchronization for Real-World Videos
Jordan Voas, Wei-Cheng Tseng, Layne Berry, Xixi Hu, Puyuan Peng, James Stuedemann, and David Harwath
In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), February 2025.