Using Closed Captions to Train Activity Recognizers that Improve Video Retrieval (2009)
Sonal Gupta and Raymond Mooney
Recognizing activities in real-world videos is a difficult problem, exacerbated by background clutter, changes in camera angle and zoom, and rapid camera movements. Large corpora of labeled videos can be used to train automated activity recognition systems, but collecting them requires expensive human labor and time. This paper explores how the closed captions that naturally accompany many videos can act as weak supervision, allowing labeled data for activity recognition to be collected automatically. We show that such an approach can improve activity retrieval in soccer videos. Our system requires no manual labeling of video clips and needs minimal human supervision. We also present a novel caption classifier that uses additional linguistic information to determine whether a specific comment refers to an ongoing activity. We demonstrate that combining linguistic analysis and automatically trained activity recognizers can significantly improve the precision of video retrieval.
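To illustrate the weak-supervision idea described above, the sketch below shows one way time-stamped closed captions could be used to assign noisy activity labels to video clips. This is not the authors' code: the activity vocabulary, keyword lists, and the Caption/Clip structures are hypothetical stand-ins, and the paper's actual caption classifier uses richer linguistic analysis to decide whether a comment refers to an ongoing activity.

```python
# Minimal sketch (assumed, not from the paper): weakly labeling video clips
# from overlapping closed captions, to bootstrap an activity recognizer.
from dataclasses import dataclass
from typing import Optional

# Hypothetical soccer activity classes and caption keywords.
ACTIVITY_KEYWORDS = {
    "kick": ["kicks", "shoots", "strikes"],
    "save": ["saves", "catches", "parries"],
    "dribble": ["dribbles", "runs with the ball"],
}

@dataclass
class Caption:
    start: float  # seconds
    end: float
    text: str

@dataclass
class Clip:
    start: float
    end: float
    path: str

def weak_label(clip: Clip, captions: list) -> Optional[str]:
    """Assign an activity label to a clip if a temporally overlapping caption
    mentions keywords for exactly one activity; otherwise return None."""
    hits = set()
    for cap in captions:
        overlaps = cap.start < clip.end and cap.end > clip.start
        if not overlaps:
            continue
        text = cap.text.lower()
        for activity, words in ACTIVITY_KEYWORDS.items():
            if any(w in text for w in words):
                hits.add(activity)
    # Ambiguous or unmatched clips are skipped rather than mislabeled.
    return hits.pop() if len(hits) == 1 else None

if __name__ == "__main__":
    caps = [Caption(12.0, 14.5, "He shoots from outside the box!")]
    clip = Clip(11.0, 15.0, "match1_clip_003.avi")
    print(weak_label(clip, caps))  # -> "kick"
```

The clips labeled this way would then serve as (noisy) training data for a standard supervised activity recognizer over visual features, with the caption classifier filtering out comments that do not describe an ongoing activity.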
View: PDF
Citation:
In Proceedings of the CVPR-09 Workshop on Visual and Contextual Learning from Annotated Images and Videos (VCL), Miami, FL, June 2009.
Bibtex:
@inproceedings{gupta:cvpr09vcl,
  title     = {Using Closed Captions to Train Activity Recognizers that Improve Video Retrieval},
  author    = {Sonal Gupta and Raymond Mooney},
  booktitle = {Proceedings of the CVPR-09 Workshop on Visual and Contextual Learning from Annotated Images and Videos (VCL)},
  month     = {June},
  address   = {Miami, FL},
  url       = {http://www.cs.utexas.edu/users/ai-lab?gupta:cvpr09vcl},
  year      = {2009}
}
People
Sonal Gupta
Masters Alumni
sonal [at] cs stanford edu
Raymond J. Mooney
Faculty
mooney [at] cs utexas edu
Areas of Interest
Language and Vision
Machine Learning
Labs
Machine Learning