UTCS Artificial Intelligence
Identifying Phrasal Verbs Using Many Bilingual Corpora (2013)
Karl Pichotta and John DeNero
We address the problem of identifying multiword expressions in a language, focusing on English phrasal verbs. Our polyglot ranking approach integrates frequency statistics from translated corpora in 50 different languages. Our experimental evaluation demonstrates that combining statistical evidence from many parallel corpora using a novel ranking-oriented boosting algorithm produces a comprehensive set of English phrasal verbs, achieving performance comparable to a human-curated set.
View:
PDF
Citation:
In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), pp. 636–646, Seattle, WA, October 2013.
Bibtex:
@inproceedings{pichotta:emnlp13,
  title={Identifying Phrasal Verbs Using Many Bilingual Corpora},
  author={Karl Pichotta and John DeNero},
  booktitle={Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013)},
  month={October},
  address={Seattle, WA},
  pages={636--646},
  url="http://www.cs.utexas.edu/users/ai-labpub-view.php?PubID=127402",
  year={2013}
}
Presentation:
Poster
Areas of Interest
Lexical Semantics
Natural Language Processing
Labs
Machine Learning