UTCS Artificial Intelligence
Using Biomedical Literature Mining to Consolidate the Set of Known Human Protein-Protein Interactions (2005)
A. Ramani, E. Marcotte, R. Bunescu and Raymond J. Mooney
This paper presents the results of a large-scale effort to construct a comprehensive database of known human protein interactions by combining and linking known interactions from existing databases and then adding to them by automatically mining additional interactions from 750,000 Medline abstracts. The end result is a network of 31,609 interactions among 7,748 proteins. The text mining system first identifies protein names in the text using a trained Conditional Random Field (CRF) and then identifies interactions through a filtered co-citation analysis. We also report two new strategies for mining interactions by finding explicit statements of interactions in the text: one uses learned pattern-based rules, and the other uses a Support-Vector Machine with a string kernel. Using information in existing ontologies, the automatically extracted data is shown to be of equivalent accuracy to manually curated data sets.
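As a rough illustration of what a filtered co-citation analysis can look like, the sketch below counts how often two protein names appear in the same abstract and keeps only pairs whose co-citation count is unlikely under random co-occurrence. The abstract does not specify the filter the authors used, so the hypergeometric tail test, the thresholds `max_p` and `min_count`, and the function names are all assumptions made for this example. The input is assumed to be the output of a name tagger: one set of protein names per abstract.

```python
from collections import Counter
from itertools import combinations
from math import comb


def hypergeom_tail(k, n_a, n_b, n_total):
    """P(X >= k): chance that at least k of the n_b abstracts citing B
    also cite A, if the n_a abstracts citing A were spread at random."""
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    hits = sum(comb(n_a, i) * comb(n_total - n_a, n_b - i)
               for i in range(k, upper + 1))
    return hits / denom


def cocitation_pairs(abstracts, max_p=0.05, min_count=2):
    """Return {(a, b): (count, p_value)} for pairs passing the filter.

    abstracts: list of sets of protein names found in each abstract.
    max_p and min_count are illustrative thresholds, not published values.
    """
    n_total = len(abstracts)
    single = Counter()  # abstracts mentioning each protein
    pair = Counter()    # abstracts mentioning both proteins of a pair
    for names in abstracts:
        single.update(names)
        pair.update(combinations(sorted(names), 2))

    kept = {}
    for (a, b), k in pair.items():
        if k < min_count:
            continue  # too few co-citations to consider
        p = hypergeom_tail(k, single[a], single[b], n_total)
        if p <= max_p:
            kept[(a, b)] = (k, p)
    return kept
```

Calling `cocitation_pairs` on a list of tagged abstracts returns the surviving pairs with their co-citation counts and p-values; in a real pipeline the name sets would come from a tagger such as the CRF described above, and the names would first be normalized to database identifiers.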
Citation:
In Proceedings of the ISMB/ACL-05 Workshop of the BioLINK SIG: Linking Literature, Information and Knowledge for Biology, Detroit, MI, June 2005.
Bibtex:
@InProceedings{ramani:biolink05,
  title={Using Biomedical Literature Mining to Consolidate the Set of Known Human Protein-Protein Interactions},
  author={A. Ramani and E. Marcotte and R. Bunescu and Raymond J. Mooney},
  booktitle={Proceedings of the ISMB/ACL-05 Workshop of the BioLINK SIG: Linking Literature, Information and Knowledge for Biology},
  month={June},
  address={Detroit, MI},
  key={RMN, IE, BIODM},
  url="http://www.cs.utexas.edu/users/ai-lab?ramani:biolink05",
  year={2005}
}
People
Razvan Bunescu, Ph.D. Alumni, bunescu [at] ohio edu
Raymond J. Mooney, Faculty, mooney [at] cs utexas edu
Areas of Interest
Bioinformatics
Information Extraction
Machine Learning
Labs
Machine Learning