We introduce a new entity typing task: given a sentence with an entity mention, the goal is to predict a set of free-form phrases (e.g. skyscraper, songwriter, or criminal) that describe appropriate types for the target entity. Entity mentions are not limited to named entities; they also include pronouns and common noun expressions. Here are some examples from our dataset:
Example (with entity bolded) | Crowdsourced type labels |
---|---|
Young ‘s mother says , when she sees her son , she plans on hugging him for a good solid half hour. | person, son, relative, child, man, male |
`` There is a wealth of good news in this report , and I ‘m particularly encouraged by the progress we are making against AIDS , ‘’ HHS Secretary Donna Shalala said in a statement . | group, organization, government, hospital, administration, socialist |
In contrast to the male way of thinking , in which priority has always been given to considerations of political and economic power, Annette Lu has emphasized “soft national power.” | person, officeholder, president, official, leader, incumbent |
For starters , it ‘s not an ordinary sun but a Cepheid variable - a giant , pulsating star shining with the light of at least a thousand suns . | object, celestial body, sun, star |
Our crowdsourced evaluation sets are much more diverse and fine-grained than existing benchmarks: 429 types are needed to cover 80% of the data, whereas FIGER requires only the top 7 types and OntoNotes only 4.
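
The coverage figure above can be illustrated with a small sketch. It assumes that "coverage" means the share of all label occurrences accounted for by the most frequent types. The function name `types_needed_for_coverage` and the toy label sets are hypothetical and are not taken from the released code.

```python
from collections import Counter


def types_needed_for_coverage(label_sets, target=0.8):
    """Count how many of the most frequent types are needed so that they
    account for at least `target` of all label occurrences.

    Assumption: coverage is measured over label occurrences, not examples.
    """
    counts = Counter(label for labels in label_sets for label in labels)
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    # Walk through types from most to least frequent, accumulating coverage.
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered / total >= target:
            return rank
    return len(counts)


# Hypothetical toy annotations, loosely modelled on the examples above.
toy_labels = [
    ["person", "son", "child"],
    ["person", "leader"],
    ["person", "official"],
    ["object", "star"],
]
needed = types_needed_for_coverage(toy_labels, target=0.5)
```

A dataset with a long tail of rare types needs many types to reach the same target, which is the sense in which the evaluation sets are more diverse.
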
This formulation allows us to use a new type of distant supervision at large scale: head words, which indicate the type of the noun phrases they appear in. This supervision is contextualized, unlike prior supervision from entity linking, which lists all types of the entity. Here are examples of the distant supervision used in our task:
Example (with entity bolded) | Distant supervision labels | Source |
---|---|---|
Alexis Kaniaris, CEO of the organizing company Europartners, explained, speaking in a radio program in national radio station NET. | radio, station, radio station | Headword |
Toyota recalled more than 8 million vehicles globally over sticky pedals that can become entrapped in floor mats. | manufacturer | Entity Linking to Wikipedia |
Iced Earth’s musical style is influenced by many traditional heavy metal groups such as Black Sabbath. | person, artist, actor, author, musician | Entity Linking to Knowledge Base |
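
To make the head-word idea concrete, here is a deliberately crude sketch. This page does not describe how head words are actually extracted, so the heuristic below is only an assumption for illustration: it takes the last token before the first preposition, relative pronoun, or comma. A real pipeline would more likely rely on a syntactic parser. The names `STOP_TOKENS` and `naive_head_word` are hypothetical.

```python
# Tokens that, under this heuristic, end the core of a noun phrase.
STOP_TOKENS = {
    "of", "in", "on", "at", "for", "with", "by", "from",
    "that", "which", "who", ",",
}


def naive_head_word(noun_phrase):
    """Return a rough guess at the head word of an English noun phrase.

    Heuristic (an assumption, not the authors' method): the head is the
    last token before the first preposition, relative pronoun, or comma.
    Returns None if no candidate token is found.
    """
    # Separate commas so they become their own tokens.
    tokens = noun_phrase.lower().replace(",", " , ").split()
    head = None
    for token in tokens:
        if token in STOP_TOKENS:
            break
        head = token
    return head


# The extracted head word then serves as a noisy type label for the mention.
label = naive_head_word("the tall skyscraper in Chicago")
```

Because the label comes from the phrase itself, it reflects how the entity is described in this particular sentence, which is what makes the supervision contextualized.
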
We present a neural model that can predict ultra-fine types and is trained using a multitask objective that pools our new head-word supervision with prior supervision from entity linking. Experimental results demonstrate that our model is effective in predicting entity types at varying granularity; it achieves state-of-the-art performance on an existing OntoNotes fine-grained entity typing benchmark, and sets baselines for our newly introduced datasets.
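
Since the model predicts a set of types per mention, predictions are naturally compared with gold labels as sets. The sketch below shows one common way to do this, macro-averaged precision, recall, and F1. Whether the released evaluation script computes exactly these quantities is an assumption; the function name `macro_prf` and the toy data are hypothetical.

```python
def macro_prf(gold_sets, pred_sets):
    """Macro-averaged precision, recall, and F1 over examples.

    Assumptions: precision is averaged over examples with at least one
    predicted type, recall over examples with at least one gold type, and
    F1 is the harmonic mean of the two averages.
    """
    precisions, recalls = [], []
    for gold, pred in zip(gold_sets, pred_sets):
        gold, pred = set(gold), set(pred)
        overlap = len(gold & pred)
        if pred:
            precisions.append(overlap / len(pred))
        if gold:
            recalls.append(overlap / len(gold))
    p = sum(precisions) / len(precisions) if precisions else 0.0
    r = sum(recalls) / len(recalls) if recalls else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Hypothetical gold and predicted type sets for two mentions.
gold = [{"person", "son"}, {"object", "star"}]
pred = [{"person"}, {"object", "planet"}]
precision, recall, f1 = macro_prf(gold, pred)
```

Set-based scoring gives partial credit when a prediction recovers only some of the gold types, which matters when each mention can carry many labels of different granularity.
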
Pretrained model / outputs (212MB)
Wikipedia Entity -> Type mapping (414MB)
Code (GitHub repo) (the evaluation script is inside the repo as scorer.py)
@InProceedings{Choi:2018:ACL,
  author = {Choi, Eunsol and Levy, Omer and Choi, Yejin and Zettlemoyer, Luke},
  title = {Ultra-Fine Entity Typing},
  booktitle = {Proceedings of the ACL},
  year = {2018},
  publisher = {Association for Computational Linguistics}
}