UTCS Artificial Intelligence
CONTRADOC: Understanding Self-Contradictions in Documents with Large Language Models (2024)
Jierui Li, Vipul Raheja, Dhruv Kumar
In recent times, large language models (LLMs) have shown impressive performance on various document-level tasks such as document classification, summarization, and question-answering. However, research on understanding their capabilities on the task of self-contradictions in long documents has been very limited. In this work, we introduce CONTRADOC, the first human-annotated dataset to study self-contradictions in long documents across multiple domains, varying document lengths, self-contradiction types, and appearance scope. We then analyze the current capabilities of four state-of-the-art open-source and commercially available LLMs: GPT3.5, GPT4, PaLM2, and LLaMAv2 on this dataset. While GPT4 performs the best and can outperform humans on this task, we find that it is still unreliable and struggles with self-contradictions that require more nuance and context. We release the dataset and all the code associated with the experiments.
View:
PDF, arXiv
Citation:
North American Chapter of the Association for Computational Linguistics (NAACL) (2024).
Bibtex:
@inproceedings{li:naacl24,
  title={CONTRADOC: Understanding Self-Contradictions in Documents with Large Language Models},
  author={Jierui Li and Vipul Raheja and Dhruv Kumar},
  booktitle={North American Chapter of the Association for Computational Linguistics (NAACL)},
  month={June},
  url="http://www.cs.utexas.edu/users/ai-labpub-view.php?PubID=128059",
  year={2024}
}
Presentation:
Poster
People
Jierui Li
Ph.D. Student
jierui [at] cs utexas edu
Areas of Interest
Deep Learning
Natural Language Processing
Labs
Machine Learning