Department of Computer Science

Machine Learning Research Group

University of Texas at Austin Artificial Intelligence Lab

Publications: 2021

  1. TellMeWhy: A Dataset for Answering Why-Questions in Narratives
    [Details] [PDF] [Slides (PDF)] [Video]
    Yash Kumar Lal, Nathanael Chambers, Raymond Mooney, Niranjan Balasubramanian
    In Findings of ACL 2021, August 2021.
    Answering questions about why characters perform certain actions is central to understanding and reasoning about narratives. Despite recent progress in QA, it is not clear if existing models have the ability to answer “why” questions that may require common-sense knowledge external to the input narrative. In this work, we introduce TellMeWhy, a new crowd-sourced dataset that consists of more than 30k questions and free-form answers concerning why characters in short narratives perform the actions described. For a third of this dataset, the answers are not present within the narrative. Given the limitations of automated evaluation for this task, we also present a systematized human evaluation interface for this dataset. Our evaluation of state-of-the-art models shows that they are far below human performance on answering such questions. They perform especially poorly on questions whose answers are external to the narrative, thus providing a challenge for future QA and narrative understanding research.
    ML ID: 406
  2. Zero-shot Task Adaptation using Natural Language
    [Details] [PDF]
    Prasoon Goyal, Raymond J. Mooney, Scott Niekum
    In arXiv, June 2021.
    Imitation learning and instruction-following are two common approaches to communicate a user’s intent to a learning agent. However, as the complexity of tasks grows, it may be beneficial to use both demonstrations and language to communicate with an agent. In this work, we propose a novel setting where, given a demonstration for a task (the source task), and a natural language description of the differences between the demonstrated task and a related but different task (the target task), our goal is to train an agent to complete the target task in a zero-shot setting, that is, without any demonstrations for the target task. To this end, we introduce Language-Aided Reward and Value Adaptation (LARVA) which, given a source demonstration and a linguistic description of how the target task differs, learns to output either a reward or value function that accurately reflects the target task. Our experiments show that on a diverse set of adaptations, our approach is able to complete more than 95% of target tasks when using template-based descriptions, and more than 70% when using free-form natural language.
    ML ID: 402
  3. Facilitating Software Evolution through Natural Language Comments and Dialogue
    [Details] [PDF] [Slides (PDF)] [Video]
    Sheena Panthaplackel
    October 2021. Ph.D. Proposal.
    Software projects are continually evolving, as developers incorporate changes to refactor code, support new functionality, and fix bugs. To uphold software quality amidst constant changes and also facilitate the prompt implementation of critical changes, it is desirable to have automated tools for guiding developers in making methodical software changes. We explore tasks and data and design machine learning approaches which leverage natural language to serve this purpose. When developers make code changes, they sometimes fail to update the accompanying natural language comments documenting various aspects of the code, which can lead to confusion and vulnerability to bugs. We present our completed work on alerting developers of inconsistent comments upon code changes and suggesting updates by learning to correlate comments and code. When a bug is reported, developers engage in a dialogue to collaboratively understand it and ultimately resolve it. While the solution is likely formulated within the discussion, it is often buried in a large amount of text, making it difficult to comprehend, which delays its implementation through the necessary repository changes. To guide developers in more easily absorbing information relevant towards making these changes and consequently expedite bug resolution, we investigate generating a concise natural language description of the solution by synthesizing relevant content as it emerges in the discussion. In completed work, we benchmark models for generating solution descriptions and design a classifier for determining when sufficient context for generating an informative description becomes available. We also investigate a pipelined approach for real-time generation, entailing separate classification and generation models. For future work, we propose an improved classifier and also a more intricate system that is jointly trained on generation and classification. 
    Next, we intend to study a system that can interactively generate natural language descriptions that can drive code changes. Finally, we plan to investigate how we can leverage the discussion context to also suggest concrete code changes for bug resolution.
    ML ID: 399
  4. Using Natural Language to Aid Task Specification in Sequential Decision Making Problems
    [Details] [PDF] [Slides (PDF)] [Video]
    Prasoon Goyal
    October 2021. Ph.D. Proposal.
    Building intelligent agents that can help humans accomplish everyday tasks, such as a personal robot at home or a robot in a work environment, is a long-standing goal of artificial intelligence. One of the requirements for such general-purpose agents is the ability to teach them new tasks or skills relatively easily. Common approaches to teaching agents new skills include reinforcement learning (RL) and imitation learning (IL). However, specifying the task to the learning agent, i.e., designing effective reward functions for reinforcement learning and providing demonstrations for imitation learning, is often cumbersome and time-consuming. We aim to use natural language as an auxiliary signal to aid task specification, which reduces the burden on the end user. To make reward design easier, we propose a novel framework that is used to generate language-based rewards in addition to the extrinsic rewards from the environment for faster policy training using RL. To ameliorate the problem of providing demonstrations, we propose a new setting that enables an agent to learn a new task without demonstrations in an IL setting, given a demonstration from a related task and a natural language description of the difference between the desired task and the demonstrated task. The primary contributions of this dissertation will be new frameworks that enable incorporating natural language in RL and IL, which would enable non-expert users to specify new tasks to intelligent agents more conveniently.
    ML ID: 398
  5. Incorporating Textual Resources to Improve Visual Question Answering
    [Details] [PDF] [Slides (PDF)]
    Jialin Wu
    September 2021. Ph.D. Proposal.
    Recently, visual question answering (VQA) has emerged as a challenging multi-modal task and gained popularity. The goal is to answer questions that query information associated with the visual content in the given image. Since the required information could be from both inside and outside the image, common types of visual features, such as object and attribute detection, fail to provide enough material for answering the questions. Textual resources, such as captions, explanations, and encyclopedia articles, can help VQA systems comprehensively understand the image, reason following the right path, and access external facts. Specifically, they provide concise descriptions of the image, precise reasons for the correct answer, and factual knowledge beyond the image. We present completed work on generating image captions that are targeted to help answer a specific visual question. We introduced an approach that generates textual explanations and used these explanations to determine which answer is most supported. We used explanations to recognize the critical objects for solving the visual question and trained the VQA systems to be influenced most by these objects. We also explored using textual resources to provide external knowledge beyond the visual content, which is indispensable for a recent trend towards knowledge-based VQA. We further propose to break down visual questions such that each segment, which carries a single piece of semantic content in the question, can be associated with its specific knowledge. This separation aims to help the VQA system understand the question structure to satisfy the need for linking different aspects of the question to different types of information within and beyond the image.
    ML ID: 397
  6. Supervised Attention from Natural Language Feedback for Reinforcement Learning
    [Details] [PDF]
    Clara Cecilia Cannon
    Masters Thesis, Department of Computer Science, The University of Texas at Austin, May 2021.
    In this paper, we introduce a new approach to Reinforcement Learning (RL) called “supervised attention” from human feedback which focuses on novel task learning from human interaction on relevant features of the environment, which we hypothesize will allow for effective learning from limited training data. We wanted to answer the following question: does the addition of language to existing RL frameworks improve agent learning? We wanted to show that language helps the agent pick out the most important features in its perception. We tested many methods for implementing this concept and settled on incorporating language feedback via a template matching scheme. While more sophisticated techniques, such as attention, would be better at grounding the language, we discovered this task is non-trivial for our choice of environment. Using deep learning methods, we translate human linguistic narration to a saliency map over the perceptual field. This saliency map is used to inform a deep reinforcement learning system which features in the visual observation are most important relative to its position in the environment and to optimize task learning. We establish a baseline model using deep TAMER and test our framework on Montezuma’s Revenge, the most difficult game in the Atari Arcade suite. However, our final framework demonstrates the incompatibility of language in the Atari suite in a supervised attention setting. The ultimate result showed that as long as the agent’s position in the observation was clear, the model ignores surrounding contextual information, regardless of potential benefit. We conclude that the Atari network of games is unsuitable for grounding natural language in high-dimensional state spaces. Further development of sophisticated simulations is required.
    ML ID: 396
  7. Copy That! Editing Sequences by Copying Spans
    [Details] [PDF] [Slides (PPT)] [Slides (PDF)] [Poster]
    Sheena Panthaplackel, Miltiadis Allamanis, Marc Brockschmidt
    In The AAAI Conference on Artificial Intelligence (AAAI), February 2021.
    Neural sequence-to-sequence models are finding increasing use in editing of documents, for example in correcting a text document or repairing source code. In this paper, we argue that common seq2seq models (with a facility to copy single tokens) are not a natural fit for such tasks, as they have to explicitly copy each unchanged token. We present an extension of seq2seq models capable of copying entire spans of the input to the output in one step, greatly reducing the number of decisions required during inference. This extension means that there are now many ways of generating the same output, which we handle by deriving a new objective for training and a variation of beam search for inference that explicitly handles this problem. In our experiments on a range of editing tasks of natural language and source code, we show that our new model consistently outperforms simpler baselines.
    ML ID: 393
  8. A Recap of Early Work on Theory and Knowledge Refinement
    [Details] [PDF] [Slides (PPT)]
    Raymond J. Mooney, Jude W. Shavlik
    In AAAI Spring Symposium on Combining Machine Learning and Knowledge Engineering, March 2021.
    A variety of research on theory and knowledge refinement that integrated knowledge engineering and machine learning was conducted in the 1990s. This work developed a variety of techniques for taking engineered knowledge in the form of propositional or first-order logical rule bases and revising them to fit empirical data using symbolic, probabilistic, and/or neural-network learning methods. We review this work to provide historical context for expanding these techniques to integrate modern knowledge engineering and machine learning methods.
    ML ID: 392
  9. Deep Just-In-Time Inconsistency Detection Between Comments and Source Code
    [Details] [PDF] [Slides (PDF)] [Poster] [Video]
    Sheena Panthaplackel, Junyi Jessy Li, Milos Gligoric, Raymond J. Mooney
    In The AAAI Conference on Artificial Intelligence (AAAI), February 2021.
    Natural language comments convey key aspects of source code such as implementation, usage, and pre- and post-conditions. Failure to update comments accordingly when the corresponding code is modified introduces inconsistencies, which are known to lead to confusion and software bugs. In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code, in order to catch potential inconsistencies just-in-time, i.e., before they are committed to a version control system. To achieve this, we develop a deep-learning approach that learns to correlate a comment with code changes. By evaluating on a large corpus of comment/code pairs spanning various comment types, we show that our model outperforms multiple baselines by significant margins. For extrinsic evaluation, we show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system that can both detect and resolve inconsistent comments based on code changes.
    ML ID: 391
  10. Improving VQA and its Explanations by Comparing Competing Explanations
    [Details] [PDF] [Slides (PDF)]
    Jialin Wu, Liyan Chen, Raymond J. Mooney
    In The AAAI Conference on Artificial Intelligence (AAAI), Explainable Agency in Artificial Intelligence Workshop, February 2021.
    Most recent state-of-the-art Visual Question Answering (VQA) systems are opaque black boxes that are only trained to fit the answer distribution given the question and visual content. As a result, these systems frequently take shortcuts, focusing on simple visual concepts or question priors. This phenomenon becomes more problematic as the questions become more complex, requiring more reasoning and commonsense knowledge. To address this issue, we present a novel framework that uses explanations for competing answers to help VQA systems select the correct answer. By training on human textual explanations, our framework builds better representations for the questions and visual content, and then reweights confidences in the answer candidates using either generated or retrieved explanations from the training set. We evaluate our framework on the VQA-X dataset, which has more difficult questions with human explanations, achieving new state-of-the-art results on both VQA and its explanations.
    ML ID: 387
  11. Dialog Policy Learning for Joint Clarification and Active Learning Queries
    [Details] [PDF] [Slides (PDF)] [Poster] [Video]
    Aishwarya Padmakumar, Raymond J. Mooney
    In The AAAI Conference on Artificial Intelligence (AAAI), February 2021.
    Intelligent systems need to be able to recover from mistakes, resolve uncertainty, and adapt to novel concepts not seen during training. Dialog interaction can enable this by the use of clarifications for correction and resolving uncertainty, and active learning queries to learn new concepts encountered during operation. Prior work on dialog systems has focused exclusively either on learning how to perform clarification/information seeking or on performing active learning. In this work, we train a hierarchical dialog policy to jointly perform both clarification and active learning in the context of an interactive language-based image retrieval task motivated by an on-line shopping application, and demonstrate that jointly learning dialog policies for clarification and active learning is more effective than the use of static dialog policies for one or both of these functions.
    ML ID: 385