Leveraging Natural Language Understanding to Build Robust Question Answering Models

Geetanjali Rakshit
Computer Science Ph.D. Student
Abstract: Solving many of the current challenges in Question Answering (QA) requires more robust models with a deeper understanding of text. This work uses natural language itself as a meaning representation for downstream tasks such as question answering. As a step towards this goal, I propose a method that converts a sentence into natural question-answer pairs using its Abstract Meaning Representation (AMR), and I develop a tool, AMR Sourced Questions (ASQ), that implements it. Data generated by ASQ could be used to train neural models that convert a sentence into a question-answer meaning representation, or vice versa, in any domain. QA models have been shown to be sensitive to the wording of questions, with performance dropping when the questions are paraphrased. To address this performance gap, I propose to automatically create rich annotations on top of an existing QA dataset, in which the text containing the answer is paraphrased through a sequence of natural language steps that derive the answer. With this data, I propose to build neural models that go beyond surface-level processing of text and can handle paraphrases, with the aim of outperforming competitive baselines. As a third contribution, my current work demonstrates that state-of-the-art question answering models may be ranked incorrectly because of annotation errors in the evaluation sets. I propose a more rigorous automatic evaluation method in which predicted answers are evaluated against an exhaustive set of ground-truth answers, and I present a human analysis of incorrect answers predicted by various QA models.

Jeffrey Flanigan
Computer Science Ph.D.