Richard Bergmair's Publications

Bergmair, Richard. 2010. “Monte Carlo Semantics: Robust Inference and Logical Pattern Processing with Natural Language Text.” Ph.D. thesis, Cambridge, England: University of Cambridge.

This thesis develops theory and computational techniques that allow a computer to analyze short pieces of text (e.g. “Socrates is a man and every man is mortal.”) and, on the basis of that analysis, to decide yes/no questions about the text (“Is Socrates mortal?”). More specifically, the problem is treated as a logical inference task: the computer must decide whether a logical consequence relation “therefore” holds between two pieces of text (“Socrates is a man and every man is mortal, therefore Socrates is mortal.”).

This problem is a pervasive theme in logic and semantics, but over the last five years it has also received a wave of renewed attention in computational linguistics, sparked by the Recognizing Textual Entailment (RTE) challenge. A critical reevaluation of this line of work is presented here, which demonstrates several problems concerning the empirical methodology used at RTE and the results derived from it. This thesis is thus more theory-driven, but it is nevertheless inspired by RTE in that it addresses problems raised by RTE which have not previously received sufficient attention from a theoretical viewpoint, such as the problem of robustness.

With this goal in mind, two of the results on Natural Language Reasoning (NLR) established here become particularly important: (1) Assuming the syllogism as a benchmark fragment of NLR, the model theory which underlies NLR is not necessarily a two-valued logic, but it can be the many-valued Łukasiewicz logic. (2) Despite the fact that the syllogism is a logical language of less expressive power than natural language as a whole, a good approximation to NLR can still be obtained by using the method outlined here for rewriting natural language text into syllogistic premises.
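
For readers unfamiliar with many-valued logic, the standard Łukasiewicz connectives on truth values in the interval [0, 1] can be sketched as follows. This is a minimal illustration of the textbook definitions only; the function names are hypothetical and are not taken from the thesis, and the sketch says nothing about how the thesis itself applies the logic to the syllogism.

```python
# Textbook Lukasiewicz connectives on truth values in [0, 1].
# Function names are illustrative assumptions, not the thesis's notation.

def luk_not(x: float) -> float:
    """Negation: 1 - x."""
    return 1.0 - x

def luk_and(x: float, y: float) -> float:
    """Strong conjunction: max(0, x + y - 1)."""
    return max(0.0, x + y - 1.0)

def luk_or(x: float, y: float) -> float:
    """Strong disjunction: min(1, x + y)."""
    return min(1.0, x + y)

def luk_implies(x: float, y: float) -> float:
    """Implication: min(1, 1 - x + y)."""
    return min(1.0, 1.0 - x + y)

# Restricted to the values 0 and 1, the connectives agree with classical logic.
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        assert luk_and(a, b) == float(bool(a) and bool(b))
        assert luk_or(a, b) == float(bool(a) or bool(b))
        assert luk_implies(a, b) == float((not bool(a)) or bool(b))
```

On intermediate values the connectives behave gradually: an implication is fully true whenever the consequent is at least as true as the antecedent, and otherwise loses exactly as much truth as the consequent falls short.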

These two properties of NLR enable the approach to robust inference and logical pattern processing called Monte Carlo semantics, which, in turn, demonstrates that a single logically based theory can account for the semantic informativity of deep techniques using theorem proving and for the robustness of bag-of-words shallow inference.
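
The general idea of judging an inference by sampling models, rather than by proving a theorem, can be illustrated with a toy sketch. The code below is not the algorithm of the thesis: it assumes a two-valued setting, a fixed three-element domain, only the predicates "man" and "mortal", and hypothetical helper names (`sample_model`, `every`, `estimate_entailment`). It estimates how often a conclusion holds among randomly drawn models that satisfy the premises.

```python
import random

def sample_model(rng, domain_size=3):
    """Draw a random model: each individual is independently a man and/or mortal."""
    return [{"man": rng.random() < 0.5, "mortal": rng.random() < 0.5}
            for _ in range(domain_size)]

def every(model, p, q):
    """True if every individual with property p also has property q."""
    return all(x[q] for x in model if x[p])

def estimate_entailment(premise, conclusion, trials=10_000, seed=0):
    """Fraction of sampled models satisfying the premise in which the conclusion holds."""
    rng = random.Random(seed)
    satisfied = held = 0
    for _ in range(trials):
        model = sample_model(rng)
        if premise(model):
            satisfied += 1
            if conclusion(model):
                held += 1
    return held / satisfied if satisfied else None

# Individual 0 plays the role of Socrates (an assumption of this sketch).
valid = estimate_entailment(
    lambda m: m[0]["man"] and every(m, "man", "mortal"),
    lambda m: m[0]["mortal"])
invalid = estimate_entailment(
    lambda m: m[0]["mortal"] and every(m, "man", "mortal"),
    lambda m: m[0]["man"])
```

Here `valid` comes out as exactly 1.0, since no sampled model can satisfy the premises while falsifying the conclusion, whereas `invalid` lies strictly between 0 and 1 because the reversed argument fails in some models. An estimate of this kind degrades gradually instead of failing outright, which gives a rough intuition for the robustness the abstract refers to.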