Mats Rooth
Professor
Linguistics and Computing and Information Science


Research areas: computational linguistics, semantics
more specifically: ellipsis, intonation, lexicon induction, parse forest algorithms, statistical parsing

Schedule for introductory courses:

Cornell NLP Group
PhD Concentration in Computational Linguistics

Prepublication versions of papers (Some pdfs of older papers are converted from ps. The ps version may print better.)

Notions of Focus Anaphoricity (to appear). In Féry, C., G. Fanselow & M. Krifka, eds., The Notions of Information Structure. In Interdisciplinary Studies on Information Structure, vol. 6. SFB 632, University of Potsdam. (pdf)

Scope Disambiguation by Ellipsis and Focus without Scope Economy. 15th Amsterdam Colloquium, 2005. (pdf)

Topic Accents on Quantifiers. Appears in Reference and Quantification: The Partee Effect, Greg Carlson and Jeffrey Pelletier (eds.), Stanford: CSLI Publications, 2005. (ps)

Parse Forest Computation of Expected Governors. With Helmut Schmid. In 39th Annual Meeting of the ACL, 2001, Toulouse. (.ps/ .ps.gz)

Inducing a Semantically Annotated Lexicon via EM-Based Clustering. With Stefan Riezler, Detlef Prescher, Glenn Carroll, and Franz Beil. In 37th Annual Meeting of the ACL, 1999, Maryland. (.ps/ .ps.gz)

Inside-Outside Estimation of a Lexicalized PCFG for German. With Franz Beil, Glenn Carroll, Detlef Prescher, and Stefan Riezler. In 37th Annual Meeting of the ACL, 1999, Maryland. (.ps/ .ps.gz)

EM-Based Clustering for NLP Applications. With Stefan Riezler, Detlef Prescher, Glenn Carroll, and Franz Beil. In Inducing Lexicons with the EM Algorithm, AIMS Report 4(3), 1998, IMS, Universität Stuttgart. 97-124. (.ps/.ps.gz)

Inside-Outside Estimation of a Lexicalized PCFG for German - GOLD. With Franz Beil, Glenn Carroll, Detlef Prescher, and Stefan Riezler. In Inducing Lexicons with the EM Algorithm, AIMS Report 4(3), 1998, IMS, Universität Stuttgart. 75-96. (.ps/.ps.gz)

Valence induction with a head-lexicalized PCFG. With Glenn Carroll. In Proceedings of the 3rd Conference on Empirical Methods in Natural Language Processing (EMNLP 3), 1998, Granada. Summary: inside-outside estimation of the parameters of a hand-written lexicalized probabilistic context-free grammar. A longer version of the paper, with details on, e.g., smoothing.

Two-dimensional clusters in grammatical relations. Paper presented at IJCAI Lexicon workshop, Stanford, 1995. Summary: EM induction of a latent class model of word pairs standing in a grammatical relation such as verb/object. A substantial model.

On the interface principles for intonational focus, final version appears in proceedings of SALT VI (pdf, ps, ps.gz). Summary: an argument that metrical prominence in the postnuclear tail is semantically significant (may mark focus). Some speech data files from the paper, in Waves format: only_manny_s.d, only_name_so2.d, only_name_so1.d.

A theory of focus interpretation, final version appears in Natural Language Semantics 1, 75-116, 1992.

Ellipsis redundancy and reduction redundancy, this version appears in an SFB 340 report.

Indefinites, adverbs of quantification, and focus semantics, final version appears in The Generic Book, F.J. Pelletier and G. Carlson, eds.

Focus, a survey paper, final version appears in Handbook of Contemporary Semantic Theory.

Association with focus or association with presupposition?

Structural ambiguity and lexical relations, with Donald Hindle. Final version appears in Computational Linguistics 19, 1993.

Epistemic NP Modifiers (pdf / ps / ps.gz). With Dorit Abusch. Final version appears in proceedings of SALT VII. (Mac/Word)


Current Presentations

Invited commentary given with Dorit Abusch on Richmond H. Thomason, Matthew Stone, and David DeVault, "Enlightened Update: A Computational Architecture for Presupposition and Other Pragmatic Phenomena." OSU Accommodation Workshop, October 13, 2006. (pdf)

S-Focus and Relativized Stress F. International Conference on Information Structure. SFB 632, Universität Potsdam and Humboldt-Universität zu Berlin. June 6-8, 2006. The same talk was given at Tübingen and Frankfurt in June 2006. (pdf, ps)

Scope Disambiguation by Ellipsis and Focus without Scope Economy. 15th Amsterdam Colloquium, December 2005. (ps, pdf)


Software

English 97 is a Lopar lexicalized statistical language model trained on circa 50 million words of Wall Street Journal data. The grammar is as in Carroll and Rooth 1998, except that the chunk trigram robustness rules have been replaced with bigram rules, and the lexicon is lemmatized.

BankBaseline is a lexicalized Lopar model based on Penn Treebank II sections 0-15. Andrew Jonas created the lemma mapping in the lexicon.

PF Linear Expectation is an implementation in Java of a generalization of the governor algorithm of Schmid and Rooth (2001). Expected values for markup functions of a certain linear form can be computed; governor and depth-of-embedding functions are included.

The software may be used for education and scientific research.



This idea you linguists have of using trees is a very good idea. --Martin Kay
Determiners are verbs. --James McCawley
I think I am a verb. --R. Buckminster Fuller
Use the compiler, Luke. Let it catch your errors. -- kwc