German Multiple Fronting Corpus

The German Multiple Fronting Corpus was compiled within Project A6 of Sonderforschungsbereich 632 (Information Structure). It documents a relatively rare phenomenon in German (a verb-second language): the occupation of a sentence's prefield position by more than one constituent. The sentences were extracted between 2008 and 2011 from the publicly accessible part of the corpora hosted at the Institut für Deutsche Sprache (IDS), Mannheim. Due to copyright restrictions, the Multiple Fronting Corpus can only be accessed via the IDS (following the link below).

Instances of multiple fronting were sampled along with some preceding and following material (usually several sentences) in order to examine their function in discourse, in particular their information-structural properties. All sentences were post-processed with the TreeTagger (lemmatization and part-of-speech tagging using STTS). In the "target" sentences (those exhibiting multiple fronting), the individual constituents in the prefield were manually annotated for syntactic category and function and, in part, for focus, topic and givenness status. In addition, a topological annotation is available for these sentences (annotation details).

The corpus is searchable with the ANNIS tool (Project D1 / SFB 632). The web interface provides a tutorial on the query language (ANNIS Query Language). (Hint: Type field="vf1" in the query window to see all instances of multiple fronting in the corpus.)

For a discussion of the corpus data, see: Bildhauer, Felix (2011). Mehrfache Vorfeldbesetzung und Informationsstruktur. Eine Bestandsaufnahme. Deutsche Sprache 4/2011, 362–379.

Some discussion in English is available in: Bildhauer, Felix & Cook, Philippa (2010). German Multiple Fronting and Expected Topichood. In Stefan Müller (ed.), Proceedings of the HPSG 2010 Conference, Université Paris Diderot, France. Stanford: CSLI Publications, 68–79.

