The York Poetry Corpus: Syntactic Annotations, File Formats, and Search Tools

The syntactic annotations of the York Poetry Corpus allow users to pose and answer questions about word order, constituent order, abstract structure, and the syntactic, morphological, and lexical characteristics of the texts in the corpus. The annotations are general-purpose and as theory-neutral as possible while still incorporating the insights of modern linguistic theory, so they can be used by scholars with widely varying research interests. It must be emphasized, however, that the annotations should in no way be regarded as the implementation of a structural analysis: the annotation schemes were developed primarily as a tool for investigating the structure of earlier stages of English.

The syntactic annotations mark constituents, both clausal and non-clausal, by labelled brackets, with some relations marked by empty categories. The structure assigned to a sentence by the labelled bracketing can be quite complex, but it is not a complete syntactic analysis: the function of the bracketing is not to assign a structure to Old English sentences but rather to facilitate searches.

Below is an example of an annotated sentence from the York Poetry Corpus:

( (IP-MAT (NP-NOM (PRO^N He))
          (NP-ACC (N^A beot))
          (NEG ne)
          (VBDI aleh)
          (. ,))
  (ID cobeowul,5.80.62))

The sentence begins with the subject 'He', followed in turn by the accusative object 'beot', the negative particle 'ne', and the past-tense indicative verb 'aleh'; the final comma marks the end of the sentence. The annotation is described in detail in the manuals accompanying the corpus.
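
To make the labelled bracketing concrete, the following is a minimal Python sketch that reads the example above into nested lists and recovers the order of constituents and words. It is an illustration only and is not part of CorpusSearch or of the corpus distribution: the names tokenize, parse, and leaves are hypothetical helpers, and the sketch assumes that every token is either a bracket or a whitespace-delimited string, which holds for this example but may not cover every convention described in the manuals.

```python
import re

# The annotated sentence from the example above, copied verbatim.
SAMPLE = """( (IP-MAT (NP-NOM (PRO^N He))
          (NP-ACC (N^A beot))
          (NEG ne)
          (VBDI aleh)
          (. ,))
  (ID cobeowul,5.80.62))"""


def tokenize(text):
    """Split into '(' , ')' and whitespace-delimited strings (assumed format)."""
    return re.findall(r"\(|\)|[^\s()]+", text)


def parse(tokens):
    """Parse one bracketed unit into nested lists of the form [label, child, ...]."""
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        items = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())
            else:
                items.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return items

    return node()


def leaves(tree):
    """Yield (label, word) pairs for terminal nodes, in text order."""
    if len(tree) == 2 and all(isinstance(x, str) for x in tree):
        yield tuple(tree)
    else:
        for child in tree:
            if isinstance(child, list):
                yield from leaves(child)


tree = parse(tokenize(SAMPLE))
clause, ident = tree[0], tree[1]

# Labels of the clause's immediate constituents, in surface order.
constituents = [child[0] for child in clause[1:]]
words = list(leaves(clause))

print(clause[0], constituents)
print(words)
print(ident[1])
```

Here constituents holds the labels of the immediate daughters of the clause in surface order, so a question such as "does the subject precede the negative particle?" reduces to comparing positions in that list. For real research queries over the whole corpus, CorpusSearch remains the intended tool.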

The format of the texts in the York Poetry Corpus is suitable for searching with CorpusSearch, a powerful search engine developed by Beth Randall for the second edition of the Penn-Helsinki Parsed Corpus of Middle English (PPCME2). Further information about PPCME2 and CorpusSearch, including order forms, is available separately.