The York-Helsinki Parsed Corpus of Old English Poetry

The York-Helsinki Parsed Corpus of Old English Poetry (henceforth the York Poetry Corpus) is a selection of poetic texts from the Old English Section of the Helsinki Corpus of English Texts (henceforth the Helsinki Corpus), syntactically and morphologically annotated to facilitate searches on lexical items and syntactic structure. It is intended for students and scholars of the history of the English language. The York Poetry Corpus contains 71,490 words of Old English text; the samples from the longer texts are 4,000 to 17,000 words in length. The texts included in the corpus represent a range of authors and dates of composition. The size of the corpus is approximately 2.5 megabytes.

The York Poetry Corpus was funded by ESRC grant R000222434, whose support is gratefully acknowledged. The annotation scheme was developed by Susan Pintzuk, Ann Taylor, Anthony Warner, Leendert Plug, and Frank Beths, and implemented by Leendert Plug. The scheme was based on the one developed at the University of Pennsylvania for the second edition of the Penn-Helsinki Parsed Corpus of Middle English, and it is the same as the one used for the York-Helsinki Parsed Corpus of Old English (under construction at the University of York). Our intent was to make the syntactic annotation of the three corpora as similar as possible, while taking into account the syntactic and morphological differences between Old and Middle English and between poetry and prose.

The syntactic annotations of the York Poetry Corpus enable users to pose and answer questions about word order, constituent order, abstract structure, and the syntactic, morphological, and lexical characteristics of the texts in the corpus.

The York Poetry Corpus is available without fee for educational and research purposes, but it is not in the public domain. Copyright to the Helsinki Corpus texts in their computerized form is retained by the Helsinki Corpus (© 1991); copyright to the annotated files is retained by Susan Pintzuk and Leendert Plug (© 2001); and copyright to the York Poetry Corpus Manual is retained by Ann Taylor and Leendert Plug (© 2001). Some of the original texts are also under copyright and are distributed by permission granted to the Helsinki Corpus.

The manuals may be viewed on-line without restriction, but the texts themselves are available only to users who formally agree to the conditions of use by filling out the access request form and returning it via e-mail to Susan Pintzuk (sp20@york.ac.uk). The York Poetry Corpus is also available through the Oxford Text Archive (OTA). The easiest way to find the York Poetry Corpus in the OTA catalogue is to run an advanced search with 'parsed corpus' or 'York-Helsinki' as the title.

The York Poetry Corpus is part of a larger project to produce syntactically annotated corpora for all stages of the history of English: