York-Helsinki Corpus of Old English, POS Annotation


    final punctuation         . (period)
    non-final punctuation     , (comma)

Any punctuation that ends a token (periods, commas, semi-colons, question marks, etc.) is tagged with a period. Note, however, that tokens can terminate without any punctuation. Any punctuation which does not coincide with the end of a token is tagged with a comma. A token consists of one main verb and its associated arguments and adjuncts. Conjoined subordinate clauses are included within the same token.

    He_PRO^N folgode_VBD +tam_D^D kasere_N^D uncu+d_ADJ^N him_PRO^D swa_ADV
    +teah_ADV ,_, na_NEG+ADV swylce_P he_PRO^N ne_NEG dorste_MDD for_P 
    his_PRO$ drihtne_NPR^D +drowian_VB ,_, ac_CONJ he_PRO^N wolde_MDD 
    gehyrtan_VB +da_D^A +te_C se_D^N h+a+dena_ADJ^N casere_N^N 
    d+aghwamlice_ADV acwealde_VBD for_P Cristes_NPR^G geleafan_N ._.

    &_CONJ Drihten_NPR^N cw+a+d_VBDI to_P him_PRO^D :_, Hwi_WADV eart_BEPI 
    +du_PRO^N yrre_ADJ^N ?_.

    And_CONJ ealle_Q^N +ta_D^N hyredmenn_N^N hine_PRO^A h+afdon_HVDI 
    for_P f+ader_N^A
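
The word_TAG^CASE notation and the final-punctuation rule above can be illustrated with a minimal Python sketch. The names TaggedWord, parse_item and split_tokens are hypothetical and are not part of any YCOE tooling. The sketch assumes whitespace-separated items and can only detect token boundaries that are marked by final punctuation; tokens that end without punctuation inside a running text would need boundary information from the corpus files themselves.

```python
from typing import List, NamedTuple, Optional


class TaggedWord(NamedTuple):
    form: str
    pos: str
    case: Optional[str]  # letter after "^" (e.g. "N", "D"), or None if unmarked


def parse_item(item: str) -> TaggedWord:
    """Split one word_TAG item into form, POS tag and optional case."""
    # Use the LAST underscore, so a form such as "{TEXT:...}" stays intact.
    form, _, tag = item.rpartition("_")
    pos, _, case = tag.partition("^")
    return TaggedWord(form, pos, case or None)


def split_tokens(text: str) -> List[List[TaggedWord]]:
    """Group items into tokens, closing a token at each final-punctuation tag."""
    tokens: List[List[TaggedWord]] = []
    current: List[TaggedWord] = []
    for item in text.split():
        word = parse_item(item)
        current.append(word)
        if word.pos == ".":  # the tag "." marks token-final punctuation
            tokens.append(current)
            current = []
    if current:  # trailing material with no final punctuation
        tokens.append(current)
    return tokens
```

Non-final punctuation (tag ",") never closes a token in this sketch, so the colon introducing direct speech stays inside the token it belongs to.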

Conjoined sentences and VPs are separated. Only main verbs clearly conjoined at the word level (as indicated by shared arguments) are kept together. In parsing, empty subjects are added to tokens lacking a subject due to elision under conjunction. For more detail see the PPCME2 rules for clausal conjunction.

    Ac_CONJ +ta_D^N h+a+denan_N^N hyna+d_VBPI and_CONJ hergia+d_VBPI 
    +ta_D^A Cristenan_N^A

    He_PRO^N gesette_VBD hine_PRO^A to_P ealdre_N^D ofer_P an_NUM^A 
    werod_N^A ,_.

    and_CONJ het_VBDI hine_PRO^A symble_ADV^T beon_BE +atforan_P his_PRO$
    gesih+de_N ._.

Periods in the text which are not used as sentential punctuation, such as periods indicating abbreviation, surrounding numbers and certain words (e.g., .x. .Mon. etc.), are not separated from the word they belong to.



Unlike in the PPCME2, where case-marking on nominal elements is largely non-existent and, when present, often unclear, case is still fully productive in Old English. Case is dealt with differently in the poetry and prose parts of the corpus.

Case in Poetry

In the poetry corpus case is marked on inflecting words following, in ambiguous cases, the decision of the editor of the edition. A few items which belong to normally inflecting categories (quantifiers, numbers, etc.) do not regularly inflect and are consequently not labelled for case. These are:

Present and past participles are labelled for case when modifying or attributive. When acting as part of the main verb sequence, past participles are only case marked if the case marking is overt (i.e., non-zero) and present participles if the case-ending is other than -E. See Case marking on participles.

Case in Prose

While case is a fully productive category in Old English, many case forms are formally ambiguous, and sometimes remain ambiguous even in context. Our basic approach to indicating case in the prose corpus is to mark it when it is clear, but not when it is ambiguous, or potentially ambiguous, tempered by considerations of the effort involved and the needs of the system as a whole.

The following parts of speech may be labelled for case:

  • nouns
  • adjectives
  • quantifiers
  • determiners
  • numbers
  • participles

In addition, the so-called "inflected infinitive" is labelled with dative case.

Certain items are never labelled for case. These are:

Other items with special rules for determining when to indicate case are the quantifier EALL when uninflected and the cardinal numbers AN, TWEGEN, +TRY in combination with other numbers.

Case is labelled on all case-inflecting words in the following circumstances:

  • when the case is lexically unambiguous; that is, when it is apparent from the lexical item in isolation (e.g. datives in -UM, genitives in -RA, accusatives in -NE, most determiners, etc.)

  • when one word in a constituent is lexically unambiguous for case, all other words in the same constituent inherit its case, whether they themselves are ambiguous or not (e.g. SIGE is nom/acc/dat ambiguous in the singular, but in +TONE SIGE it is labelled accusative because of +TONE)

  • a constituent ambiguous for case inherits case from an unambiguously marked conjoined constituent (e.g. in METE & DRINC in non-subject use, DRINC is unambiguously accusative, while METE is acc/dat ambiguous, but would be labelled accusative, inheriting this case from DRINC)

  • a constituent ambiguous for case inherits case from an unambiguously marked appositive and vice versa
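
The inheritance of case within a constituent can be sketched as follows. This is an illustration only, not the annotators' procedure: the function name propagate_case and the (form, POS, case) triples are hypothetical, and the caller is assumed to pass only case-inflecting words that belong to one constituent. Conjoined or appositive constituents could be handled the same way by passing their words together.

```python
from typing import List, Optional, Tuple

Word = Tuple[str, str, Optional[str]]  # (form, POS, case or None if ambiguous)


def propagate_case(constituent: List[Word]) -> List[Word]:
    """Copy the single unambiguous case found in a constituent to all its words."""
    cases = {case for _, _, case in constituent if case is not None}
    if len(cases) != 1:
        # No unambiguous word (or conflicting cases): leave everything as it is.
        return list(constituent)
    (case,) = cases
    return [(form, pos, case) for form, pos, _ in constituent]
```

With +TONE SIGE as input, the accusative of the determiner is copied to the noun, giving +tone_D^A sige_N^A.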

When case-marking is ambiguous in isolation, it is nevertheless marked in the following circumstances:

  • nominative is marked on any word (apart from those listed as never taking case) which is part of the subject of a tensed clause

  • accusative is marked on non-subjects when the form in question is identical only with the nominative form (most masculines and all neuters); for most feminines in which nominative is distinctive, but accusative falls together with genitive and dative, and masculines/neuters in -E (nominative, accusative, and dative singular are the same), non-subjects are usually ambiguous on their own

  • genitive is marked on all parts of a potentially genitive nominal constituent which stands in relation (often but not always possessive) with another noun; genitive is also marked on forms which are acc/gen/dat or gen/dat ambiguous following prepositions that only take the genitive (i.e., ANDLANG(ES), INNAN, TO, TOFORAN, TOWEARD, UTAN, WI+T, and compounds of these, e.g., WI+TUTAN, etc.)

  • dative is marked on acc/dat ambiguous forms representing the person affected in all copular constructions (often a pronoun, ME, +TE, US, etc.)

  • the nom/acc/gen pl. ambiguous complement of an undeclinable quantifier (FELA, MA, LYT) is assumed to be genitive, while with all other quantifiers it is assumed to agree with the quantifier. This is a default decision that goes with the majority occurring pattern for each type, although there are clear examples of both that follow the other pattern (fela_Q suna_N^G but micele_Q^N suna_N^N)

Case on arguments of verbs and prepositions

Note particularly that words do not receive case from verbs or prepositions in a straightforward way; that is, ambiguous case forms acting as complements of verbs or prepositions are not generally labelled for case based on the case-taking properties of the governing verb or preposition. Thus, for instance, an acc/dat ambiguous complement of a verb/preposition which normally takes the dative will not be labelled dative, but rather left unmarked. The exception to this is that dat/gen ambiguous complements of verbs/prepositions not listed as taking genitive are assumed to be dative. This approach was adopted largely for efficiency reasons, to avoid having to find reliable information on the case requirement of every verb in the corpus. For consistency, the same rules are applied to prepositions.

Case on participles

When participles are part of the main verb sequence, they are only marked for case if the case is overt (i.e., non-zero) in the case of past participles, or not -E in the case of present participles.

    Eoforlic_N^N scionon_VBDI ofer_P hleorberan_N^D gehroden_VBN^N
    golde_N^D ,_, fah_ADJ^N ond_CONJ fyrheard_ADJ^N ;_.

    Sy+d+dan_P +arest_ADV^T $wear+d_BEDI feasceaft_ADJ^N funden_VBN ,_,
    he_PRO^N +t+as_D^G frofre_N^A gebad_VBDI ,_.
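
The rule for participles in the main verb sequence can be written as a small decision function. This is a sketch under stated assumptions: the function name is hypothetical, the inflectional ending is assumed to have been identified already and is passed in as a string (empty for a zero ending), and VBN and VAG are taken to be the past and present participle tags, as in the examples in this document.

```python
def main_sequence_participle_tag(base_tag: str, ending: str, case: str) -> str:
    """Tag a participle that is part of the main verb sequence.

    base_tag: "VBN" (past participle) or "VAG" (present participle)
    ending:   the case ending as spelled in the text, "" for a zero ending
    case:     the case letter that the ending would express, e.g. "N"
    """
    ending = ending.lower()
    if base_tag == "VBN":
        overt = ending != ""   # past participles: any non-zero ending counts
    elif base_tag == "VAG":
        overt = ending != "e"  # present participles: any ending other than -E
    else:
        raise ValueError("not a participle tag: " + base_tag)
    return base_tag + "^" + case if overt else base_tag
```

For FUNDEN in the second example above, the ending is zero, so the function returns plain VBN.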

In the majority of cases modifying participles are appropriately case-marked, although there are a small number of exceptions (e.g., acc.sg. in zero rather than -NE). Thus, with one exception (see below), modifying and attributive participles are labelled with the case of the item they modify, whether the participle is appropriately case-marked or not.

    feower_NUM bearn_N^N for+d_RP gerimed_VBN^N

    se_D^N +de_C ealfela_Q ealdgesegena_N^G
    worn_N^A gemunde_VBD ,_, word_N^A o+ter_ADJ^A fand_VBDI so+de_ADV
    gebunden_VBN^A ;_.

The exceptional case is that of "naming" participles (GEHATEN, GECIGEN, etc.) which rarely if ever inflect in attributive use. These are therefore not labelled with case unless it is overt, in the same way as participles which are part of the main verb sequence.


    W+as_BEDI min_PRO$^N f+ader_N^N folcum_N^D gecy+ted_VBN ,_,
    +a+tele_ADJ^N ordfruma_N^N ,_, Ecg+teow_NPR^N haten_VBN ._.

Case on left-dislocations

Left-dislocated NPs may be in the nominative case even when the resumptive element is oblique. This means that in the case of a left-dislocated nom/acc ambiguous NP with an accusative resumptive element, the ambiguity cannot be resolved, and thus the left-dislocated NP is not labelled for case. The following rules are applied for labelling case on left-dislocated NPs.

  • a nom/acc ambiguous left-dislocated NP with a nominative resumptive element is labelled nominative
  • a nom/acc ambiguous left-dislocated NP with an accusative resumptive element is unmarked for case
  • a nom/acc ambiguous left-dislocated NP with an oblique resumptive element is labelled nominative
  • an acc/gen/dat ambiguous left-dislocated NP is given the same case as the resumptive element; if the resumptive element is ambiguous for case then so is the left-dislocation
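
The four rules above amount to a small decision table, sketched here in Python. The function name and the encoding are hypothetical: the possible cases of the left-dislocated NP are given as a set of case letters, and the resumptive element's case is None when it is itself ambiguous. Treating G, D and I as the oblique cases, and returning None for a nom/acc ambiguous NP with an ambiguous resumptive element, are assumptions of this sketch; the rules do not address the latter combination.

```python
from typing import Iterable, Optional

OBLIQUE = {"G", "D", "I"}  # assumption: genitive, dative, instrumental


def left_dislocation_case(possible: Iterable[str],
                          resumptive: Optional[str]) -> Optional[str]:
    """Return the case label for an ambiguous left-dislocated NP, or None."""
    possible = set(possible)
    if possible == {"N", "A"}:
        if resumptive == "N" or resumptive in OBLIQUE:
            return "N"
        return None  # accusative resumptive element: ambiguity unresolved
    if possible and possible <= {"A", "G", "D"}:
        return resumptive  # copy the resumptive's case (None if ambiguous)
    return None  # other ambiguity classes are not covered by the rules
```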

Case-marking Flow Chart used by annotators.

Nouns and pronouns

Pronouns (PRO, PRO$)

        PRO     Pronoun
        PRO$    Pronoun, possessive
        MAN     Indefinite MAN

All personal pronouns are labelled PRO with the exception of indefinite MAN. Pronouns are tagged for case according to the case-marking rules. In the prose, 1st and 2nd person sg/pl pronouns are acc/dat ambiguous and thus are not generally labelled for case except in copular constructions.

    He_PRO^N gesette_VBD hine_PRO^A to_P ealdre_N^D
    ofer_P an_NUM^A werod_N^A ,_.

    Eala_INTJ ge_PRO^N godes_NPR^G cempan_N^N ,_, ge_PRO^N
    becomon_VBDI to_P sige_N^D ,_.

    +Tas_D^N +te_C her_ADV^L nu_ADV^T wepa+d_VBPI
    woldon_MDDI mid_P eow_PRO blissian_VB ,_, 

    gif_P +t+at_D^N is_BEPI so+d_ADJ^N +t+at_D^A ic_PRO^N eow_PRO s+ade_VBD

Reflexive pronouns

Personal pronouns can be used as reflexives in Old English. They are not marked as such at the part-of-speech level, but rather in the parsing.

    Ic_PRO^N me_PRO gebidde_VBP                   

    Him_PRO^D +da_ADV Scyld_NPR^N gewat_VBDI   
    to_P gesc+aphwile_N^D felahror_ADJ^N feran_VB
    on_P frean_NPR^G w+are_N^A ._.

Forms of SELF are tagged ADJ. The occasional uses of pronoun-plus-SELF (HIMSELF, THEMSELVES etc.) are split to facilitate parsing.

    hi_PRO^N sylfe_ADJ^N

    Hi_PRO^N hyne_PRO^A +ta_ADV^T
    +atb+aron_VBDI to_P brimes_N^G faro+de_N^D ,_,
    sw+ase_ADJ^N gesi+tas_N^N ,_, swa_P he_PRO^N
    selfa_ADJ^N b+ad_VBDI ,_,

    $him_PRO^D $selfum_ADJ^D {TEXT:himselfum}_CODE

Possessive pronouns

Tagged as possessive pronouns are MIN, +TIN, HIS, HIRE, UNCER, URE, INCER, EOWER, HEORA. Of these, HIS, HIRE, HEORA are not tagged for case; the other possessives are declined like adjectives and are therefore consistently case-marked. Notice that all these forms can also be tagged PRO^G if the use is clearly genitival rather than possessive.

    Eadig_ADJ^N bi+d_BEPI se_D^N +te_C
    in_P his_PRO$ e+tle_N^D ge+tih+d_VBPI ,_.

    A_ADV^T ic_PRO^N symles_ADV^T w+as_BEDI on_P
    wega_N^G gehwam_Q^D willan_N^G +tines_PRO$^G
    georn_ADJ^N on_P mode_N^D ;_.

    Ic_PRO^N his_PRO^G $bidan_VB ne_NEG            <--- genitival use
    dear_MDPI ,_, re+tes_N^G on_P geruman_N^D ,_.

    W+as_BED hira_PRO^G Matheus_NPR^N sum_Q^N     <--- genitival use
    ,_, se_D^N mid_P Iudeum_NPR^D ongan_RP+AXDI
    godspell_N^A +arest_ADV^T wordum_N^D writan_VB
    wundorcr+afte_N^D ._. 

URE and EOWER sometimes fail to agree with a following noun, in which case they are tagged simply PRO$ without case. For URE this applies to nouns in the masc/neut. dat/gen.sg., masc/fem/neut. gen/dat.pl. and masc. acc.sg; for EOWER, it applies to masc/fem/neut gen.pl. and to fem. acc/gen/dat.sg. In addition EOWRE is ambiguous for case with fem. non-nominatives in -E (unlike most adjectives in -E) since the gen/dat form EOWERRE is often simplified to EOWRE falling together with the acc.

    ure_PRO$ goda_N^G                            (masc. gen.pl.)
    ure_PRO$ lenctenlicum_ADJ^D f+astene_N^D     (neut. dat.sg.)
    ure_PRO$ un+tances_N^G                       (masc. gen.sg.)

    eower_PRO$ wifa_N^G                  (neut. gen.pl.)
    +durh_P eower_PRO$ hiwr+adene_N      (fem. acc/dat/gen.sg.)

    eowre_PRO$ gewitleaste_N             (fem. acc/dat/gen.sg.)

Indefinite MAN (MAN)

Indefinite MAN is tagged MAN. It is always a subject and is therefore always case-marked nominative.

    Ac_CONJ +ta_D^N halgan_N^N tihton_VBDI +t+at_C
    man_MAN^N +ta_D^A ofnas_N^A ontende_RP+VBPS ,_.

    O+d+de_CONJ hi_PRO^N synd_BEPI st+anene_ADJ^G mid_P +tam_D^D +te_C
    man_MAN^N str+ata_N^A wyrc+d_VBPI ._.

Existential there

Existential +T+AR is not distinguished from locative +T+AR in the York Corpus as in the PPCME2; +T+AR is always treated as a locative adverb.

Singular, plural, and collective common nouns (N)

Unlike in the PPCME2, in the York Corpus no distinction is made between singular and plural nouns.

Singular, plural and collective nouns are all tagged N. All common nouns are tagged for case according to the case-marking rules.

    +Ta_ADV^T w+aron_BEDI twegen_NUM^N gebro+dra_N^N
    +a+telborene_VBN^N for_P worulde_N ,_, Marcus_NPR^N and_CONJ
    Marcellianus_NPR^N ,_, mycclum_Q^D geswencte_VBN^N on_P bendum_N^D
    and_CONJ on_P swingelum_N^D for_P +dam_D^D so+tan_ADJ^D geleafan_N^D

Possessives and genitives

Unlike in the PPCME2, in the York Corpus no distinction is made between genitive and possessive nouns. All genitive nouns have a case tag ^G; the $ tag is only used to distinguish possessive pronouns (PRO$).

Compass points

Compass points do not seem to be used nominally in Old English (as in, She lived in the east), and so, unlike in the PPCME2, compass points are tagged as adverbs.

As parts of compound names (EAST ENGLA, etc.), however, compass points are tagged NPR.

Adverbial use of nouns

Nouns in oblique cases used adverbially are tagged as adverbs.

Proper nouns (NPR)

Names of people

All personal names are tagged NPR.

The word SANCTA/SANCTE/SANCTUS used in conjunction with a proper name is tagged NPR, but other (native) words possibly used as titles are not. SANCTA/SANCTE/SANCTUS is not case-marked since it does not inflect according to a native pattern (or reliably at all).


     sanctus_NPR Paulus_NPR^N
     Sancte_NPR Dunstan_NPR^N
     Sancta_NPR Maria_NPR^N

     +A+telstan_NPR^N cyning_N^N 

Two-part names like EAST ENGLA, NOR+T SEAXE, etc. when written as separate words are treated as compounds. Thus the first part is not tagged for case.

    East_NPR Engle_NPR^N
    Nor+t_NPR Walas_NPR^N 
    Middel_NPR Seaxe_NPR^N
    Ald_NPR Seaxe_NPR^N

Epithets of people or peoples are not tagged NPR.

    Sceotta_NPR^G leoda_N^N and_CONJ scipflotan_N^N <--- 'pirate host_Vikings'
    f+age_ADJ^N feollan_VBDI ,_,                       

    Freond_N^N $onsegon_VBDI la+dum_ADJ^D
    eagan_N^D landmanna_N^G cyme_N^A ._.            <--- 'landlubbers_Egyptians'

    Heht_VBDI +ta_ADV^T onlice_ADV
    +a+delinga_N^G hleo_N^N ,_,                     <--- multiple epithets
    beorna_N^G beaggifa_N^N ,_,
    swa_P he_PRO^N +t+at_D^A beacen_N^A
    geseah_VBDI ,_, heria_N^G hildfruma_N^N ,_,        
    +t+at_C him_PRO^D on_P heofonum_N^D +ar_ADV^T
    geiewed_VBN wear+d_BEDI ,_, ofstum_N^D
    myclum_Q^D ,_, Constantinus_NPR^N ,_,
    Cristes_NPR^G rode_N^D ,_, tireadig_ADJ^N
    cyning_N^N ,_, tacen_N^A gewyrcan_VB ._.

However, compounds containing a proper noun are tagged NPR.

     Gardene_NPR       spear-Danes
     Hringdene_NPR     ring-Danes
     Arscyldingas_NPR  honour-Scildings

In the poetry, nouns used as names in a particular context are tagged NPR.

    Is_BEPI +t+at_D^N deor_N^N pandher_NPR^N bi_P        <--- 'Panther'
    noman_N^D haten_VBN ,_,

    Nama_N^N w+as_BEDI gecyrred_VBN
    beornes_N^G in_P burgum_N^D on_P +t+at_D^A
    betere_ADJ^A for+d_RP ,_, +a_NPR^N h+alendes_NPR^G   <--- 'Saviour's Revelation'

    Leoht_N^N w+as_BEDI +arest_ADV^T +turh_P
    drihtnes_NPR^G word_N^A d+ag_NPR^N genemned_VBN ,_,  <--- 'Day'
    wlitebeorhte_ADJ^N $gesceaft_N^N ._.

Adjectives corresponding to proper nouns are tagged ADJ, even when used substantively.

     Scittisc_ADJ^N              the Scottish
     cristenra_ADJ^G cwen_N^N    queen of the Christians
     Ebreisce_ADJ^N +a_N^N       Hebrew law

Names of places

Names consisting of a name plus a common noun (Rome burh, Elig mynster) are treated as compounds but only the name is tagged NPR; the common noun is tagged N. The case of the first part of the compound is often difficult to determine with certainty. It is generally either uninflected (i.e., the same as the nominative singular form) or a possible genitive, singular or plural. We have tagged all these cases as compounds, regardless of the presence or absence of an identifiable case on the first element. The name is therefore tagged only NPR with no case indicated; case is indicated on the common noun if appropriate according to the case-marking rules. Note that this only applies to place names and not to other potentially similar cases (as, for instance, the names of peoples NOR+TUMBRA CYNNE).

     Dinges_NPR mere_N^N
     Elig_NPR mynstre_N^D
     Rome_NPR byrig_N^D
     Egypta_NPR lond_N^A

Days of the week, months, and religious festivals/seasons

The names of the days of the week and months of the year are tagged as proper nouns. Religious seasons, such as Lent, and festivals, such as Easter, are also tagged NPR. Massdays (HLAFM+ASS, CANDELM+ASS, etc.) and DOMES D+AG are not tagged NPR.


LENCTEN can be either a noun or an adjective. When preceding a noun (e.g., LENCTENES F+ASTENES) it is tagged as an adjective, otherwise as a noun.

        lenctenes_ADJ^G f+astenes_N^G
        +tam_D^D halgan_ADJ^D lenctene_NPR^D

KALEND is not tagged as a proper noun in phrases like MAIAS KALEND.

        Maias_NPR^G kalend_N^N      the month of May

In conjunction with Latin-inflected month names KALEND- is tagged FW.

        vi_FW Kalend+a_FW Novembris_FW
        iii_FW Kalend+a_FW IUNII_FW 

Names of languages

The names of languages are either proper nouns or adjectives.

        on_P Englisc_ADJ^A

Names of God

The following are taken as names of God: GOD, DRIHTEN, JESUS, CRIST. All other ways of referring to God (H+ALEND, SCYPPEND, HALIG GAST, F+A+DER, SUNU, etc.) are considered to be epithets and are tagged as common nouns.

Adjectives (ADJ)

Note that in YCOE comparative and superlative adjectives are not distinguished from positives as they are in the PPCME2. The same tag is used for all forms.

Positive, comparative and superlative adjectives are labelled ADJ. Adjectives are tagged for case according to the case-marking rules.

     Gif_P him_PRO^D wan_ADJ^N fore_P          <--- positive adjective
     wolcen_N^N hanga+d_VBPI ,_, ne_NEG m+agen_MDPS
     hi_PRO^N swa_ADV leohtne_ADJ^A leoman_N^A
     ansendan_VB ,_, +ar_P se_D^N +ticca_ADJ^N                       
     mist_N^N +tynra_ADJ^N weor+de_BEPS ._.    <--- comparative adjective

Weak adjective/noun ambiguity

*Applies to prose only*
When an adjective has a corresponding weak noun associated with it (e.g. HALIG/HALGA, CRISTEN/CRISTENA) many cases following a determiner are ambiguous between a noun and an adjective reading. The default tagging in these cases is that if the word precedes a noun it is tagged as an adjective, but otherwise as a noun. Only cases with a weak noun form listed in the dictionary fall under this rule; e.g. SEOCA is always an adjective because there is no noun SEOCA.

    +done_D^A halgan_ADJ^A w+ar_N^A
    +ta_D^A halgan_N^A
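
The default for weak adjective/noun ambiguity can be sketched as a lookup plus a positional test. The names below are hypothetical, and WEAK_NOUN_LEMMAS is only a two-item stand-in for the dictionary mentioned above, not a list taken from the corpus documentation. The sketch applies to prose, to forms following a determiner.

```python
# Hypothetical stand-in for "weak noun forms listed in the dictionary".
WEAK_NOUN_LEMMAS = {"halga", "cristena"}


def weak_form_tag(weak_lemma: str, precedes_noun: bool) -> str:
    """Return "ADJ" or "N" for a weak form following a determiner (prose only)."""
    if weak_lemma.lower() not in WEAK_NOUN_LEMMAS:
        return "ADJ"  # no corresponding weak noun, e.g. SEOCA
    return "ADJ" if precedes_noun else "N"
```

This reproduces the two examples above: HALGAN before W+AR is ADJ, while HALGAN on its own after the determiner is N.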

Ordinal numbers

Ordinal numbers are tagged ADJ.

     +ta_ADV^T com_VBDI ofer_P foldan_N^A
     fus_ADJ^N si+dian_VB m+are_ADJ^N mergen_N^N
     +tridda_ADJ^N ._.

+AREST may also be tagged ADV^T when used as a temporal adverb.

Adjectival use of quantifiers (MICEL and LYTEL)

MICEL and LYTEL and their comparative forms are tagged as quantifiers even when they clearly mean large, small.

Note that this is slightly different from the PPCME2 where in some cases, notably following a determiner and in copular constructions, these words are tagged as adjectives.

NEAH (adjective)

NEAH is only tagged ADJ when it is overtly inflected or clearly part of a noun phrase. In all other cases it is tagged as an adverb. Thus most cases of NEAH will be tagged ADV, even those in which, although not overtly inflected, it might be taken as agreeing with a masculine or neuter singular noun.

    sumre_Q^G neah_ADJ^G cyrican_N^G 

    +tam_D^D neah_ADJ^D wuda_N^D 


SWELC and +TYLLIC are tagged as adjectives. SWELCE may also be an adverb or preposition.

    mid_P swylcum_ADJ^D frofre_N^D 

    beo_BEPS +tin_PRO$^N wif_N^N swylc_ADJ^N swa_P Uenus_NPR^N


Forms of SELF are always tagged as adjectives.

    +ta_ADV^T adrencte_VBD he_PRO^N hiene_PRO^A selfne_ADJ^A

    on_P +tam_D^D ge_PRO^N sylfa_ADJ^N moton_MDPI mid_P him_PRO^D
    +afre_ADV^T wunian_VB

NUM + WINTRE/GEARE (adjective)

Adjectives ending in WINTRE/GEARE (ANWINTRE, TWELFWINTRE) meaning x years old, are tagged ADJ when written as single words. When the numbers are written separately, WINTRE is still tagged ADJ and the other parts of the phrase are tagged literally.

    xviii_NUM wintre_ADJ^N
    fif_NUM &_CONJ sixtigwintre_ADJ^N
    an_NUM and_CONJ twentig_NUM geare_ADJ^N

Quantifiers (Q)

With two exceptions, the words on the following list are tagged Q in all functions (modifying, absolute, adverbial, etc.). The exceptions are the wh-indefinites (HWA, HWILC, etc.) and +AG+TER, NA+TOR.

wiht (and derivatives na(n)wiht, naht, na(n)wuht), +alc, +anig (and derivative n+anig), begen, butu, eall, feawe, fela, hwa (and derivatives nathwa, +aghwa, +athwa, gehwa, hw+atwugu, nateshwan), hw+a+ter (and derivatives +aghw+a+ter, +ag+ter, gehw+a+ter, nahw+a+ter, na+tor), hwilc (and derivatives nathwilc, +aghwilc, gehwilc, (ge)welhwilc, hwilcwugu), lyt, lytel (and derivative unlytel), ma, manig, micel, sum

    monegum_Q^D m+ag+tum_N^D
    +aghwylc_Q^N +tara_D^G ymbsittendra_N^G
    madma_N^G fela_Q 


Note that the hw-words (HWA, HWILC, etc.) are also used as wh-words in questions, where they are not tagged Q, but WPRO, WADJ, etc.

Negative quantifiers

Negative derivatives starting with N- are tagged NEG+Q; but note that quantifiers starting with NAT- (e.g., NATHWA, NATHW+AT, etc. from I know not who/what/etc) are not negative.
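
As an illustration of the N- versus NAT- distinction, the following sketch chooses between Q and NEG+Q from the spelling alone. The function name is hypothetical, and the sketch assumes the form has already been identified as one of the quantifiers listed above; it is not a general test for negation.

```python
def quantifier_tag(form: str) -> str:
    """Return "NEG+Q" for negative N- derivatives and "Q" otherwise."""
    form = form.lower()
    if form.startswith("nat"):
        return "Q"      # NATHWA, NATHWILC, etc. are not negative
    if form.startswith("n"):
        return "NEG+Q"  # N+ANIG, NAHT, NAHW+A+TER, etc.
    return "Q"
```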

+AG+TER, NA+TOR (quantifiers)

+AG+TER and NA+TOR are also used as conjunctions, in which case they are tagged CONJ and NEG+CONJ, respectively.


Note especially that (UN)LYTEL and MICEL are consistently tagged Q, even when their meaning is more adjectival than quantificational, i.e. when LYTEL is better interpreted as small and MICEL as great. Distinguishing between the two readings can be quite difficult, especially with plural nouns, and we have not attempted to do so, despite the fact that it creates infelicitous readings in some cases.

    for_P +tam_D^D mycclan_Q^D gewynne_N^D 
    on_P anre_NUM^D lytlan_Q^D byrig_N^D 

Undeclinable quantifiers FELA, LYT, MA

The undeclinable quantifiers FELA, LYT and MA are not tagged for case.

    +t+at_C +ter_ADV^L ne_NEG mihte_MDD na_NEG+ADV ma_Q muneca_N^G wunian_VB 

    and_CONJ +t+ara_D^G ma+dma_N^G ne_NEG rohte_VBD +te_D^I ma_Q +te_C
    reocendes_VAG^G meoxes_N^G ._.

    Se_D^N feond_N^N h+afde_HVD him_PRO^D mid_P fela_Q o+dre_ADJ^A 
    sceoccan_N^A ,_.

    +t+at_C heo_PRO^N heora_PRO$ deadra_ADJ^G to_ADV lyt_Q h+afden_HVDS 

A quantifier used in conjunction with a nom/acc/gen plural ambiguous NP can often be taken as either a head with a genitive complement or a modifier of a nom/acc head. In these cases we apply the default rule that undeclinable quantifiers (FELA, LYT, MA) take a genitive complement, while all other quantifiers are taken as modifiers. This follows the majority pattern although there are clear examples of the other pattern for both types.

    fela_Q suna_N^G

    manige_Q^N suna_N^N
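
The default for a quantifier with a nom/acc/gen plural ambiguous noun can be sketched as below. The function name and argument layout are hypothetical. The sketch assumes the noun form is genuinely ambiguous and that the quantifier's own case, if it has one, has already been determined.

```python
from typing import Optional

UNDECLINABLE = {"fela", "lyt", "ma"}


def tag_quantified_pair(q_form: str, q_lemma: str,
                        q_case: Optional[str], noun_form: str) -> str:
    """Tag a quantifier followed by a nom/acc/gen pl. ambiguous noun."""
    if q_lemma.lower() in UNDECLINABLE:
        # Undeclinable quantifier: head with a genitive complement.
        return q_form + "_Q " + noun_form + "_N^G"
    # All other quantifiers: modifier agreeing with the noun.
    suffix = "" if q_case is None else "^" + q_case
    return q_form + "_Q" + suffix + " " + noun_form + "_N" + suffix
```

Called with ("fela", "fela", None, "suna") and ("manige", "manig", "N", "suna"), the function returns the two taggings shown in the examples above.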

Uninflected EALL

EALL with no overt inflection is treated as follows. When EALL immediately precedes an NP with which it potentially agrees (masc/neut. nom.sg. etc.) it is tagged with the same case as the words of the NP. In all other positions it is tagged only Q with no case. This includes cases when it follows an NP with which it potentially agrees.

    eall_Q^N +t+as_D^G cyninges_N^G r+ad_N^N

    all_Q^A woruld+ding_N^A 

    +t+at_D^A mynster_N^A eall_Q 

    and_CONJ fleow_VBDI eall_Q blode_N^D ._.

    +After_P +tysum_D^D worde_N^D he_PRO^N wear+d_BEDI eall_Q 
    geh+aled_VBN ,_.
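
The treatment of uninflected EALL reduces to one condition, sketched here. The function and its boolean arguments are hypothetical; deciding whether EALL could agree with the following NP is assumed to have been done beforehand.

```python
from typing import Optional


def uninflected_eall_tag(immediately_precedes_np: bool,
                         could_agree: bool,
                         np_case: Optional[str]) -> str:
    """Tag EALL that has no overt inflection."""
    if immediately_precedes_np and could_agree and np_case is not None:
        return "Q^" + np_case  # takes the case of the NP it precedes
    return "Q"                 # all other positions: no case
```

For EALL before +T+AS CYNINGES R+AD, where the head noun is nominative, the result is Q^N; for EALL following +T+AT MYNSTER it is plain Q.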

Determiners (D)

    +d+am_D^D eafera_N^N w+as_BEDI +after_ADV^T
    cenned_VBN ,_, geong_ADJ^N in_P geardum_N^D ,_,
    +tone_D^A god_NPR^N sende_VBD folce_N^D to_P       <--- relative clause
    frofre_N^D ;_.

    Gewat_AXDI him_PRO^D +ta_ADV^T
    Andreas_NPR^N inn_RP on_P ceastre_N^A
    gl+admod_ADJ^N gangan_VB ,_, to_P +t+as_D^G +de_C  <--- 'to where'
    he_PRO^N gramra_ADJ^G gemot_N^A ,_, 

Indefinite AN

Although AN can sometimes be interpreted as an indefinite determiner, it is always tagged as a cardinal number.

Disambiguating +TA

In some cases, +TA is ambiguous between an adverb or preposition introducing a clause and a determiner. In difficult cases the ambiguity is resolved as follows.

    hine_PRO^A fyrwyt_N^N br+ac_VBDI modgehygdum_N^D
    ,_, hw+at_WPRO^N +ta_D^N men_N^N w+aron_BEDI ._.

    Aledon_VBDI +ta_D^N leofne_ADJ^A
    +teoden_N^A ,_, beaga_N^G bryttan_N^A ,_, on_P
    bearm_N^A scipes_N^G ,_, m+arne_ADJ^A be_P
    m+aste_N^D ._.


Likewise, +T+AT can be ambiguous between a determiner functioning as a wh-word and a complementizer in relative-clause constructions. By default, +T+AT is treated as a determiner in these cases if it matches the antecedent in gender and number, and as a complementizer otherwise.

    Wulfgar_NPR^N ma+telode_VBD +t+at_D^N
    w+as_BEDI Wendla_NPR^G leod_N^N ;_.

    _CODE &_CONJ +ta_ADV^T swi+de_ADV ra+te_ADV
    +after_P +t+am_D^D ,_, swa_P +ta_D^N o+tre_ADJ^N ham_ADV^D comon_VBDI
    ,_, +ta_ADV^T fundon_VBDI hie_PRO^N o+tre_ADJ^A flocrade_N^A ,_,
    +t+at_C rad_VBDI ut_RP wi+d_P Lygtunes_NPR^G ,_.

Cardinal numbers (NUM)

All cardinal numbers except those in foreign language sequences are tagged NUM, whether they are written out or in number form. Roman numerals on their own do not count as foreign, only in conjunction with other foreign words.

        libro_FW 5=o=_FW ,_, capitulo_FW 24=o=_FW ._.

Note that AN in YCOE does not have a special tag (ONE) as in the PPCME2.

Case on numbers

Numbers up to three are inflected and tagged for case; all others are only tagged for case when it is overt. Numbers up to three as part of larger numbers (e.g., TWEGEN HUND) are only tagged for case if case is overt.

    Git_PRO^N on_P w+ateres_N^G +aht_N^A
    seofon_NUM niht_N^A swuncon_VBDI ;_.            <--- no overt case

    XVna_NUM^G sum_Q^N sundwudu_N^A sohte_VBD ;_.   <--- overt case
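
The rule for case on numbers can be sketched as follows. The function name and parameters are hypothetical. For numbers up to three the case argument is whatever the general case-marking rules yield (None if the form remains ambiguous); whether an ending is overt is assumed to be known to the caller.

```python
from typing import Optional


def number_tag(value: int, case: Optional[str], overt: bool,
               part_of_larger: bool = False) -> str:
    """Tag a cardinal number, adding case only where the rule allows it."""
    regularly_inflected = value <= 3 and not part_of_larger
    if case is not None and (regularly_inflected or overt):
        return "NUM^" + case
    return "NUM"
```

SEOFON in the first example has no overt case and is tagged NUM, while XVNA in the second has an overt genitive ending and is tagged NUM^G.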

Weak ANA

The weak form ANA is tagged as a focus particle.


While BETWEONUM is tagged as a preposition (P), BE ... TWEONUM is tagged be_P ... tweonum_NUM^D.

    monig_Q^N oft_ADV gecw+a+d_VBDI +t+atte_C
    su+d_ADV^L ne_NEG+CONJ nor+d_ADV^L be_P s+am_N^D
    tweonum_NUM^D ofer_P eormengrund_N^A o+ter_ADJ^N
    n+anig_NEG+Q^N under_P swegles_N^G begong_N^A
    selra_ADJ^N n+are_NEG+BEDS rondh+abbendra_N^G
    ,_, rices_N^G wyr+dra_ADJ^N ._.


While BUTU is tagged as a quantifier (Q), BU ... TU is tagged Q ... NUM.

    +da_ADV^T gen_ADV^T ic_PRO^N
    gecr+afte_VBD +t+at_C se_D^N cempa_N^N ongon_RP+AXDI
    waldend_NPR^A wundian_VB ,_, weorud_N^N to_RP
    segon_VBDI  +t+at_C +t+ar_ADV^L blod_N^N ond_CONJ
    w+ater_N^N bu_Q^N tu_NUM +atg+adre_ADV
    eor+tan_N^A sohtun_VBDI ._.


Wh-pronoun (WPRO)

Tagged WPRO are forms of HWA/HW+AT heading a wh-NP, and HW+A+DER meaning which of ....

*difference*
The genitive/possessive form of HWA (HW+AS) is tagged WPRO^G in the YCOE, rather than as a possessive (WPRO$ as in the PPCME2).

    Ic_PRO^N sceal_MDPI hra+de_ADV cunnan_MD
    hw+at_WPRO^A +du_PRO^N us_PRO^D to_P
    $dugu+dum_N^D gedon_VB wille_MDPS ._.

    woldon_MDDI cunnian_VB hw+a+der_WQ
    cwice_ADJ^N lifdon_VBDI +ta_D^N +te_C on_P
    carcerne_N^D clommum_N^D f+aste_ADV hleoleasan_ADJ^A
    wic_N^A hwile_N^A wunedon_VBDI ,_, hwylcne_WPRO^A
    hie_PRO^N to_P +ate_N^D +arest_ADV^T mihton_MDDI
    +after_P fyrstmearce_N^G feores_N^G ber+adan_VB

    +Ta_P he_PRO^N hie_PRO^A ascade_VBD his_PRO$
    $godas_N^A hw+a+ter_WPRO^N heora_PRO^G sceolde_MDD on_P o+trum_ADJ^D
    sige_N habban_HV ,_, +te_CONJ he_PRO^N on_P Romanum_NPR^D ,_, +te_CONJ
    Romane_NPR^N on_P him_PRO^D ,_, +ta_ADV^T ondwyrdon_VBDI hie_PRO^N
    him_PRO^D tweolice_ADV ,_.

    and_CONJ axode_VBD +tone_D^A halgan_N^A +turh_P hw+as_WPRO^G mihte_N
    he_PRO^N gefremode_VBD +ta_D^A wundorlican_ADJ^A tacna_N^A ,_, +t+at_C
    swa_ADV micel_Q^N werod_N^N him_PRO^D folgode_VBD ._.

Wh-adjective (WADJ)

HWILC is tagged WADJ in the YCOE in all cases; i.e., whether it modifies a noun or not (as with other adjectives such as SWILC, O+TER, etc.).

WHICH is tagged as a wh-determiner (WD) in the PPCME2.

    befran_VBDI for_P hwylcum_WADJ^D intingan_N^D hi_PRO^N hine_PRO^A
    axodon_VBDI ._.

    We_PRO^N moton_MDPI nu_ADV^T secgan_VB swutellicor_ADV be_P 
    +dysum_D^D ,_, hwylce_WADJ^N mettas_N^N w+aron_BEDI mannum_N^D 
    forbodene_VBN^N on_P +d+are_D^D ealdan_ADJ^D +a_N^D

    &_CONJ cw+a+d_VBDI ,_, hwylc_WADJ^N is_BEPI min_PRO$^N modor_N^N 
    &_CONJ mine_PRO$^N gebro+tru_N^N ?_.

Wh-adverb (WADV)

Hw-adverbs (HU, HWONNE, HW+AR and HWI) are tagged WADV both in direct questions and introducing a wh-clause. Note that +TA and +TONNE are tagged P when introducing a subordinate clause (see Subordinating conjunctions) and +T+AR is always tagged as a locative adverb even when acting as a relative pronoun.

    Hu_WADV +tearf_MDPI mannes_N^G sunu_N^N         <--- direct question
    maran_Q^A treowe_N^A ?_.

    +da_ADV^T w+as_BEDI forma_ADJ^N si+d_N^N
    +t+at_C hine_PRO^A weroda_N^G god_NPR^N wordum_N^D
    n+agde_NEG+VBD ,_, +t+ar_ADV^L he_PRO^N him_PRO^D               
    ges+agde_VBD so+dwundra_N^G fela_Q ,_,
    hu_WADV +tas_D^A woruld_N^A worhte_VBD witig_ADJ^N      <--- wh-clause
    drihten_NPR^N ,_, eor+dan_N^G ymbhwyrft_N^A
    and_CONJ uprodor_N^A ,_, gesette_VBD sigerice_N^A ,_,


When introducing a WHETHER question, HW+A+DER is tagged WQ; when it acts as a wh-pronoun meaning which of two, it is tagged WPRO.

    swa_P +d+at_C hit_PRO^N n+as_NEG+BEDI gesene_ADJ^N hwe+der_WQ
    he_PRO^N seoc_ADJ^N w+are_BEDS 

    and_CONJ axodon_VBDI +at_P +tam_D^D hiwum_N^D hw+a+der_WQ se_D^N
    halga_ADJ^N Petrus_NPR^N +t+ar_ADV^L wununge_N h+afde_HVD ,_.

    Gebide_VBI ge_PRO^N on_P beorge_N^D byrnum_N^D werede_VBN^N ,_,
    secgas_N^N on_P searwum_N^D ,_, hw+a+der_WPRO^N sel_ADV m+age_MDPS
    +after_P w+alr+ase_N^D wunde_N^A gedygan_VB uncer_PRO^G twega_NUM^G
    ._. COBEOWUL

GIF in indirect questions is tagged as a complementizer.


Differences from the PPCME2

Verbs are treated slightly differently in the YCOE from the PPCME2. The main differences are:

Categories of Verbs

Modal verbs
Auxiliary verbs (apart from HAVE and BE)
Lexical verbs

Modal verbs

        MD          infinitive
        MDI         imperative
        MDPI        present indicative
        MDPS        present subjunctive
        MDP         present tense (ambiguous subjunctive/indicative)
        MDPH        present tense (ambiguous subjunctive/imperative)
        MDDI        past indicative
        MDDS        past subjunctive
        MDD         past tense (ambiguous subjunctive/indicative)

The following verbs are always tagged as modals, whether used with an infinitive or independently. "Modal" meanings are given first, followed by independent meanings.

  • to know how to, have power to, be able to, can
  • to be or become acquainted with, be thoroughly conversant with, know
  • to dare, venture, presume
  • to be able to, have permission or power to
  • to be strong, confident, avail, prevail
  • (may), to be allowed to, have opportunity to, be able to, be compelled to, must
  • to be obliged to, have to, must, ought to, owe
  • to need to, be required to, must, have occasion to
  • want, be needy, owe
  • be willing to, wish to, desire to, be used to, be about to, shall, will
  • wish, desire

The verb AGAN, however, is only tagged as a modal when it is used with an infinitive, meaning have to or ought to.

        $Ic_PRO^N +te_PRO^D m+ag_MDPI gesecgan_VB +t+at_C
        +tu_PRO^N +tec_PRO^A sylfne_ADJ^A ne_NEG
        +tearft_MDPI swi+tor_ADV swencan_VB ._.

        Ic_PRO^N sceal_MDPI hra+de_ADV cunnan_MD    <--- independent use of cunnan 
        hw+at_WPRO^A +du_PRO^N us_PRO^D to_P
        $dugu+dum_N^D gedon_VB wille_MDPS ._.

        Gif_P him_PRO^D arlice_ADV esne_N^N
        +tena+d_VBPI ,_, se_D^N +te_C agan_VB sceal_MDPI <--- AGAN
        on_P +tam_D^D si+dfate_N^D ,_,

Note that "modal" is a lexical category in the YCOE (apart from AGAN), unlike in the PPCME2, where modals used independently are tagged as lexical verbs.

Auxiliary verbs

        AX          infinitive
        AXI         imperative
        AXPI        present indicative
        AXPS        present subjunctive
        AXP         present tense (ambiguous subjunctive/indicative)
        AXPH        present tense (ambiguous subjunctive/imperative)
        AXDI        past indicative
        AXDS        past subjunctive
        AXD         past tense (ambiguous subjunctive/indicative)
        AXG         present participle
        AXN         past participle

The following verbs may be tagged as auxiliaries. Unlike the modals, these verbs are only tagged AX when used with a bare infinitive, or (marginally) with a participle.

    W+as_BEDI hira_PRO^G Matheus_NPR^N sum_Q^N
    ,_, se_D^N mid_P Iudeum_NPR^D ongan_RP+AXDI
    godspell_N^A +arest_ADV wordum_N^D writan_VB
    wundorcr+afte_N^D ._.

    Gewat_AXDI +da_ADV neosian_VB ,_,        <--- auxiliary use of gewat
    sy+t+dan_P niht_N^N becom_VBDI ,_, hean_ADJ^G <--- main verb use of becom
    huses_N^G ,_, hu_WADV hit_PRO^A Hringdene_NPR^N
    +after_P beor+tege_N^D gebun_VBN h+afdon_HVDI ._.

Utan (UTP)

Forms of the verb UTAN, historically derived from WITAN to go and used to introduce imperative or hortatory clauses (let us..., come...), are tagged UTP. This verb is not used in the subjunctive or in the past tense.

    Uton_UTP nu_ADV^T brucan_VB +tisses_D^G undernmetes_N^G

    uton_UTP wyrcean_VB him_PRO^D sumne_Q^A fultum_N^A to_P his_PRO$
    gelicnysse_N ._.

BE and HAVE (BE, HV)

The verb BE

        BE          infinitive
        BEI         imperative
        BEPI        present indicative
        BEPS        present subjunctive
        BEP         present tense (ambiguous subjunctive/indicative)
        BEPH        present tense (ambiguous subjunctive/imperative)
        BEDI        past indicative
        BEDS        past subjunctive
        BED         past tense (ambiguous subjunctive/indicative)
        BAG         present participle
        BEN         past participle

The verb HAVE

        HV          infinitive
        HVI         imperative
        HVPI        present indicative
        HVPS        present subjunctive
        HVP         present tense (ambiguous subjunctive/indicative)
        HVPH        present tense (ambiguous subjunctive/imperative)
        HVDI        past indicative
        HVDS        past subjunctive
        HVD         past tense (ambiguous subjunctive/indicative)
        HAG         present participle
        HVN         past participle

The forms of BE and HAVE are distinguished from all other verbs, but no distinction is made in the tag for auxiliary and main verb use for BE and HAVE.

All forms of BEON, WESAN, and (GE)WEOR+DAN are labelled with BE tags regardless of meaning. Forms of WEOR+DAN are often used in passive constructions next to BEON, WESAN. Forms of GEWEOR+DAN are more often used independently; nevertheless, they are always tagged as BE.

        Egyptum_NPR^D wear+d_BEDI +t+as_D^G         <--- WEOR+DAN
        d+agweorces_N^G deop_ADJ^N lean_N^N gesceod_VBN

        Ic_PRO^N +turh_P Iudas_NPR^A +ar_ADV^T
        hyhtful_ADJ^N gewear+d_BEDI ,_. ond_CONJ nu_ADV^T  <--- GEWEOR+DAN
        gehyined_VBN eom_BEPI                   <--- BEON, WESAN

All forms of HABBAN and GEHABBAN are tagged HV.

Note that DO is tagged as a lexical verb in the YCOE and not given a special tag as in the PPCME2.

Lexical verbs (VB)

        VB          infinitive
        VBI         imperative
        VBPI        present indicative
        VBPS        present subjunctive
        VBP         present tense (ambiguous subjunctive/indicative)
        VBPH        present tense (ambiguous subjunctive/imperative)
        VBDI        past indicative
        VBDS        past subjunctive
        VBD         past tense (ambiguous subjunctive/indicative)
        VAG         present participle
        VBN         past participle

All lexical verbs are given tags beginning with VB.

Notice that an infinitive following TO can be tagged as a dative form VB^D (see Inflected infinitives).

Modals used as main verbs

Modal verbs are never tagged as main verbs, except AGAN (see Modal verbs).


Unlike in the PPCME2, in the YCOE unambiguous subjunctive verb forms are distinguished from unambiguous indicative forms for all verbs except UTAN.

The following four categories are distinguished:

  1. morphologically unambiguous indicative (tag ends in -I)
  2. morphologically unambiguous subjunctive (tag ends in -S)
  3. ambiguous subjunctive/indicative (base form VBP etc.)
  4. ambiguous subjunctive/imperative (tag ends in -H)

Mood (indicative or subjunctive) is labelled on verbs based on the form of the verb itself and not on context. Only unambiguous forms are labelled; ambiguous forms (e.g., past tense of 3rd sg. weak verbs in -EDE/-ODE) are unmarked.
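The tag tables and the four mood categories can be read off a tag mechanically. The following is a minimal sketch, not part of the YCOE tools: the function name `decode_verb_tag` and the returned labels are illustrative assumptions. Note that a final -I marks the indicative only after P or D; a bare prefix plus I (VBI, MDI, etc.) is the imperative. UTP is deliberately left out, since it has a single form.

```python
import re

# Prefixes and suffixes copied from the tag tables above.
# All names below are hypothetical helpers, not YCOE software.
CATEGORIES = {"MD": "modal", "AX": "auxiliary", "BE": "BE",
              "HV": "HAVE", "VB": "lexical"}
PARTICIPLES = {"AXG": "AX", "BAG": "BE", "HAG": "HV", "VAG": "VB"}
FORMS = {
    "": ("infinitive", None),
    "I": ("imperative", None),
    "PI": ("present", "indicative"),
    "PS": ("present", "subjunctive"),
    "P": ("present", "ambiguous subjunctive/indicative"),
    "PH": ("present", "ambiguous subjunctive/imperative"),
    "DI": ("past", "indicative"),
    "DS": ("past", "subjunctive"),
    "D": ("past", "ambiguous subjunctive/indicative"),
    "N": ("past participle", None),
}
VERB_TAG = re.compile(r"^(MD|AX|BE|HV|VB)(PI|PS|PH|P|DI|DS|D|I|N)?$")


def decode_verb_tag(tag):
    """Return (category, form, mood) for a verb tag, or None otherwise."""
    # Drop a case extension (VBN^N) and any NEG+ / RP+ prefix (RP+AXDI).
    core = tag.split("^")[0].split("+")[-1]
    if core in PARTICIPLES:
        return (CATEGORIES[PARTICIPLES[core]], "present participle", None)
    match = VERB_TAG.match(core)
    if match is None:
        return None
    prefix, suffix = match.group(1), match.group(2) or ""
    if prefix == "MD" and suffix == "N":
        return None  # the modal table lists no participle tags
    form, mood = FORMS[suffix]
    return (CATEGORIES[prefix], form, mood)


for example in ("VBI", "VBPI", "RP+AXDI", "VBN^N", "MDP", "ADV^T"):
    print(example, decode_verb_tag(example))
```

Tags that are not verb tags (ADV^T in the loop above) yield None, so the function can be applied to every token of a file without pre-filtering.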

The following forms are always ambiguous:

    &_CONJ ic_PRO^N worige_VBP
    &_CONJ he_PRO^N wunode_VBD flyma_N^N on_P +dam_D^D eastd+ale_N^D
    Do_VBI swa_P +tu_PRO^N spr+ace_VBD 
    &_CONJ weaxe_VBP ge_PRO^N 

In the past tense plural, -AN and -UN are taken as variants of -ON and labelled indicative; only -EN is labelled as subjunctive.

In the present plural (of non-preterite-present verbs), any vowel plus -N is labelled subjunctive. Sometimes this is dependent on the tense context, as the present subjunctive and past plural have the same root vowel (e.g., SCINEN, SCINON).

    &_CONJ hi_PRO^N wunodan_VBDI +d+ar_ADV^L ._.

    Ic_PRO^N bidde_VBP eow_PRO ,_, Leof_ADJ^N ,_, +t+at_C ge_PRO^N
    cyrron_VBPS to_P minum_PRO$^D huse_N^D ,_, &_CONJ +t+ar_ADV^L
    wunion_VBPS nihtlanges_ADV^T 

    &_CONJ hi_PRO^N scinon_VBPS on_P +d+are_D^G     <--- part of sequence of 
    heofenan_N^G f+astnysse_N^D                          present subjunctives
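The ending-based rules for plural forms can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `plural_mood` is hypothetical, the caller is assumed to know the tense already and to have excluded preterite-present verbs, and forms whose final vowel is a +-escaped letter (such as +a) are not handled.

```python
VOWELS = "aeiouy"


def plural_mood(form, tense):
    """Return 'I' (indicative), 'S' (subjunctive) or None for a plural form.

    Hypothetical helper: `tense` must be 'past' or 'present' and is
    supplied by the caller, since SCINON-type forms are ambiguous.
    """
    form = form.lower()
    if form[-3:-2] == "+":
        return None  # escaped letter before the ending; not handled here
    ending = form[-2:]
    if len(ending) < 2 or ending[1] != "n" or ending[0] not in VOWELS:
        return None  # no vowel + N ending, so these rules say nothing
    if tense == "past":
        if ending == "en":
            return "S"  # only -EN is subjunctive in the past plural
        if ending in ("on", "an", "un"):
            return "I"  # -AN and -UN count as variants of -ON
        return None
    if tense == "present":
        return "S"  # any vowel plus -N is subjunctive in the present plural
    raise ValueError("tense must be 'past' or 'present'")


print(plural_mood("wunodan", "past"))
print(plural_mood("scinon", "present"))
```

The same form (SCINON) receives different labels depending on the tense passed in, which mirrors the dependence on tense context noted above.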

The imperative/subjunctive ambiguity affects singular imperatives ending in -E and forms of some irregular verbs (DO, GA, BEO).

    &_CONJ ga_VBPH of_P +tissum_D^D men_N^D


Unlike in the PPCME2, verbal and adjectival use of the present and past participles is not distinguished; that is, they are both tagged VAG or VBN (or HAG/HVN, etc).

Only overt case is marked on participles that are part of the main verb sequence. Therefore the following forms do not have a case label in this context.

When used as modifiers or attributively, all participles except GEHATEN, GECIGEN, and other verbs used in naming are labelled with the case of the head they modify. Note that this is so even in those few cases in which the participle does not inflect appropriately.

    gedrince+d_VBPI to_P dryggum_N^D dreosendne_VAG^A <--- overt case
    welan_N^A ,_. and_CONJ +teah_ADV +t+as_D^G
    +tearfan_N^G ne_NEG bi+d_BEPI +turst_N^N
    aceled_VBN ._.                                    <--- no overt case

    Licgende_VAG beam_N^N l+asest_ADV                 <--- adjectival use
    growe+d_VBPI ._.

A form is considered a participle if it corresponds in its entirety to an actively used Old English verb (reference: Clark Hall's Concise Anglo-Saxon Dictionary, 4th edn.). This rules out:

These forms are tagged as adjectives, not participles, and they thus follow the general rules for the case-marking of adjectives and not that of participles given here.

Unlike past participles, present participles are frequently used as nouns. The policy in these cases is as uncontroversial as possible; in general, if a form is listed as a noun by Clark Hall, it is tagged as a noun; for example:

     godhergend        worshipper of God
     sweordwigend      warrior
     wi+derfeohtend    adversary
     ridend            rider
     ceasterbuend      citizen
     godfremmend       doer of good


Inflected infinitives

Infinitives with -NE added to the base form are labelled with the plain infinitive marker VB, HV, etc. plus dative case.

    M+al_N^N is_BEPI me_PRO^D to_TO feran_VB ;_.  <--- plain infinitive

    He_PRO^N bi+d_BEPI +tam_D^D yflum_ADJ^D
    egeslic_ADJ^N ond_CONJ grimlic_ADJ^N to_TO
    geseonne_VB^D ,_,                              <--- inflected infinitive
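As a minimal sketch of this rule, assuming the word is already known to be an infinitive (the function name `infinitive_tag` is hypothetical):

```python
def infinitive_tag(form, base_tag="VB"):
    """Tag an infinitive: forms ending in -NE get dative case (^D)."""
    # Assumption: `form` is already known to be an infinitive, so the
    # -NE ending alone is enough to identify the inflected form.
    if form.lower().endswith("ne"):
        return base_tag + "^D"
    return base_tag


print(infinitive_tag("feran"))     # plain infinitive from the example above
print(infinitive_tag("geseonne"))  # inflected infinitive from the example above
```

The `base_tag` parameter lets the same check be applied to HV, BE, and the other infinitive tags.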

Infinitive marker TO (TO)

TO used with an infinitive is tagged TO. It is followed by both plain and inflected infinitives.

Adverbs (ADV)

Classes of Adverbs

Adverbs are tagged ADV. Five classes of adverbs are distinguished:

Locative adverbs (ADV^L)

Locative adverbs indicate location and are usually used with stative verbs. Note that the same set of adverbs can also be tagged as contextual directional adverbs when used with a verb of motion. The following is a list of the most common locative adverbs; the list is not exhaustive.
+t+ar, be+aftan, bufan, feor, gehende, gehw+ar, her, innan, inne, neah, utan, ute, wi+dinnan, wi+dutan, nahw+ar
Note that forms including HW+AR, such as +AGHW+AR, GEHW+AR, are tagged ADV, not WADV, when they are used as indefinites, as with all wh-indefinites.

Negated forms such as NAHW+AR take the NEG prefix (NEG+ADV^L) in the usual way.

In +T+AR +T+AR sequences, both +T+ARs are tagged ADV^L.

    +T+ar_ADV^L comon_VBDI eac_ADV heora_PRO$ magas_N^N

    +t+at_C hi_PRO^A man_MAN^N begen_Q^A ofstunge_RP+VBDS 
    +t+ar_ADV^L +d+ar_ADV^L hi_PRO^N on_P gebedum_N^D

    +Tas_D^N +te_C her_ADV^L nu_ADV^T wepa+d_VBPI
    woldon_MDDI mid_P eow_PRO blissian_VB

    Se_D^N cniht_N^N wear+d_BEDI geancsumod_VBN
    and_CONJ wi+dinnan_ADV^L ablend_VBN +after_P +t+as_D^G m+adenes_N^G

Lexical directional adverbs (ADV^DX)

This set of adverbs is inherently directional and includes the following:
+tanon, +tider, gehwanon, gehwider, heonan, hider, hindan, -weard(es)
Any word ending in -WEARD(ES) (apart from prepositional use) is tagged as an adverb. This includes HAMWEARD(ES).

Indefinite wh-forms like GEHWIDER are tagged ADV not WADV as usual.

Note that +TANON can also be used temporally in which case it is tagged ADV^T.

    and_CONJ hi_PRO^N +tyder_ADV^DX comon_VBDI mid_P mycelre_Q^D
    sarnyssa_N^D +t+ar_ADV^D heora_PRO$ suna_N^N w+aron_BEDI geh+afte_VBN^N

    Hi_PRO^N feordon_VBDI +ta_ADV^T +tanon_ADV^DX
    fram_P +t+are_D^G scire_N^G bisceope_N^D ,_.

    He_PRO^N gegaderode_VBD +ta_ADV^T swi+de_ADV
    gode_ADJ^A wyrhtan_N^A gehwanon_ADV^DX ,_.

Contextual directional adverbs (ADV^D)

The set of adverbs used locatively can also be used directionally with verbs of motion. In this case the adverb is tagged ADV^D.

    and_CONJ +done_D^A cempan_N^A tihton_VBDI +t+at_C he_PRO^N faran_VB
    sceolde_MDD feor_ADV^D fram_P +d+are_D^D byrig_N^D ._.

    +Ta_ADV^T bletsode_VBD Maurus_NPR^N +tone_D^A
    mann_N^A feorran_ADV^D ,_.

    and_CONJ heton_VBDI me_PRO gan_VB for+d_RP o+d+t+at_P we_PRO^N
    becoman_VBDI +t+ar_ADV^D se_D^N cyning_N^N w+as_BEDI ._.

    and_CONJ hi_PRO^A tomiddes_ADV^D besceofan_VB ._.

Temporal adverbs (ADV^T)

Temporal adverbs are tagged ADV^T. Adverbs meaning primarily quickly but shading off into right away, immediately, such as SNELLICE, RECENE, +ADRE, are considered members of the other adverbs class and are tagged ADV. The following are the most common words tagged as temporal adverbs; the list is not exhaustive.
+afre, +aft(er), +ane(s), +ar (+aror, etc.), +t+arrihte, +ta, +tagyt, +tanon, +tonne, +triwa, a, beforan, ealneg, eft, gefyrn, geo, gyt, gyrsand+ag, heononfor+d, iu, lange, late, nu, nugyt, oft (oftor, etc.), si+d+dan, simble, sona, tod+ag(e), tuwa, n+afre
Note that when +T+ARRIHTE is spelled as two words, it is tagged with the PPCME2 numbering system +t+ar_ADV^T21 rihte_ADV^T22.

    Hi_PRO^N sceoldon_MDDI +ta_ADV^T underhnigan_RP+VB nacodum_ADJ^D 
    swurde_N^D ,_.

    and_CONJ het_VBDI hine_PRO^A symble_ADV^T beon_BE +atforan_P his_PRO$
    gesih+de_N ._.

    and_CONJ heora_PRO$ modor_N^N w+as_BEDI Martia_NPR^N gecyged_VBN ,_, 
    h+a+dena_ADJ^N +ta_ADV^T gyt_ADV^T ,_.

Other adverbs (ADV)

All other adverbs, including sentential and manner adverbs, are tagged ADV.

    D+aghwamlice_ADV he_PRO^N gefylde_VBD his_PRO$
    drihtnes_NPR^G +tenunge_N geornlice_ADV ,_.

    He_PRO^N lufode_VBD swa_ADV +teah_ADV +done_D^A
    halgan_ADJ^A w+ar_N^A ,_.

    Nis_NEG+BEPI na_NEG+ADV godes_NPR^G wunung_N^N
    on_P +dam_D^D gr+agum_ADJ^D stanum_N^D ,_, ne_NEG+CONJ on_P
    +arenum_ADJ^D wecgum_N^D ,_.
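The five adverb classes can be recovered from the extension after the caret. The sketch below is illustrative only (the name `adverb_class` is an assumption). It also shows why the base tag has to be checked first: on an adverb ^D means contextual directional, whereas on a noun or determiner the same extension marks dative case.

```python
ADVERB_CLASSES = {
    "L": "locative",
    "DX": "lexical directional",
    "D": "contextual directional",
    "T": "temporal",
    "": "other",
}


def adverb_class(tag):
    """Return the adverb class for an ADV tag, or None for any other tag."""
    base, _, extension = tag.partition("^")
    # Accept negated adverbs (NEG+ADV^T) but not compounds such as ADV+P.
    if base.split("+")[-1] != "ADV":
        return None
    # Strip the numbering used for split words (ADV^T21, ADV^T22).
    extension = extension.rstrip("0123456789")
    return ADVERB_CLASSES.get(extension)


for example in ("ADV^L", "ADV^DX", "ADV^D", "NEG+ADV^T", "ADV", "N^D"):
    print(example, adverb_class(example))
```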

Negative adverbs

Negative adverbs like NA, NAHW+AR, N+AFRE are tagged NEG+ADV following general principles for negative elements.

    ac_CONJ se_D^N +almihtiga_ADJ^N God_NPR^N eow_PRO n+afre_NEG+ADV^T
    ne_NEG forl+at_VBPI ,_, o+d_P +t+at_C ge_PRO^N gelogode_VBN^N beon_BEPS

    ac_CONJ he_PRO^N ne_NEG leofode_VBD na_NEG+ADV +ta_ADV^T ,_.

    ne_NEG+CONJ +tu_PRO^N ne_NEG +atstand_VBI nahwar_NEG+ADV^L on_P
    +disum_D^D earde_N^D ,_.

Adverbial quantifiers

Quantifiers used adverbially, whether indeclinable (MA, LYT) or case forms (MICCLUM, EALLES) are always tagged as quantifiers (Q). The inflected forms are also labelled for case.

Adverbial nouns

The origin of many adverbs is in the oblique cases of nouns (HWILUM, GEARA, UNWILLUM, etc.), and it is not clear at what point these cease to be nouns and become adverbs (if there is indeed any such "point"). Largely because of the difficulty of making such a division, all unmodified, single word nouns used adverbially are tagged as adverbs with the appropriate extension (T,L,DX,D) indicating function. This also applies to HAM which is tagged as a directional (or occasionally locative) adverb. Note that this does not apply to quantifiers and demonstratives used adverbially since there is no difficulty in identifying these categories.

Adverb vs. preposition

Many words function as both adverbs (ADV) and prepositions (P), the latter taking either a clausal complement (subordinating conjunctions) or an NP complement (prepositions proper).

    Hi_PRO^N hyne_PRO^A +ta_ADV^T
    +atb+aron_VBDI to_P brimes_N^G faro+de_N^D ,_,
    sw+ase_ADJ^N gesi+tas_N^N ,_, swa_P he_PRO^N
    selfa_ADJ^N b+ad_VBDI ,_, +tenden_P
    wordum_N^D weold_VBDI wine_N^N Scyldinga_NPR^G ;_.

    Swa_ADV mec_PRO^A gelome_ADV                        <--- SWA adverb
    la+dgeteonan_N^N +treatedon_VBDI +tearle_ADV ._.
    Ic_PRO^N him_PRO^D +tenode_VBD
    deoran_ADJ^D sweorde_N^D ,_, swa_P hit_PRO^N        <--- SWA preposition
    gedefe_ADJ^N w+as_BEDI ._.

    He_PRO^N w+as_BEDI leof_ADJ^N gode_NPR^D
    and_CONJ lifde_VBD her_ADV^L wintra_N^G 
    hundnigontig_NUM +ar_P he_PRO^N be_P wife_N^D       <--- +AR conjunction
    her_ADV^L $+turh_P gebedscipe_N^A bearn_N^A
    astrynde_VBD ;_.
    him_PRO^D +ta_ADV cenned_VBN wear+d_BEDI
    Cainan_NPR^N +arest_ADV^T eafora_N^N on_P           <--- +AR adverb
    e+dle_N^D ._.


Although NEAH, GEHENDE, and FEOR act in some ways like prepositions in that they appear to take dative complements, they are also modified by such adverbs as SWA, SWI+DE, etc. in a non-prepositional way. We have therefore tagged these three words as adverbs when they do not appear as part of an NP or are not overtly inflected (in which case they are tagged as adjectives). The default tagging for cases where the expected inflection is zero is to take them as adverbs unless they occur within an NP. This includes the copular case. NEAH and GEHENDE are labelled as locative adverbs (apart from the use of NEAH to mean nearly), while FEOR may be locative or directional.

    and_CONJ +tar_ADV^L +anig_Q^N +tingc_N^N neah_ADV^D  <--- directional
    ne_NEG cume_VBPS 

    ac_CONJ eac_ADV ealle_Q^N nytenu_N^N swy+de_ADV neah_ADV 
    forwurdon_VBDI ._.

    &_CONJ +ta_D^N Beormas_NPR^N spr+acon_VBDI neah_ADV an_NUM^A
    N+as_NEG+ADV hie_PRO^N +d+are_D^G
    fylle_N^G gefean_N^A h+afdon_HVDI ,_,
    manford+adlan_N^N ,_, +t+at_C hie_PRO^N me_PRO^A
    +tegon_VBDI ,_, symbel_N^A ymbs+aton_VBDI
    s+agrunde_N^D neah_ADV^L ;_.

    $Hafast_HVPI +tu_PRO^N gefered_VBN
    +t+at_C +de_PRO^A feor_ADV^L ond_CONJ neah_ADV^L
    ealne_Q^A wideferh+t_N^A weras_N^N ehtiga+d_VBPI
    ,_, efne_ADV swa_ADV side_ADV swa_P s+a_N^N
    $bebuge+d_VBPI ,_, windgeard_N^N ,_, weallas_N^A

    Is_BEP +tam_D^D dome_N^D neah_ADV^L +t+at_C      <--- locative with no 
    we_PRO^N gelice_ADV sceolon_MDPI leanum_N^D           overt inflection
    hleotan_VB ,_,  


GELICE is normally an adverb (or an inflected adjective), but it occurs in the subordinating construction "GELICE &" three times in Orosius, where GELICE is tagged as a preposition to make the subordinating nature of the construction clear.


+AREST is tagged ADJ when used as an ordinal number and ADV^T when used as a temporal adverb.

    Leoht_N^N w+as_BEDI +arest_ADV^T +turh_P <--- +arest ADV
    drihtnes_NPR^G word_N^A d+ag_NPR^N genemned_VBN ,_,
    wlitebeorhte_ADJ^N $gesceaft_N^N ._.
    Wel_ADV licode_VBD frean_NPR^D +at_P
    frym+de_N^D for+tb+aro_ADJ^N tid_N^N ,_,
    d+ag_N^N +aresta_ADJ^N ;_.               <--- +arest ADJ


OFER, TO, and FOR when they mean too, very in combination with adjectives or adverbs are labelled ADV.

    ofer_ADV f+at_ADJ^N
    to_ADV lange_ADV^T
    to_ADV god_ADJ^N 
    for_ADV wel_ADV 
    for_ADV oft_ADV^T


+TA can be ambiguous between an adverb, preposition and a determiner. In difficult cases the ambiguity is resolved as follows.


+T+AR is treated as a locative or directional adverb (tagged ADV^L and ADV^D respectively), even when it introduces a subordinate clause (LINK TO SYNTAX). Existential THERE is not distinguished.

    Eala_INTJ hu_WADV mycel_Q^N god_N^N is_BEPI
    and_CONJ hwylc_WADJ^N wynsumnys_N^N +d+ar_ADV^L +d+ar_ADV^L
    gebro+dru_N^N beo+d_BEPI on_P annysse_N ._.

    +Ta_ADV^T com_VBDI sum_Q^N wudewe_N^N ,_, +te_C
    w+as_BEDI anes_NUM^G martyres_N^G laf_N^N ,_, on_P +t+are_D^D
    ylcan_ADJ^D nihte_N^D ,_, +t+ar_ADV^L he_PRO^N l+ag_VBDI forwundod_VBN


TOD+AG(E), GIESTRAND+AG etc. are tagged ADV^T when written as one word. When TO D+AG(E) is written as two words, it is tagged as a prepositional phrase.

    tod+ag_ADV^T +tu_PRO^N bist_BEPI mid_P me_PRO on_P neorxnawange_N^D
    nu_ADV^T tod+ag_ADV^T he_PRO^N modega+d_VBPI ,_.

    &_CONJ giet_ADV^T tod+age_ADV^T is_BEPI ,_, for_P Romana_NPR^G
    bismere_N^D ._.

    swa_ADV swa_P Crist_NPR^N gyrstand+ag_ADV^T me_PRO cydde_VBD be_P +te_PRO

HAM

HAM is tagged as an adverb, either locative or directional depending on context.

    gewat_VBDI him_PRO^D ham_ADV^D +tonon_ADV^DX goldwine_N^N gumena_N^G

Prepositions with NP complements

Prepositions are tagged P.

        ofer_P ealle_Q^A gesceaft_N^A
        on_P +t+are_D^D upplican_ADJ^D +a+delan_ADJ^D ceastre_N^D
        +durh_P +d+at_D^A halige_ADJ^A triow_N^A

Prepositions with R-pronouns

Prepositions cliticized to R-pronouns are labelled ADV+P.


When separated they are tagged: +t+ar_ADV^L inne_P.

Prepositions with demonstratives (FOR+TI, FOR+TAN, etc.)

A preposition may be followed by a demonstrative (FOR +TI, FOR +TAN, FOR +TAT, IN +TAT, WI+T +TAN, etc.) either absolutely, or followed by a clause. In all cases the demonstrative is tagged D. If the preposition and demonstrative are cliticized, the unit is tagged P if it introduces a clause, ADV if it is used as a sentence adverb (FOR+TI), and P+D if used absolutely.

     for+ti_ADV ic_PRO^N cw+a+d_VBDI Godes_NPR^G word_N^A ,_, for+tan_P
     +te_C he_PRO^N on_P his_PRO$ godspelle_N^D cw+a+d_VBDI ,_,

     And_CONJ for+ti_ADV cw+a+t_VBDI se_D^N stemn_N^N clypigende_VAG 
     to_P Petre_NPR^D

     butan_P he_PRO^N nyde_N^D sceolde_MDD ,_, for+dan_P +te_C he_PRO^N 
     wiste_VBD hw+at_WPRO^N him_PRO^D gewitegod_VBN w+as_BEDI ,_,

     and_CONJ nes_NEG+BEDI se_D^N mann_N^N on_P +t+are_D^D scire_N^D 
     +te_C hi_PRO^A gesawe_VBDS +ar+tan_P+D^I ._.

     +t+ar_ADV^L +d+ar_ADV^L se_D^N god_N^N Baal_NPR^N +ar_ADV^T
     w+as_BEDI gewur+dod_VBN wolice_ADV o+d+t+at_P+D^A
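The three-way choice for cliticized preposition plus demonstrative units can be written out as a small decision function. This is a sketch only: the function name and the three `use` labels are assumptions made for illustration, and deciding which use applies still requires reading the clause.

```python
def clitic_prep_dem_tag(use, case=None):
    """Tag a cliticized preposition + demonstrative according to its use.

    `use` is one of the hypothetical labels 'clause', 'sentence_adverb'
    or 'absolute'; `case` is the case letter of the demonstrative, which
    is only relevant for absolute use (P+D^I, P+D^A in the examples).
    """
    if use == "clause":
        return "P"
    if use == "sentence_adverb":
        return "ADV"
    if use == "absolute":
        return "P+D" if case is None else "P+D^" + case
    raise ValueError("unknown use: " + repr(use))


print(clitic_prep_dem_tag("clause"))           # e.g. for+dan before +te
print(clitic_prep_dem_tag("sentence_adverb"))  # e.g. for+ti
print(clitic_prep_dem_tag("absolute", "I"))    # e.g. +ar+tan
```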

Prepositions and particles

When one of the list of adverbial particles precedes a preposition immediately, it is tagged as a particle, even in cases such as IN TO or UP ON which might be interpreted as cases of split prepositions.

    adune_RP to_P Sebastianes_NPR^G fotum_N^D
    ut_RP on_P s+a_N
    ut_RP to_P anum_NUM^D felda_N^D 
    up_RP to_P +t+are_D^D st+agre_N^D

Prepositions with clausal complements

Subordinating conjunctions are treated as prepositions taking a clausal complement and tagged P.

     +t+ar_ADV^L gecy+ded_VBN wear+d_BEDI +t+at_C
     halig_ADJ^N god_NPR^N helpe_N^A gefremede_VBD ,_,
     +da_P wear+d_BEDI gehyred_VBN heofoncyninges_N^G    <--- P+clause
     stefn_N^N wr+atlic_ADJ^N under_P wolcnum_N^D ,_,    <--- P+NP
     wordhleo+dres_N^G sweg_N^N m+ares_ADJ^G
     +teodnes_NPR^G ._.

     God_NPR^A sceal_MDPI mon_MAN^N +arest_ADV^T
     hergan_VB f+agre_ADV ,_, f+ader_NPR^A userne_PRO$^A
     ,_, for+ton_P +te_C he_PRO^N us_PRO^D +at_P         <--- P+C+clause
     frym+te_N^D geteode_VBD lif_N^A ond_CONJ
     l+anne_ADJ^A willan_N^A :_.

Note that GIF is tagged P when introducing an adverbial clause, but C when introducing an indirect question.

Modified conjunctions (SWA SWA, +TA +TA, etc.)

Modifying adverbs such as SWA, EALL, +TA, etc., which commonly appear before prepositions introducing adverbial clauses, can also occur written together with the preposition (SWASWA, EALLSWA, +TA+TA). In these cases the whole unit is labelled P.

     Swa_ADV swa_P d+agred_N^N todr+af+d_RP+VBPI
     +ta_D^A dimlican_ADJ^A +tystra_N^A ,_, and_CONJ manna_N^G eagan_N^A
     onlyht_RP+VBPI +te_C blinde_ADJ^N w+aron_BEDI on_P niht_N^A ,_, swa_ADV
     adr+afde_VBD +tin_PRO$^N lar_N^N +ta_D^A geleafleaste_N^A fram_P me_PRO

     And_CONJ him_PRO^D eallswa_ADV getimode_VBD swaswa_P +dam_D^D
     o+drum_ADJ^D flocce_N^D ,_, +t+at_C hi_PRO^N wurdon_BEDI
     forb+arnde_VBN^N mid_P brastligendum_VBN^D lige_N^D heofonlices_ADJ^G
     fyres_N^G f+arlice_ADV ealle_Q^N ._.

     +Ta_ADV^T +ta_P se_D^N sunu_N^N +t+at_D^A geseah_VBDI ,_, 
     +ta_ADV^T gesohte_VBD he_PRO^N +t+as_D^G preostes_N^G
     fet_N^A ,_.

     and_CONJ +ta_D^N witan_N^N heton_VBDI hine_PRO^A beheafdian_VB ,_,
     +ta+ta_P he_PRO^N ne_NEG mihte_MDD his_PRO$ mand+ada_N^A betellan_VB

SWA (preposition)

SWA introduces various kinds of adverbial and comparative clauses. It is labelled as a preposition in all cases except as the second SWA in free relatives of the SWA HW- SWA type where it is treated as the complementizer. SWA is also used as an adverb.

    and_CONJ +tu_PRO^N bist_BEPI swa_ADV hal_ADJ^N swa_P ic_PRO^N ._.

    So+dlice_ADV +alc_Q^N libbende_VAG nyten_N^N ,_, swa_ADV swa_P
    Adam_NPR^N hit_PRO^A gecygde_VBD ,_, swa_ADV is_BEPI his_PRO$ nama_N^N ._.

    &_CONJ beheledon_VBDI heora_PRO$ f+aderes_N^G gecynd_N^A ,_, swa_P
    +d+at_C hi_PRO^N ne_NEG gesawon_VBDI his_PRO$ n+acednysse_N ._.

    Abram_NPR^N +da_ADV^T ferde_VBD of_P Aran_NPR
    ,_, swa_ADV swa_P God_NPR^N him_PRO^D bead_VBDI ,_.


+TONNE meaning when or introducing comparative clauses is tagged P.

    and_CONJ wolde_MDD beon_BE fur+dor_ADJ^N on_P o+drum_ADJ^D earde_N^D 
    +tonne_P he_PRO^N on_P his_PRO$ agenum_ADJ^D w+are_BEDS

    Agnes_NPR^N him_PRO^D andwyrde_VBD ,_, Se_D^N
    +almihtiga_ADJ^N hera+d_VBPI swi+dor_ADV manna_N^G mod_N^A +tonne_P
    heora_PRO$ mycclan_Q ylde_N ,_.


In the sequence "PREPOSITION NP WEARD", WEARD is tagged P. When the preposition and WEARD are written together and precede the NP, the whole unit is tagged P.

        wi+d_P Rome_NPR weard_P
        to_P mynstre_N^D weard_P 
        toweard_P +t+am_D^D feo_N^D
        mynstre_N^D weard_P

+TY/+TE L+AS (+TE)

+TY/+TE L+AS (+TE), meaning unless and introducing a subordinate clause, is treated as follows.

        +ty_D^I l+as_P +te_C ...

        and_CONJ clypa_VBI to_P +tam_D^D godum_N^D ,_, +te_D^I l+as_P +de_C
        +tu_PRO^N +din_PRO$^A lif_N^A forl+ate_VBPS on_P iugo+de_N ._.

+TEAH+TE, O+T+TE etc.

One-word combinations of a subordinating conjunction and +TE introducing a subordinate clause, such as +TEAH+TE although and O+T+TE until, are separated manually and their two constituents are tagged in the usual way. A comment is left to indicate the separation.

        $+teah_P $+te_C {TEXT:+teah+te}_CODE
        $o+t_P $+te_C {TEXT:o+t+te}_CODE

Notice that this policy does not apply to +T+ATTE.


In addition to acting as a preposition until, O+T+T+AT can be used absolutely meaning until then, in which case it is tagged P+D^A.

        and_CONJ behwurfon_VBDI hire_PRO$ lic_N^A o+t+t+at_P heo_PRO^N
        bebyrged_VBN w+as_BEDI 

        _CODE Worhton_VBDI +ta_ADV^T anne_NUM^A
        gangtun_N^A ,_, +t+ar_ADV^L +d+ar_ADV^L se_D^N god_N^N 
        Baal_NPR^N +ar_ADV^T w+as_BEDI gewur+dod_VBN wolice_ADV 
        o+d+t+at_P+D^A ._.


When TOMIDDES, BETWEONUM, BESU+TAN, etc. are written as single words, they are tagged as prepositions when they take a complement and as adverbs when used absolutely. When written separately they are tagged literally according to their constituent parts, whether used absolutely or taking a complement.

    tomiddes_P +dam_D^D streame_N^D
    and_CONJ hi_PRO^A tomiddes_ADV^D besceofan_VB
    to_P middes_N^G
    be_P tweonum_NUM^D
    him_PRO^D betweonum_P 
    to_P middes_N^G +tam_D^D ise_N^D
    be_P su+tan_ADV +tam_D^D mu+tan_N^D
    on_P middan_ADJ
    to_P foran_P


SAM ... SAM meaning whether ... or is tagged P.

     hy_PRO^N gedo+d_VBPI +t+at_C o+ter_ADJ^N bi+d_BEPI
     oferfroren_RP+VBN ,_, sam_P hit_PRO^N sy_BEPS sumor_N^N sam_P


BUTON is always tagged P, even when it means but and seems to function as a coordinating conjunction. BUTON can also function as a focus particle.

    Seo_D^N Asia_NPR^N ,_, on_P +alce_Q^A healfe_N^A heo_PRO^N is_BEPI
    befangen_VBN mid_P sealtum_ADJ^D w+atre_N^D buton_P on_P easthealfe_N

    +Ta_ADV^T beag_VBDI +t+at_D^N land_N^N +t+ar_ADV^L eastryhte_ADV^D ,_,
    o+t+te_CONJ seo_D^N s+a_N^N in_RP on_P +d+at_D^A lond_N^A ,_, he_PRO^N
    nysse_NEG+VBD hw+a+der_WPRO^N buton_P he_PRO^N wisse_VBD +d+at_C
    he_PRO^N +d+ar_ADV^L bad_VBDI westanwindes_N^G &_CONJ hwon_Q^I 


GELICE is normally an adverb (or an inflected adjective), but it occurs in the subordinating construction "GELICE &" three times in Orosius, where GELICE is tagged P to make the subordinating nature of the construction clear.

    for_P +ton_D^I +te_C elpendes_N^G hyd_N^N wile_MDP drincan_VB w+atan_N ,_,
    gelice_P &_CONJ spynge_N^N de+d_VBPI ._.

    +t+at_C hie_PRO^A an_NUM^N cyning_N^N swa_ADV ie+delice_ADV forneah_ADV
    buton_P +alcon_Q^D gewinne_N^D on_P his_PRO$ geweald_N^A be+tridian_VB
    sceolde_MDD ,_, gelice_P &_CONJ hie_PRO^N him_PRO^D +teowiende_VAG
    w+aron_BEDI ,_,

Complementizers (C)

+T+AT, +T+ATTE and +TE introducing any kind of subordinate clause are tagged C.

    Swa_P +tu_PRO^N ,_, god_NPR^N of_P
    gode_NPR^D gearo_ADV acenned_VBN ,_, sunu_NPR^N
    so+tan_ADJ^G f+ader_NPR^G ,_, swegles_N^G in_P
    wuldre_N^D butan_P anginne_N^D +afre_ADV^T
    w+are_BEDS ,_, swa_ADV +tec_PRO^A nu_ADV^T for_P
    +tearfum_N^D +tin_PRO$^N agen_ADJ^N geweorc_N^N
    bide+d_VBPI +turh_P byldo_N^A ,_, +t+at_C +tu_PRO^N
    +ta_D^A beorhtan_ADJ^A us_PRO^D sunnan_N^A
    onsende_VBPS ,_, ond_CONJ +te_C sylf_N^N cyme_VBPS
    +t+at_C +du_PRO^N inleohte_VBPS +ta_D^A +te_C
    longe_ADV^T +ar_ADV^T ,_, +trosme_N^D be+teahte_VBN^N
    ond_CONJ in_P +teostrum_N^D her_ADV^L ,_,
    s+aton_VBDI sinneahtes_N^G ._.

+T+AT introducing a relative clause, however, is taken as the relative pronoun (and thus tagged as a determiner) unless this is impossible for reasons of number/gender/case, in which case it is tagged C.

Unlike +TEAH+TE and O+T+TE, it is not clear that +T+AT+TE is best analyzed as +T+AT plus complementizer +TE, since it is sometimes used as a determiner (EXAMPLE) and often introduces adverbial clauses [?? IS THIS TRUE ??], and thus it is not split but rather tagged as a unit. [?? THIS MAY CHANGE ??]

SWA in free relative clauses

SWA is tagged C in free relatives of the SWA HW- SWA type.

    Swa_ADV hwa_WPRO^N swa_C agyt_VBPI +d+as_D^G
    mannes_N^G blod_N^A ,_, his_PRO$ blod_N^N by+d_BEPI agoten_VBN ;_.

    on_P swa_ADV hwylcum_WADJ^D d+age_N^D swa_C +du_PRO^N etst_VBPI of_P
    +dam_D^D treowe_N^D ,_, +du_PRO^N scealt_MDPI dea+de_N^D sweltan_VB ._.

GIF in indirect questions

When GIF introduces indirect questions, it is tagged C.

     nu_ADV^T ic_PRO^N sceal_MDPI geseon_VB gif_C Crist_NPR^N
     +de_PRO geh+al+d_VBPI

     and_CONJ het_VBDI his_PRO$ cnapan_N +da_D^A hwile_N^A hawian_VB to_P
     +d+are_D s+a_N ,_, gif_C +anig_Q^N mist_N^N arise_VBDS of_P +dam_D^D
     mycclum_Q^D brymme_N^D ._.

The following are tagged CONJ, or NEG+CONJ if negative, when used as conjunctions:

+ag+ter, +te, ac, ge, na+ter, ne, ond, o+t+te, swa, &

When there is more than one conjunction, (e.g., +AG+TER GE...GE), all are tagged CONJ.

    &_CONJ leoht_N^N w+aar+d_BEDI geworht_VBN ._.

    sceawa_VBI hw+a+der_WQ hyt_PRO^N sy_BEPS +dines_PRO$^G suna_N^G
    +te_CONJ ne_NEG sy_BEPS ._.

    hw+ar_WADV^L m+ag_MDPI ic_PRO^N wysran_ADJ^A findan_VB +tonne_P
    +tu_PRO^N eart_BEPI ,_, o+t+te_CONJ fur+ton_ADV +tinne_PRO$^A
    gelican_N^A ?_.

    &_CONJ ge_PRO^N beo+d_BEPI +donne_ADV^T englum_N^D gelice_ADJ^N ,_,
    witende_VAG +ag+der_CONJ ge_CONJ god_N^A ge_CONJ yfel_N^A ._.

    Quintianus_NPR^N +ta_ADV^T cw+a+d_VBDI +t+at_C
    heo_PRO^N gecure_VBDS o+der_ADJ^A +d+ara_D^G ,_, swa_CONJ heo_PRO^N
    mid_P fordemdum_VBN^D dyslice_ADV forferde_VBD ,_, swa_CONJ heo_PRO^N
    +tam_D^D godum_N^D geoffrode_VBD ,_, swa_ADV swa_P +a+delboren_ADJ^N
    and_CONJ wis_ADJ^N ._.

+AG+TER/NA+TER are also tagged as quantifiers.

SWA is also tagged as preposition, adverb or complementizer.

NE is also tagged as sentential negation (NEG).

The negative particle NE is tagged NEG. Contractions of NE and verb forms, adverbs and quantifiers are tagged NEG+-.

When NE is used as a conjunction, it is tagged NEG+CONJ. In clauses with only one NE where NE could be a conjunction or negation, it is tagged as negation (NEG) if it immediately precedes the verb and as a conjunction (NEG+CONJ) if it does not.

Although NA sometimes seems to function as a second negative particle, it is always tagged as a negative adverb (NEG+ADV).

    For+d+am_ADV hiora_PRO^G
    n+anig_NEG+Q^N n+as_NEG+BEDI +ta_ADV^T gieta_ADV^T
    ,_. ne_NEG+CONJ hi_PRO^A ne_NEG gesawon_VBDI
    sundbuende_N^N ,_. ne_NEG+CONJ ymbutan_P hi_PRO^A
    awer_ADV^L ne_NEG herdon_VBDI ._.

    Nalles_NEG+Q^G wolcnu_N^N +da_ADV^T giet_ADV^T  
    ofer_P rumne_ADJ^A grund_N^A regnas_N^A b+aron_VBDI
    ,_, wann_ADJ^N mid_P winde_N^D ,_.

    gif_P he_PRO^N wyrsa_ADJ^N ne_NEG bi+d_BEPI ,_,
    ne_NEG wene_VBP ic_PRO^N his_PRO^G na_NEG+ADV
    beteran_ADJ^G ._.

Notice that forms starting with UN- are not tagged NEG+-.
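The rule for a clause containing a single NE can be sketched as a check on the tag of the following word. This is an illustration, not the procedure used to build the corpus: the function names are hypothetical, and the pattern for verb tags is assembled from the tag tables in the verb sections.

```python
import re

# Verb tag shapes from the verb tag tables, plus UTP (hypothetical helper).
VERB_TAG = re.compile(
    r"^(?:(?:MD|AX|BE|HV|VB)(?:PI|PS|PH|P|DI|DS|D|I|N)?|AXG|BAG|HAG|VAG|UTP)$"
)


def is_verb_tag(tag):
    """True if the tag (minus case and NEG+/RP+ prefixes) is a verb tag."""
    core = tag.split("^")[0].split("+")[-1]
    return VERB_TAG.match(core) is not None


def tag_single_ne(next_tag):
    """Tag the only NE of a clause from the tag of the word that follows it."""
    # Immediately before the verb: sentential negation; otherwise conjunction.
    return "NEG" if is_verb_tag(next_tag) else "NEG+CONJ"


print(tag_single_ne("VBP"))  # ne wene: negation
print(tag_single_ne("P"))    # ne on: conjunction
```

Clauses with more than one NE fall outside this rule and are not covered by the sketch.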

Adverbial particles are tagged RP. The following is an exhaustive list of all words tagged as particles. Note that many of these are tagged as prepositions when they take a complement NP or clause. Preceding another preposition, however, they are tagged as particles (see Prepositions and particles).

adun(e), +after, aweg, (of)dune, fore, for+d, fram, geond, in, mid, ni+der, of, ofer, ongean, on, onweg, to, +turh, under, up, ut, wi+d, wi+der, ymb(e).

When a particle is cliticized to the beginning of a verb, the unit is tagged RP+-.

    Fyrst_N^N for+d_RP gewat_VBDI ._.

    +tanon_ADV^D up_RP hra+de_ADV Wedera_NPR^G
    leode_N^N on_P wang_N^A stigon_VBDI ,_.

    folc_N^N to_RP s+agon_VBDI ,_, hatan_ADJ^D heolfre_N^D ._.

    da_ADV^T him_PRO^D Hro+tgar_NPR^N
    gewat_VBDI mid_P his_PRO$ h+ale+ta_N^G 
    gedryht_N^A ,_, eodur_N^N Scyldinga_NPR^G ,_,
    ut_RP of_P healle_N^D ;_.

    Heht_VBDI +da_ADV^T eorla_N^G hleo_N^N
    eahta_NUM^A mearas_N^A f+atedhleore_ADJ^A on_P
    flet_N^A teon_VB ,_, $in_RP under_P eoderas_N^A ._.
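
For readers who process the tagged text automatically, the following minimal sketch splits an item of the form word_TAG^CASE into its parts and picks out free particles (RP) and particle-verb units (RP+-). The names Token, parse_token and has_particle are hypothetical. The sketch assumes that the tag follows the last underscore and that "+" inside a tag separates the components of a compound tag.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    word: str             # text form, e.g. "for+d"
    parts: tuple          # tag components, e.g. ("RP", "VB")
    case: Optional[str]   # case label after "^", or None


def parse_token(token):
    """Split one word_TAG^CASE item into word, tag components and case."""
    word, tag = token.rsplit("_", 1)   # the tag follows the last underscore
    base, _, case = tag.partition("^")
    return Token(word, tuple(base.split("+")), case or None)


def has_particle(token):
    """True for free particles (RP) and particle-verb units (RP+...)."""
    return "RP" in parse_token(token).parts


line = "Fyrst_N^N for+d_RP gewat_VBDI ._."
print([t for t in line.split() if has_particle(t)])   # ['for+d_RP']
```

Splitting on the last underscore keeps the "+" characters that encode special letters inside the word separate from the "+" that joins tag components.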

ANA, the weak form of AN, is tagged as a focus particle, following Mitchell (1985: 536) who takes ANA as an indeclinable adverbial constituent which acts as a focus particle. All and only instances of ANA are tagged FP; other forms of AN meaning alone are tagged as numbers (NUM).

    +tu_PRO^N ana_FP canst_MDPI
    ealra_Q^G gehygdo_N^A ,_, meotud_NPR^N
    mancynnes_N^G ,_, mod_N^A in_P hre+dre_N^D ._.

    Ic_PRO^N to_P anum_NUM^D +te_PRO^D ,_,
    middangeardes_N^G weard_N^N ,_, mod_N^A
    sta+tolige_VBP ,_, f+aste_ADJ^A fyrh+dlufan_N^A ,_.

BUTAN/BUTE is tagged as a focus particle in the NE...BUTAN construction and in conjunction with numbers when it means only.

    W+as_BEDI +ta_ADV^T lencten_N^N agan_VBN butan_FP VI_NUM nihtum_N^D
    +ar_P sumeres_N^G cyme_N^D on_P Maias_NPR^G $kalend_N^A ._.

    +t+ar_ADV^L ic_PRO^N ne_NEG gehyrde_VBD butan_FP hlimman_VB s+a_N^A ,_,
    iscaldne_ADJ^A w+ag_N^A ._.

In general the INTJ tag is used only if a word has no other use than as an interjection. When words that also have other functions are used as interjections, they are still tagged with their primary POS tag, not with INTJ. Thus HW+AT is tagged WPRO (without case) when used as an interjection. Adverbs which are also used as interjections (EFNE, +TONNE, HURU, etc.) are always tagged as adverbs, since it is too difficult to distinguish adverbial from interjection use consistently. Finally, GE, although it has other functions as a pronoun and conjunction, is tagged INTJ in interjection function. The following words are tagged INTJ.

alleluia, amen, ge/gea/gyse, eala, la, nese, wa/wala/walawa, wella

    and_CONJ cw+a+d_VBDI to_P +tam_D^D cnihtum_N^D mid_P cenum_ADJ^D
    geleafan_N^D ,_, Eala_INTJ ge_PRO^N Godes_NPR^G cempan_N^N ,_, ge_PRO^N
    becomon_VBDI to_P sige_N^D ,_.


    &_CONJ eorringa_ADV +tus_ADV cw+a+d_VBDI :_.
    wala_INTJ wa_INTJ ._.

    Hw+at_WPRO^N is_BEPI +tis_D^N ,_, la_INTJ ,_,
    manna_N^G ,_, +te_C minne_PRO$^A eft_ADV^T
    +turh_P fyrngeflit_N^A folga+t_N^A wyrde+d_VBPI ,_,
    ice+d_VBPI ealdne_ADJ^A ni+d_N^A ,_, +ahta_N^A
    strude+d_VBPI ?_.

    Huru_ADV ,_, wyrd_N^N gescreaf_VBDI
    +t+at_C he_PRO^N swa_ADV geleaffull_ADJ^N ond_CONJ
    swa_ADV leof_ADJ^N gode_NPR^D in_P worldrice_N^D
    weor+dan_BE sceolde_MDD ,_, Criste_NPR^D
    gecweme_ADJ^N ._.

Everything (words, symbols, numbers, etc.) except punctuation in foreign language sequences is labelled FW.

Outside of foreign language sequences, foreign names (PAULINUS, etc.) are not tagged FW, but NPR.

Latin liturgical terms (PATER NOSTER, TE DEUM, etc.) are tagged FW, except when they follow English inflectional patterns, in which case they are tagged N.

    he_PRO^N cunne_MDPS pater_FW noster_FW

    Mid_P +tam_D^D paternostre_N^D 

Unknown or problematic words can be tagged XX. This tag is rarely used.

Splitting and joining parts of words

Words that are always treated as separate parts

There are two ways in which it may be indicated that we consider a single written sequence as two words.
  1. The parts are physically separated. This applies when the separation is necessary for the correct parse. The following combinations are always separated.

    1. subject pronouns cliticised to verb forms. In some cases separated forms are emended for clarity, e.g. SCEALTU will be separated and emended to SCEALT TU. Such emendations are always marked with the emendation sign ($) and a comment.
          $cymst_VBPI $tu_PRO^N {TEXT:cymstu}_CODE
          $flitst_VBPI $+du_PRO^N {TEXT:flits+du}_CODE 
    2. one-word combinations of a subordinating conjunction plus +TE introducing a subordinate clause, e.g. +TEAH+TE, O+T+TE.
          God_NPR^N nolde_NEG+MDD ofslean_RP+VB +tone_D^A
          scyldigan_ADJ^A Dauid_NPR^A ,_, $+teah_P $+de_C {TEXT:+teah+de}_CODE
          he_PRO^N syngode_VBD 
    3. SE+TE (or other relative pronoun) introducing relative clauses
          Sy_BEPS wuldor_N^N and_CONJ lof_N^N +dam_D^D welwillendan_ADJ^D Gode_NPR^D
          ,_, $se_D^N $+de_C {TEXT:se+de}_CODE wur+da+d_VBPI his_PRO$ halgan_N mid_P
          wuldre_N^D on_P ecnysse_N ._. 
          and_CONJ +ta_D^N h+a+denan_ADJ^N gelyfdon_VBDI on_P +ta_D^A leasan_ADJ^A
          godas_N^A ,_, $+ta_D^N $+de_C {TEXT:+ta+de}_CODE n+aron_NEG+BEDI
          godas_N^N ac_CONJ gramlice_ADJ^N deofle_N^N ._.
    4. a pronoun plus SELF
          and_CONJ ferde_VBD $him_PRO^D $sylf_ADJ^N {TEXT:himsylf}_CODE aweg_RP
          sorhful_ADJ^N on_P mode_N^D

  2. Compound tagging rather than physical separation is used in three cases:

    1. negation plus verb form (and likewise negation plus conjunction, adverb or quantifier, as marked in the example)
          noldon_NEG+MDDI +t+at_D^A                         <--- NEG+MD
          geryne_N^A rihte_ADV cy+dan_VB ,_,
          ne_NEG+CONJ hire_PRO^D andsware_N^A +anige_Q^A    <--- NEG+CONJ
          secgan_VB ,_, torngeni+dlan_N^N ,_, +t+as_D^G
          hio_PRO^N him_PRO^D to_P sohte_VBD ,_.
          ond_CONJ gewritu_N^A herwdon_VBDI ,_,
          f+adera_N^G lare_N^A ,_, n+afre_NEG+ADV^T         <--- NEG+ADV
          fur+dur_ADV +tonne_P nu_ADV^T ,_, +da_P
          ge_PRO^N blindnesse_N^G bote_N^A forsegon_VBDI ,_,
          ond_CONJ him_PRO^D n+anig_NEG+Q^N w+as_BEDI       <--- NEG+Q
          +al+arendra_N^G o+der_ADJ^N betera_ADJ^N
          under_P swegles_N^G hleo_N^A sy+d+tan_ADV^T
          +afre_ADV^T ,_,
          Ic_PRO^N ne_NEG can_MDPI +t+at_D^A                              
          ic_PRO^N nat_NEG+VBPI ,_,                         <--- NEG+VB
          Nis_NEG+BEPI +d+at_D^N f+ager_ADJ^N               <--- NEG+BE
          si+d_N^N ._.
      Note that quantifiers starting with NAT-, which derive historically from NAT ... I know not ..., are not considered negative.

    2. particle plus verb form. The verb is tagged according to its class: thus WI+THABBAN is tagged RP+HV, and ONGINNAN is tagged RP+AX when used as an auxiliary verb but RP+VB when used as a main verb.
         Nolde_NEG+MDD ic_PRO^N sweord_N^A beran_VB ,_,
         w+apen_N^A to_P wyrme_N^D ,_, gif_P ic_PRO^N
         wiste_VBD hu_WADV wi+d_P +dam_D^D agl+acean_N^D
         $elles_ADV meahte_MDD gylpe_N^D wi+dgripan_RP+VB ,_,    <-- RP+VB
         swa_P ic_PRO^N gio_ADV^T $wi+d_P
         Grendle_NPR^D dyde_VBD ._.
         Hw+at_INTJ +da_ADV^T la_INTJ ongunnon_RP+AXDI           <-- RP+AX
         +ta_D^N godes_NPR^G cempan_N^N hnexian_VB 
         +t+at_C him_PRO^D nan_NEG+Q^N s+a_N^N wi+thabban_RP+HV  <-- RP+HV
         ne_NEG mehte_MDD 
    3. a preposition plus demonstrative in absolute use is tagged P+D plus case; e.g., O+T+T+AT meaning until then, +AR+TAN before then, FOR+TON for that reason.
          and_CONJ nes_NEG+BEDI se_D^N mann_N^N on_P +t+are_D^D scire_N^D +te_C
          hi_PRO^A gesawe_VBDS +ar+tan_P+D^I ._.
          +t+ar_ADV^L +d+ar_ADV^L se_D^N god_N^N Baal_NPR^N +ar_ADV^T
          w+as_BEDI gewur+dod_VBN wolice_ADV o+d+t+at_P+D^A
          se_D^N +te_C +turhseah_RP+VBDI swa_ADV +tone_D^A
          preost_N^A for+don_P+D^I gesealdne_VBN^A deofle_N^D

Words that are sometimes treated as separate parts

Unlike the PPCME2, the York Corpus does not mark the possibly complex morphological structure of forms other than those discussed in the previous section. Phrases such as ON SUNDRUM, TO MIDDES, FOR +TAM +TE are tagged literally when written apart and taken as a whole when written as one word.

    to_P middes_N^G +tam_D^D ise_N^D
    tomiddes_P +tam_D^D mu+tan_N^D

However, when an orthographically independent word has no meaning outside a particular phrase (NATES in NATES HWON), or when the literal tagging of the parts is misleading (tagging +T+AR in +T+AR RIHTE as locative when the "word" +T+ARRIHTE is temporal), the PPCME2 numbering system is used to indicate that the parts belong together; the first number indicates the number of parts and the second number which part the tagged word is.

    nates_NEG+ADV21 hwon_NEG+ADV22  not at all     also spelt NATESHWON
    +t+ar_ADV^T21 rihte_ADV^T22     straightaway   also spelt +T+ARRIHTE

But note that the original phrase from which NATESHWON derives, NA TO +T+AS HWON, is tagged according to its constituent parts.

     na_NEG+ADV to_P +t+as_D^G HWON_Q^I
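
The numbering suffix described above (number of parts, then position of the part) can be decoded mechanically. The sketch below is an illustration under two assumptions: each number is a single digit, as in the examples, and the parts of a word are adjacent in the text. The names NUMBERED, split_numbering and join_parts are hypothetical.

```python
import re

# A tag followed by two digits: number of parts, then position of this part.
NUMBERED = re.compile(r"^(?P<tag>.*\D)(?P<total>[2-9])(?P<index>[1-9])$")


def split_numbering(tag):
    """Return (tag, total, index); total and index are None if unnumbered."""
    m = NUMBERED.match(tag)
    if not m:
        return tag, None, None
    return m.group("tag"), int(m.group("total")), int(m.group("index"))


def join_parts(tokens):
    """Merge numbered parts into one (word, tag) unit; keep other tokens."""
    out = []
    for tok in tokens:
        word, tag = tok.rsplit("_", 1)
        base, total, index = split_numbering(tag)
        if index is not None and index > 1 and out:
            prev_word, prev_tag = out[-1]
            out[-1] = (prev_word + word, prev_tag)   # attach to first part
        else:
            out.append((word, base))
    return out


print(join_parts("nates_NEG+ADV21 hwon_NEG+ADV22".split()))
# [('nateshwon', 'NEG+ADV')]
```

Joining the parts in this way yields the one-word spellings given in the examples (NATESHWON, +T+ARRIHTE).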


In general, noun-noun compounds are written as single words in edited Old English texts. Sometimes, however, this convention is not followed and the two parts are orthographically separated. In "true" compounds, the first part of the compound is not inflected (WINTER SETL, SU+T RIMAN, NOR+T S+A), and it therefore is not labelled for case; the case of the whole compound is indicated on the second element.

    winter_N setl_N^A
    +tam_D^D su+t_N riman_N^D
    +tan_D^I arcebiscop_N rice_N^I

We also treat as compounds, however, the names of places, whether or not the first part is inflected (e.g., non-inflected ELIG MYNSTER; inflected EGYPTA LOND, ROME BURH). The first part of such compounds is tagged NPR without case (even if the case is fairly obvious), while the second part is tagged N plus case, according to the usual rules.

    Elig_NPR mynstre_N^D
    Egypta_NPR lond_N^A
    ROME_NPR byrig_N^D

Also treated as compounds are the names of peoples like EAST ENGLE, etc. In this case both parts are tagged as proper nouns, but again only the second is labelled for case. Note that, unlike with the names of places, phrases such as ONGLE CYNNE are not treated as compounds when written separately.

    East_NPR Engle_NPR^N
    Mercna_NPR^G cynne_N^D

    Scotta_NPR^G cynnes_N^G
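
Because only the second element of an orthographically separated compound carries case, candidate compounds can be looked for with a simple heuristic: a caseless N or NPR directly followed by a case-marked N or NPR. The sketch below is a rough search aid only, with hypothetical names (parse, split_compounds). Caseless nouns also occur outside compounds in the corpus, so the results would need checking by hand.

```python
def parse(token):
    """Split word_TAG^CASE into (word, tag, case); case is None if absent."""
    word, tag = token.rsplit("_", 1)
    base, _, case = tag.partition("^")
    return word, base, case or None


def split_compounds(tokens):
    """Find adjacent pairs that look like orthographically split compounds.

    Heuristic: caseless N or NPR directly followed by case-marked N or NPR.
    Returns (first word, second word, case of the compound) tuples.
    """
    found = []
    for first, second in zip(tokens, tokens[1:]):
        w1, t1, c1 = parse(first)
        w2, t2, c2 = parse(second)
        if t1 in ("N", "NPR") and c1 is None and t2 in ("N", "NPR") and c2:
            found.append((w1, w2, c2))
    return found


print(split_compounds("Elig_NPR mynstre_N^D".split()))    # [('Elig', 'mynstre', 'D')]
print(split_compounds("Mercna_NPR^G cynne_N^D".split()))  # []
```

The second call returns nothing because MERCNA carries its own case label, which is how phrases of the ONGLE CYNNE type are kept apart from compounds.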