final punctuation . (period)
non-final punctuation , (comma)
Any punctuation that ends a token (periods, commas, semicolons, question marks, etc.) is tagged with a period. Note, however, that tokens can terminate without any punctuation. Any punctuation which does not coincide with the end of a token is tagged with a comma. A token consists of one main verb and its associated arguments and adjuncts. Conjoined subordinate clauses are included within the same token.
He_PRO^N folgode_VBD +tam_D^D kasere_N^D uncu+d_ADJ^N him_PRO^D swa_ADV +teah_ADV ,_, na_NEG+ADV swylce_P he_PRO^N ne_NEG dorste_MDD for_P his_PRO$ drihtne_NPR^D +drowian_VB ,_, ac_CONJ he_PRO^N wolde_MDD gehyrtan_VB +da_D^A +te_C se_D^N h+a+dena_ADJ^N casere_N^N d+aghwamlice_ADV acwealde_VBD for_P Cristes_NPR^G geleafan_N ._.
&_CONJ Drihten_NPR^N cw+a+d_VBDI to_P him_PRO^D :_, Hwi_WADV eart_BEPI +du_PRO^N yrre_ADJ^N ?_.
And_CONJ ealle_Q^N +ta_D^N hyredmenn_N^N hine_PRO^A h+afdon_HVDI for_P f+ader_N^A
Conjoined sentences and VPs are separated. Only main verbs clearly conjoined at the word level (as indicated by shared arguments) are kept together. In parsing, empty subjects are added to tokens lacking a subject due to elision under conjunction. For more detail see the PPCME2 rules for clausal conjunction.
Ac_CONJ +ta_D^N h+a+denan_N^N hyna+d_VBPI and_CONJ hergia+d_VBPI +ta_D^A Cristenan_N^A
He_PRO^N gesette_VBD hine_PRO^A to_P ealdre_N^D ofer_P an_NUM^A werod_N^A ,_. and_CONJ het_VBDI hine_PRO^A symble_ADV^T beon_BE +atforan_P his_PRO$ gesih+de_N ._.
Periods in the text which are not used as sentential punctuation, such as periods indicating abbreviation, surrounding numbers and certain words (e.g., .x. .Mon. etc.), are not separated from the word they belong to.
XII._FW KAL._FW DECEMBRES_FW ,_, PASSIO_FW SANCTI_FW EADMVNDI_FW REGIS_FW ET_FW MARTYRIS_FW ._.
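The punctuation rule above can be illustrated with a short sketch. It is not part of the corpus tooling: the names `Tagged`, `parse_item` and `split_tokens` are hypothetical, and the code assumes that every item has the shape word_TAG, optionally followed by ^ and a suffix such as a case letter, as in the examples in this manual. Because tokens can terminate without any punctuation, splitting on the period tag is only an approximation of the real token boundaries.

```python
from typing import List, NamedTuple, Optional


class Tagged(NamedTuple):
    word: str
    tag: str
    suffix: Optional[str]  # text after "^" (e.g. a case letter), if any


def parse_item(item: str) -> Tagged:
    """Split one 'word_TAG^X' item on its last underscore."""
    word, _, label = item.rpartition("_")
    tag, caret, suffix = label.partition("^")
    return Tagged(word, tag, suffix if caret else None)


def split_tokens(text: str) -> List[List[Tagged]]:
    """Close a token after every item whose tag is the period tag."""
    tokens: List[List[Tagged]] = []
    current: List[Tagged] = []
    for item in text.split():
        current.append(parse_item(item))
        if current[-1].tag == ".":
            tokens.append(current)
            current = []
    if current:  # a token may end without punctuation
        tokens.append(current)
    return tokens


example = (
    "He_PRO^N gesette_VBD hine_PRO^A to_P ealdre_N^D ofer_P an_NUM^A "
    "werod_N^A ,_. and_CONJ het_VBDI hine_PRO^A symble_ADV^T beon_BE "
    "+atforan_P his_PRO$ gesih+de_N ._."
)
tokens = split_tokens(example)
```

In this example the comma after werod is tagged with a period because it ends a token, so the sketch returns two tokens, one for each of the separated conjoined VPs.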
Unlike in the PPCME2 where case-marking on
nominal elements is largely non-existent and when present often unclear,
case is still fully productive in Old English. Case is dealt
with differently in the poetry and
prose parts of the corpus.
Case in Poetry
In the poetry corpus case is marked on inflecting words following, in
ambiguous cases, the decision of the editor of the edition. A few items
which belong to normally inflecting categories (quantifiers, numbers, etc.)
do not regularly inflect and are consequently not labelled for case. These
are:
Present and past participles are labelled for case when modifying or attributive. When acting as part of the main verb sequence, past participles are only case marked if the case marking is overt (i.e., non-zero) and present participles if the case-ending is other than -E. See Case marking on participles.
Case in Prose
While case is a fully productive category in Old English, many case forms
are formally ambiguous, and sometimes remain ambiguous even in
context. Our basic approach to indicating case in the prose corpus is to mark it when it is
clear, but not when it is ambiguous, or potentially ambiguous, tempered by
considerations of the effort involved and the needs of the system as a
whole.
The following parts of speech may be labelled for case:
In addition, the so-called "inflected infinitive" is labelled with dative case.
Certain items are never labelled for case. These are:
Other items with special rules for determining when to indicate case are the quantifier EALL when uninflected and the cardinal numbers AN, TWEGEN, +TRY in combination with other numbers.
Case is labelled on all case-inflecting words in the
following circumstances:
When case-marking is ambiguous in isolation, it is nevertheless marked in
the following circumstances:
Case on arguments of verbs and prepositions
Note particularly that words do not receive case from verbs or prepositions
in a straightforward way; that is, ambiguous case forms acting as
complements of verbs or prepositions are not generally labelled for case
based on the case-taking properties of the governing verb or
preposition. Thus, for instance, an acc/dat ambiguous complement of a
verb/preposition which normally takes the dative will not be
labelled dative, but rather left unmarked. The exception to this is that
dat/gen ambiguous complements of verbs/prepositions not listed as taking
genitive are assumed to be dative. This approach was adopted largely for
efficiency reasons, to avoid having to find reliable information on the
case requirement of every verb in the corpus. For consistency, the same
rules are applied to prepositions.
Case on participles
When participles are part of the main verb
sequence, they are only marked for case if the case is overt (i.e.,
non-zero) in the case of past participles, or not -E in the case of present
participles.
Eoforlic_N^N scionon_VBDI ofer_P hleorberan_N^D gehroden_VBN^N golde_N^D ,_, fah_ADJ^N ond_CONJ fyrheard_ADJ^N ;_. COBEOWUL
In the majority of cases modifying participles are appropriately case-marked, although there are a small number of exceptions (e.g., acc.sg. in zero rather than -NE). Thus, with one exception (see below), modifying and attributive participles are labelled with the case of the item they modify, whether the participle is appropriately case-marked or not.
Sy+d+dan_P +arest_ADV^T $wear+d_BEDI feasceaft_ADJ^N funden_VBN ,_, he_PRO^N +t+as_D^G frofre_N^A gebad_VBDI ,_. COBEOWUL
feower_NUM bearn_N^N for+d_RP gerimed_VBN^N COBEOWUL
The exceptional case is that of "naming" participles (GEHATEN, GECIGEN, etc.) which rarely if ever inflect in attributive use. These are therefore not labelled with case unless it is overt, in the same way as participles which are part of the main verb sequence.
se_D^N +de_C ealfela_Q ealdgesegena_N^G worn_N^A gemunde_VBD ,_, word_N^A o+ter_ADJ^A fand_VBDI so+de_ADV gebunden_VBN^A ;_. COBEOWUL
_CODE W+as_BEDI min_PRO$^N f+ader_N^N folcum_N^D gecy+ted_VBN ,_, +a+tele_ADJ^N ordfruma_N^N ,_, Ecg+teow_NPR^N haten_VBN ._. COBEOWUL
Case on left-dislocations
Left-dislocated NPs may be in the nominative case even when the resumptive
element is oblique. This means that in the case of a left-dislocated
nom/acc ambiguous NP with an accusative resumptive element, the ambiguity
cannot be resolved, and thus the left-dislocated NP is not labelled for
case. The following rules are applied for labelling case on left-dislocated
NPs.
Case-marking Flow Chart used by annotators.
PRO Pronoun
PRO$ Pronoun, possessive
MAN Indefinite MAN
All personal pronouns are labelled PRO with the exception of
indefinite MAN. Pronouns are tagged for case according to the case-marking rules. In the prose, 1st and 2nd person sg/pl
pronouns are acc/dat ambiguous and thus are not generally labelled for case
except in copular
constructions.
He_PRO^N gesette_VBD hine_PRO^A to_P ealdre_N^D ofer_P an_NUM^A werod_N^A ,_.
Eala_INTJ ge_PRO^N godes_NPR^G cempan_N^N ,_, ge_PRO^N becomon_VBDI to_P sige_N^D ,_.
+Tas_D^N +te_C her_ADV^L nu_ADV^T wepa+d_VBPI woldon_MDDI mid_P eow_PRO blissian_VB ,_, gif_P +t+at_D^N is_BEPI so+d_ADJ^N +t+at_D^A ic_PRO^N eow_PRO s+ade_VBD
Reflexive pronouns
Personal pronouns can be used as reflexives in Old English, but they are
not marked as such at the part-of-speech level, but rather in the parsing.
Ic_PRO^N me_PRO gebidde_VBP
Him_PRO^D +da_ADV Scyld_NPR^N gewat_VBDI to_P gesc+aphwile_N^D felahror_ADJ^N feran_VB on_P frean_NPR^G w+are_N^A ._. COBEOWUL
Forms of SELF are tagged ADJ. The occasional uses of pronoun-plus-SELF (HIMSELF, THEMSELVES, etc.) are split to facilitate parsing.
hi_PRO^N sylfe_ADJ^N
Hi_PRO^N hyne_PRO^A +ta_ADV^T +atb+aron_VBDI to_P brimes_N^G faro+de_N^D ,_, sw+ase_ADJ^N gesi+tas_N^N ,_, swa_P he_PRO^N selfa_ADJ^N b+ad_VBDI ,_, COBEOWUL
$him_PRO^D $selfum_ADJ^D {TEXT:himselfum}_CODE
Possessive pronouns
Tagged as possessive pronouns are MIN, +TIN, HIS, HIRE, UNCER, URE, INCER,
EOWER, HEORA. Of these, HIS, HIRE, HEORA are not tagged for case; the
other possessives are declined like adjectives and are therefore
consistently case-marked. Notice that all these forms can also be tagged
PRO^G if the use is clearly genitival rather than possessive.
Eadig_ADJ^N bi+d_BEPI se_D^N +te_C in_P his_PRO$ e+tle_N^D ge+tih+d_VBPI ,_. COEXETER5
URE and EOWER sometimes fail to agree with a following noun, in which case they are tagged simply PRO$ without case. For URE this applies to nouns in the masc/neut. dat/gen.sg., masc/fem/neut. gen/dat.pl. and masc. acc.sg; for EOWER, it applies to masc/fem/neut gen.pl. and to fem. acc/gen/dat.sg. In addition EOWRE is ambiguous for case with fem. non-nominatives in -E (unlike most adjectives in -E) since the gen/dat form EOWERRE is often simplified to EOWRE falling together with the acc.
A_ADV^T ic_PRO^N symles_ADV^T w+as_BEDI on_P wega_N^G gehwam_Q^D willan_N^G +tines_PRO$^G georn_ADJ^N on_P mode_N^D ;_. COANDREA
Ic_PRO^N his_PRO^G $bidan_VB ne_NEG <--- genitival use dear_MDPI ,_, re+tes_N^G on_P geruman_N^D ,_. CORIDDLE
W+as_BED hira_PRO^G Matheus_NPR^N sum_Q^N <--- genitival use ,_, se_D^N mid_P Iudeum_NPR^D ongan_RP+AXDI godspell_N^A +arest_ADV^T wordum_N^D writan_VB wundorcr+afte_N^D ._. COANDREA
ure_PRO$ goda_N^G (masc. gen.pl.)
ure_PRO$ lenctenlicum_ADJ^D f+astene_N^D (neut. dat.sg.)
ure_PRO$ un+tances_N^G (masc. gen.sg.)
eower_PRO$ wifa_N^G (neut. gen.pl.)
+durh_P eower_PRO$ hiwr+adene_N (fem. acc/dat/gen.sg.)
eowre_PRO$ gewitleaste_N (fem. acc/dat/gen.sg.)
Indefinite MAN (MAN)
Indefinite MAN is tagged MAN. It is always a subject so always
case-marked nominative.
Ac_CONJ +ta_D^N halgan_N^N tihton_VBDI +t+at_C man_MAN^N +ta_D^A ofnas_N^A ontende_RP+VBPS ,_. O+d+de_CONJ hi_PRO^N synd_BEPI st+anene_ADJ^G mid_P +tam_D^D +te_C man_MAN^N str+ata_N^A wyrc+d_VBPI ._.
Existential there
*difference*
Existential +T+AR is not distinguished from
locative +T+AR in the York Corpus as in the PPCME2; +T+AR is always treated as
a locative adverb.
Singular, plural, and collective common nouns (N)
*difference*
Unlike in the PPCME2,
in the York Corpus no distinction is made between singular and plural
nouns.
Singular, plural and collective nouns are all tagged N. All common nouns are tagged for case according to the case-marking rules.
+Ta_ADV^T w+aron_BEDI twegen_NUM^N gebro+dra_N^N +a+telborene_VBN^N for_P worulde_N ,_, Marcus_NPR^N and_CONJ Marcellianus_NPR^N ,_, mycclum_Q^D geswencte_VBN^N on_P bendum_N^D and_CONJ on_P swingelum_N^D for_P +dam_D^D so+tan_ADJ^D geleafan_N^D ._.
Possessives and genitives
*difference*
Unlike in the PPCME2, in the York Corpus
no distinction is made between genitive and possessive nouns. All genitive
nouns have a case tag ^G; the $ tag is only used to
distinguish possessive pronouns (PRO$).
Compass points
*difference*
Compass points do not seem to be used nominally in Old English (as in
"She lived in the east"), and so, unlike in the PPCME2, compass points are tagged as adverbs.
As parts of compound names (EAST ENGLA, etc.), however, compass points are tagged NPR.
Adverbial use of nouns
Nouns in oblique cases used adverbially are tagged as adverbs.
Names of people
All personal names are tagged NPR.
The word SANCTA/SANCTE/SANCTUS used in conjunction with a proper name is tagged NPR, but other (native) words possibly used as titles are not. SANCTA/SANCTE/SANCTUS is not case-marked since it does not inflect according to a native pattern (or reliably at all).
sanctus_NPR Paulus_NPR^N
Sancte_NPR Dunstan_NPR^N
Sancta_NPR Maria_NPR^N
+A+telstan_NPR^N cyning_N^N
Two-part names like EAST ENGLA, NOR+T SEAXE, etc. when written as separate words are treated as compounds. Thus the first part is not tagged for case.
East_NPR Engle_NPR^N
Nor+t_NPR Walas_NPR^N
Middel_NPR Seaxe_NPR^N
Ald_NPR Seaxe_NPR^N
Epithets of people or peoples are not tagged NPR.
Sceotta_NPR^G leoda_N^N and_CONJ scipflotan_N^N <--- 'pirate host_Vikings' f+age_ADJ^N feollan_VBDI ,_, COBRUNAN
However, compounds containing a proper noun are tagged NPR.
Freond_N^N $onsegon_VBDI la+dum_ADJ^D eagan_N^D landmanna_N^G cyme_N^A ._. <--- 'landlubbers_Egyptians' COEXODUS
Heht_VBDI +ta_ADV^T onlice_ADV +a+delinga_N^G hleo_N^N ,_, <--- multiple epithets beorna_N^G beaggifa_N^N ,_, swa_P he_PRO^N +t+at_D^A beacen_N^A geseah_VBDI ,_, heria_N^G hildfruma_N^N ,_, +t+at_C him_PRO^D on_P heofonum_N^D +ar_ADV^T geiewed_VBN wear+d_BEDI ,_, ofstum_N^D myclum_Q^D ,_, Constantinus_NPR^N ,_, Cristes_NPR^G rode_N^D ,_, tireadig_ADJ^N cyning_N^N ,_, tacen_N^A gewyrcan_VB ._. COCYNEW2
Gardene_NPR spear-Danes
Hringdene_NPR ring-Danes
Arscyldingas_NPR honour-Scildings
In the poetry, nouns used as names in a particular context are tagged NPR.
Is_BEPI +t+at_D^N deor_N^N pandher_NPR^N bi_P <--- 'Panther' noman_N^D haten_VBN ,_, COEXETER7
Adjectives corresponding to proper nouns are tagged ADJ, even when used substantively.
Nama_N^N w+as_BEDI gecyrred_VBN beornes_N^G in_P burgum_N^D on_P +t+at_D^A betere_ADJ^A for+d_RP ,_, +a_NPR^N h+alendes_NPR^G <--- 'Saviour's Revelation ._. COCYNEW2
Leoht_N^N w+as_BEDI +arest_ADV^T +turh_P drihtnes_NPR^G word_N^A d+ag_NPR^N genemned_VBN ,_, <--- 'Day' wlitebeorhte_ADJ^N $gesceaft_N^N ._. COGENESI
Scittisc_ADJ^N the Scottish
cristenra_ADJ^G cwen_N^N queen of the Christians
Ebreisce_ADJ^N +a_N^N Hebrew law
Names of places
Names consisting of a name plus a common noun (Rome burh, Elig
mynster) are treated as compounds but only the
name is tagged NPR; the common noun is tagged N. The case of the
first part of the compound is often difficult to determine with
certainty. It is generally either uninflected (i.e., the same as the
nominative singular form) or a possible genitive, singular or plural. We
have tagged all these cases as compounds, regardless of the presence or
absence of an identifiable case on the first element. The name is therefore
tagged only NPR with no case indicated; case is indicated on the
common noun if appropriate according to the case-marking
rules. Note that this only applies to place names and not to other potentially similar cases
(as, for instance, the names of peoples NOR+TUMBRA CYNNE).
Dinges_NPR mere_N^N
Elig_NPR mynstre_N^D
Rome_NPR byrig_N^D
Egypta_NPR lond_N^A
Days of the week, months, and religious festivals/seasons
The names of the days of the week and months of the year are tagged as
proper nouns. Religious seasons, such as Lent, and festivals, such as
Easter, are also tagged NPR. Massdays (HLAFM+ASS, CANDELM+ASS, etc.) and DOMES
D+AG are not tagged NPR.
Eastron_NPR
Sunnand+agum_NPR^D
LENCTEN can be either a noun or an adjective. When preceding a noun (e.g., LENCTENES F+ASTENES) it is tagged as an adjective, otherwise as a noun.
lenctenes_ADJ^G f+astenes_N^G
+tam_D^D halgan_ADJ^D lenctene_NPR^D
KALEND is not treated as a proper noun in phrases like MAIAS KALEND.
Maias_NPR^G kalend_N^N the month of May
In conjunction with Latin-inflected month names KALEND- is tagged FW.
vi_FW Kalend+a_FW Novembris_FW
iii_FW Kalend+a_FW IUNII_FW
Laden_NPR on_P Englisc_ADJ^A
Names of God
The following are taken as names of God: GOD, DRIHTEN, JESUS, CRIST. All
other ways of referring to God (H+ALEND, SCYPPEND, HALIG GAST, F+A+DER,
SUNU, etc.) are considered to be epithets and are tagged as common nouns.
*difference*
Note that in YCOE comparative and superlative
adjectives are not distinguished from positives as they are in
the PPCME2. The same tag
is used for all forms.
Positive, comparative and superlative adjectives are labelled ADJ. Adjectives are tagged for case according to the case-marking rules.
Gif_P him_PRO^D wan_ADJ^N fore_P <--- positive adjective wolcen_N^N hanga+d_VBPI ,_, ne_NEG m+agen_MDPS hi_PRO^N swa_ADV leohtne_ADJ^A leoman_N^A ansendan_VB ,_, +ar_P se_D^N +ticca_ADJ^N mist_N^N +tynra_ADJ^N weor+de_BEPS ._. <--- comparative adjective COMETBOE
Weak adjective/noun ambiguity
*Applies to prose only*
When an adjective has a corresponding weak noun associated with it
(e.g. HALIG/HALGA, CRISTEN/CRISTENA) many cases following a determiner are
ambiguous between a noun and an adjective reading. The default tagging in
these cases is that if the word precedes a noun it is tagged as an
adjective, but otherwise as a noun. Only cases with a weak noun form listed
in the dictionary fall under this rule; e.g. SEOCA is always an adjective
because there is no noun SEOCA.
+done_D^A halgan_ADJ^A w+ar_N^A
+ta_D^A halgan_N^A
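The default for weak adjective/noun ambiguity can be sketched as a small function. This is an illustration only, under stated assumptions: `WEAK_NOUN_FORMS` is a hypothetical stand-in for the dictionary lookup, `tag_after_determiner` is an invented name, and counting a following proper noun (NPR) as a noun is an assumption suggested by examples such as se_D^N halga_ADJ^N Petrus_NPR^N.

```python
from typing import Optional, Set

# Hypothetical stand-in for the dictionary: forms that have a listed weak noun.
WEAK_NOUN_FORMS: Set[str] = {"halga", "halgan", "cristena", "cristenan"}


def tag_after_determiner(
    form: str,
    next_tag: Optional[str],
    weak_noun_forms: Set[str] = WEAK_NOUN_FORMS,
) -> str:
    """Default tag (ADJ or N) for a weak form that follows a determiner."""
    if form.lower() not in weak_noun_forms:
        # No corresponding weak noun in the dictionary (e.g. SEOCA).
        return "ADJ"
    # Strip any case suffix from the following tag before comparing.
    next_base = next_tag.split("^")[0] if next_tag else None
    # Assumption: proper nouns count as nouns for this rule.
    if next_base in {"N", "NPR"}:
        return "ADJ"
    return "N"
```

With these assumptions, `tag_after_determiner("halgan", "N^A")` gives ADJ and `tag_after_determiner("halgan", None)` gives N, matching the two readings of HALGAN shown in the example for this section.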
Ordinal numbers
Ordinal numbers are tagged ADJ.
+ta_ADV^T com_VBDI ofer_P foldan_N^A fus_ADJ^N si+dian_VB m+are_ADJ^N mergen_N^N +tridda_ADJ^N ._. COGENESI
+AREST may also be tagged ADV^T when used as a temporal adverb.
Adjectival use of quantifiers (MICEL and LYTEL)
MICEL and LYTEL and their comparative forms are tagged as quantifiers even when they clearly mean 'large' or
'small'.
*difference*
Note that this is slightly different from the PPCME2 where in some
cases, notably following a determiner and in copular constructions, these
words are tagged as adjectives.
NEAH (adjective)
NEAH is only tagged ADJ when it is overtly inflected or clearly
part of a noun phrase. In all other cases it is tagged as an adverb. Thus most cases of NEAH will be tagged
ADV, even those in which, although not overtly inflected, it
might be taken as agreeing with a masculine or neuter singular noun.
sumre_Q^G neah_ADJ^G cyrican_N^G
+tam_D^D neah_ADJ^D wuda_N^D
SWELC and +TYLLIC
SWELC and +TYLLIC are tagged as adjectives. SWELCE may also be an adverb or preposition.
mid_P swylcum_ADJ^D frofre_N^D beo_BEPS +tin_PRO$^N wif_N^N swylc_ADJ^N swa_P Uenus_NPR^N
SELF
Forms of SELF are always tagged as adjectives.
ta_ADV^T adrencte_VBD he_PRO^N hiene_PRO^A selfne_ADJ^A on_P +tam_D^D ge_PRO^N sylfa_ADJ^N moton_MDPI mid_P him_PRO^D +afre_ADV^T wunian_VB
NUM + WINTRE/GEARE (adjective)
Adjectives ending in WINTRE/GEARE (ANWINTRE, TWELFWINTRE), meaning 'x years
old', are tagged ADJ when written as single words. When the
numbers are written separately, WINTRE is still tagged ADJ and the
other parts of the phrase are tagged literally.
twelfwintre_ADJ^N xviii_NUM wintre_ADJ^N fif_NUM &_CONJ sixtigwintre_ADJ^N an_NUM and_CONJ twentig_NUM geare_ADJ^N
With two exceptions, the words on the following list are tagged
Q in all functions (modifying, absolute, adverbial, etc.). The
exceptions are the wh-indefinites (HWA,
HWILC, etc.) and +AG+TER, NA+TOR.
wiht (and derivatives na(n)wiht, naht, na(n)wuht), +alc, +anig (and derivative n+anig), begen, butu, eall, feawe, fela, hwa (and derivatives nathwa, +aghwa, +athwa, gehwa, hw+atwugu, nateshwan), hw+a+ter (and derivatives +aghw+a+ter, +ag+ter, gehw+a+ter, nahw+a+ter, na+tor), hwilc (and derivatives nathwilc, +aghwilc, gehwilc, (ge)welhwilc, hwilcwugu), lyt, lytel (and derivative unlytel), ma, manig, micel, sum
monegum_Q^D m+ag+tum_N^D +aghwylc_Q^N +tara_D^G ymbsittendra_N^G madma_N^G fela_Q
Wh-indefinites
Note that the hw-words (HWA, HWILC, etc.) are also used as wh-words in questions, where
they are not tagged Q, but WPRO, WADJ, etc.
Negative quantifiers
Negative derivatives starting with N- are tagged NEG+Q; but note that
quantifiers starting with NAT- (e.g., NATHWA, NATHW+AT, etc., from
'I know not who/what', etc.) are not negative.
+AG+TER, NA+TOR (quantifiers)
+AG+TER and NA+TOR are also used as conjunctions, in which case they are
tagged CONJ and NEG+CONJ, respectively.
MICEL and LYTEL
Note especially that (UN)LYTEL and MICEL are consistently tagged Q, even
when their meaning is more adjectival than quantificational, i.e. when
LYTEL is better interpreted as 'small' and MICEL as
'great'. Distinguishing between the two readings can be quite
difficult, especially with plural nouns, and we have not attempted to do
so, despite the fact that it creates infelicitous readings in some cases.
for_P +tam_D^D mycclan_Q^D gewynne_N^D on_P anre_NUM^D lytlan_Q^D byrig_N^D
Undeclinable quantifiers FELA, LYT, MA
The undeclinable quantifiers FELA, LYT and MA are not tagged for case.
t+at_C +ter_ADV^L ne_NEG mihte_MDD na_NEG+ADV ma_Q muneca_N^G wunian_VB and_CONJ +t+ara_D^G ma+dma_N^G ne_NEG rohte_VBD +te_D^I ma_Q +te_C reocendes_VAG^G meoxes_N^G ._.
Se_D^N feond_N^N h+afde_HVD him_PRO^D mid_P fela_Q o+dre_ADJ^A sceoccan_N^A ,_.
+t+at_C heo_PRO^N heora_PRO$ deadra_ADJ^G to_ADV lyt_Q h+afden_HVDS
A quantifier used in conjunction with a nom/acc/gen plural ambiguous NP can often be taken as either a head with a genitive complement or a modifier of a nom/acc head. In these cases we apply the default rule that undeclinable quantifiers (FELA, LYT, MA) take a genitive complement, while all other quantifiers are taken as modifiers. This follows the majority pattern although there are clear examples of the other pattern for both types.
fela_Q suna_N^G
manige_Q^N suna_N^N
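The default rule for quantifiers with a nom/acc/gen plural ambiguous NP can be written out as a sketch. The function name and its parameters are hypothetical; in particular `clause_case` stands for the nom/acc reading ("N" or "A") that an annotator would already have settled for the modifier pattern.

```python
# Undeclinable quantifiers named in the rule; they are not tagged for case.
UNDECLINABLE = {"FELA", "LYT", "MA"}


def tag_ambiguous_np(quantifier: str, lemma: str, noun: str, clause_case: str) -> str:
    """Tag 'quantifier + ambiguous plural noun' according to the default rule."""
    if lemma.upper() in UNDECLINABLE:
        # Head quantifier with a genitive complement.
        return f"{quantifier}_Q {noun}_N^G"
    # Any other quantifier is taken as a modifier agreeing with the head noun.
    return f"{quantifier}_Q^{clause_case} {noun}_N^{clause_case}"
```

Calling `tag_ambiguous_np("fela", "fela", "suna", "N")` and `tag_ambiguous_np("manige", "manig", "suna", "N")` gives the two taggings shown in the example for this section.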
Uninflected EALL
EALL with no overt inflection is treated as follows. When EALL immediately
precedes an NP with which it potentially agrees (masc/neut. nom.sg. etc.)
it is tagged with the same case as the words of the NP. In all other
positions it is tagged only Q with no case. This includes cases
when it follows a NP with which it potentially agrees.
eall_Q^N +t+as_D^G cyninges_N^G r+ad_N^N all_Q^A woruld+ding_N^A +t+at_D^A mynster_N^A eall_Q and_CONJ fleow_VBDI eall_Q blode_N^D ._. +After_P +tysum_D^D worde_N^D he_PRO^N wear+d_BEDI eall_Q geh+aled_VBN ,_.
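A minimal sketch of the treatment of uninflected EALL follows. The function and parameter names are hypothetical, and the two boolean inputs stand for judgments an annotator makes from the context.

```python
from typing import Optional


def tag_uninflected_eall(
    immediately_precedes_np: bool,
    may_agree_with_np: bool,
    np_case: Optional[str],
) -> str:
    """Return the tag for EALL with no overt inflection."""
    if immediately_precedes_np and may_agree_with_np and np_case:
        return f"Q^{np_case}"  # same case as the words of the NP
    return "Q"  # every other position, including after an agreeing NP
```

For instance, EALL directly before an agreeing nominative NP comes out as Q^N, while EALL following the NP it might agree with is tagged only Q.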
All forms of SE and +TES are tagged D plus appropriate case tag,
according to the case-marking rules. This
includes when used alone and forms of SE used as relative pronouns.
+d+am_D^D eafera_N^N w+as_BEDI +after_ADV^T cenned_VBN ,_, geong_ADJ^N in_P geardum_N^D ,_, +tone_D^A god_NPR^N sende_VBD folce_N^D to_P <--- relative clause frofre_N^D ;_. COBEOWUL
Gewat_AXDI him_PRO^D +ta_ADV^T Andreas_NPR^N inn_RP on_P ceastre_N^A gl+admod_ADJ^N gangan_VB ,_, to_P +t+as_D^G +de_C <--- 'to where' he_PRO^N gramra_ADJ^G gemot_N^A ,_, COANDREA
Indefinite AN
Although AN can sometimes be interpreted as an indefinite determiner, it is
always tagged as a cardinal
number.
Disambiguating +TA
In some cases, +TA is ambiguous between an adverb or
preposition introducing a clause
and a determiner. In difficult cases the ambiguity is resolved as follows.
hine_PRO^A fyrwyt_N^N br+ac_VBDI modgehygdum_N^D ,_, hw+at_WPRO^N +ta_D^N men_N^N w+aron_BEDI ._. COBEOWUL
Aledon_VBDI +ta_D^N leofne_ADJ^A +teoden_N^A ,_, beaga_N^G bryttan_N^A ,_, on_P bearm_N^A scipes_N^G ,_, m+arne_ADJ^A be_P m+aste_N^D ._. COBEOWUL
+T+AT
Likewise, +T+AT can be ambiguous between a determiner functioning as
a wh-word and a complementizer
in relative-clause constructions. By default, +T+AT is
treated as a determiner in these cases if it matches the
antecedent in gender and number, and as a complementizer otherwise.
Wulfgar_NPR^N ma+telode_VBD +t+at_D^N w+as_BEDI Wendla_NPR^G leod_N^N ;_. COBEOWUL
_CODE &_CONJ +ta_ADV^T swi+de_ADV ra+te_ADV +after_P +t+am_D^D ,_, swa_P +ta_D^N o+tre_ADJ^N ham_ADV^D comon_VBDI ,_, +ta_ADV^T fundon_VBDI hie_PRO^N o+tre_ADJ^A flocrade_N^A ,_, +t+at_C rad_VBDI ut_RP wi+d_P Lygtunes_NPR^G ,_.
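The default for +T+AT in relative-clause constructions amounts to an agreement check, sketched here. The function name and the gender and number labels are hypothetical, and the sketch assumes that +T+AT is the neuter singular form of SE, so that "matching the antecedent" means a neuter singular antecedent. It covers only the relative-clause default, not other uses of +T+AT.

```python
def tag_relative_thaet(antecedent_gender: str, antecedent_number: str) -> str:
    """Return 'D' if +T+AT matches the antecedent in gender and number, else 'C'."""
    # Assumption: +T+AT itself is neuter singular.
    matches = antecedent_gender == "neut" and antecedent_number == "sg"
    return "D" if matches else "C"
```

A feminine or plural antecedent therefore yields the complementizer tag C, and a neuter singular antecedent yields D, to which the case tag is then added according to the case-marking rules.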
All cardinal numbers except those in foreign
language sequences are tagged NUM, whether they are written
out or in number form. Roman numerals on their own do not count as foreign,
only in conjunction with other foreign words.
libro_FW 5=o=_FW ,_, capitulo_FW 24=o=_FW ._.
*difference*
Note that AN in YCOE does not have a special tag (ONE) as in
the PPCME2.
Case on numbers
Numbers up to three are inflected and tagged for case; all others are only
tagged for case when it is overt. Numbers up to three as part of larger
numbers (e.g., TWEGEN HUND) are only tagged for case if case is
overt.
Git_PRO^N on_P w+ateres_N^G +aht_N^A seofon_NUM niht_N^A swuncon_VBDI ;_. <--- no overt case COBEOWUL
XVna_NUM^G sum_Q^N sundwudu_N^A sohte_VBD ;_. <--- overt case COBEOWUL
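As a rough sketch of the rule for case on numbers, the following hypothetical helper decides whether a cardinal number is eligible for a case tag. Whether case is overt is an input supplied by the annotator, and the general case-marking rules still decide which case, if any, is finally assigned.

```python
def number_may_get_case(
    value: int,
    overt_case: bool,
    part_of_larger_number: bool = False,
) -> bool:
    """Return True if a cardinal number is eligible for a case tag."""
    if overt_case:
        return True  # overt case is always tagged
    # Without overt case, only free-standing numbers up to three qualify.
    return value <= 3 and not part_of_larger_number
```

Under this sketch seofon (7, no overt case) is not eligible, XVna (15, overt case) is, and a number up to three inside a larger number such as TWEGEN HUND is eligible only if its case is overt.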
Weak ANA
The weak form ANA is tagged as a focus
particle.
BE TWEONUM
While BETWEONUM is tagged as a preposition
(P), BE ... TWEONUM is tagged be_P ... tweonum_NUM^D.
monig_Q^N oft_ADV gecw+a+d_VBDI +t+atte_C su+d_ADV^L ne_NEG+CONJ nor+d_ADV^L be_P s+am_N^D tweonum_NUM^D ofer_P eormengrund_N^A o+ter_ADJ^N n+anig_NEG+Q^N under_P swegles_N^G begong_N^A selra_ADJ^N n+are_NEG+BEDS rondh+abbendra_N^G ,_, rices_N^G wyr+dra_ADJ^N ._. COBEOWUL
BU TU
While BUTU is tagged as a quantifier (Q), BU ... TU is tagged Q ... NUM.
+da_ADV^T gen_ADV^T ic_PRO^N gecr+afte_VBD +t+at_C se_D^N cempa_N^N ongon_RP+AXDI waldend_NPR^A wundian_VB ,_, weorud_N^N to_RP segon_VBDI +t+at_C +t+ar_ADV^L blod_N^N ond_CONJ w+ater_N^N bu_Q^N tu_NUM +atg+adre_ADV eor+tan_N^A sohtun_VBDI ._. COCYNEW3
Wh-pronoun (WPRO)
Tagged WPRO are forms of HWA/HW+AT heading a wh-NP, and
HW+A+DER meaning 'which of ...'.
*difference*
The genitive/possessive form of HWA (HW+AS) is tagged WPRO^G in
the YCOE, rather than as a possessive (WPRO$ as in the PPCME2).
Ic_PRO^N sceal_MDPI hra+de_ADV cunnan_MD hw+at_WPRO^A +du_PRO^N us_PRO^D to_P $dugu+dum_N^D gedon_VB wille_MDPS ._. COANDREA
Woldon_MDDI cunnian_VB hw+a+der_WQ cwice_ADJ^N lifdon_VBDI +ta_D^N +te_C on_P carcerne_N^D clommum_N^D f+aste_ADV hleoleasan_ADJ^A wic_N^A hwile_N^A wunedon_VBDI ,_, hwylcne_WPRO^A hie_PRO^N to_P +ate_N^D +arest_ADV^T mihton_MDDI +after_P fyrstmearce_N^G feores_N^G ber+adan_VB ._. COANDREA
+Ta_P he_PRO^N hie_PRO^A ascade_VBD his_PRO$ $godas_N^A hw+a+ter_WPRO^N heora_PRO^G sceolde_MDD on_P o+trum_ADJ^D sige_N habban_HV ,_, +te_CONJ he_PRO^N on_P Romanum_NPR^D ,_, +te_CONJ Romane_NPR^N on_P him_PRO^D ,_, +ta_ADV^T ondwyrdon_VBDI hie_PRO^N him_PRO^D tweolice_ADV ,_.
and_CONJ axode_VBD +tone_D^A halgan_N^A +turh_P hw+as_WPRO^G mihte_N he_PRO^N gefremode_VBD +ta_D^A wundorlican_ADJ^A tacna_N^A ,_, +t+at_C swa_ADV micel_Q^N werod_N^N him_PRO^D folgode_VBD ._.
Wh-adjective (WADJ)
HWILC is tagged WADJ in the YCOE in all cases; i.e., whether it
modifies a noun or not (as with other adjectives such as SWILC, O+TER,
etc.).
*difference*
WHICH is tagged as a wh-determiner (WD) in
the PPCME2.
befran_VBDI for_P hwylcum_WADJ^D intingan_N^D hi_PRO^N hine_PRO^A axodon_VBDI ._. We_PRO^N moton_MDPI nu_ADV^T secgan_VB swutellicor_ADV be_P +dysum_D^D ,_, hwylce_WADJ^N mettas_N^N w+aron_BEDI mannum_N^D forbodene_VBN^N on_P +d+are_D^D ealdan_ADJ^D +a_N^D &_CONJ cw+a+d_VBDI ,_, hwylc_WADJ^N is_BEPI min_PRO$^N modor_N^N &_CONJ mine_PRO$^N gebro+tru_N^N ?_.
Wh-adverb (WADV)
Hw-adverbs (HU, HWONNE, HW+AR and HWI) are tagged WADV both in
direct questions and introducing a wh-clause. Note that +TA and +TONNE are
tagged P when introducing a subordinate clause (see Subordinating conjunctions) and
+T+AR is always tagged as a locative adverb
even when acting as a relative pronoun.
Hu_WADV +tearf_MDPI mannes_N^G sunu_N^N <--- direct question maran_Q^A treowe_N^A ?_. COEXODUS
+da_ADV^T w+as_BEDI forma_ADJ^N si+d_N^N +t+at_C hine_PRO^A weroda_N^G god_NPR^N wordum_N^D n+agde_NEG+VBD ,_, +t+ar_ADV^L he_PRO^N him_PRO^D ges+agde_VBD so+dwundra_N^G fela_Q ,_, hu_WADV +tas_D^A woruld_N^A worhte_VBD witig_ADJ^N <--- wh-clause drihten_NPR^N ,_, eor+dan_N^G ymbhwyrft_N^A and_CONJ uprodor_N^A ,_, gesette_VBD sigerice_N^A ,_, COEXODUS
HW+A+TER (WQ)
When introducing a WHETHER question, HW+A+DER is tagged WQ; when
it acts as a wh-pronoun meaning 'which of two', it is tagged WPRO.
swa_P +d+at_C hit_PRO^N n+as_NEG+BEDI gesene_ADJ^N hwe+der_WQ he_PRO^N seoc_ADJ^N w+are_BEDS and_CONJ axodon_VBDI +at_P +tam_D^D hiwum_N^D hw+a+der_WQ se_D^N halga_ADJ^N Petrus_NPR^N +t+ar_ADV^L wununge_N h+afde_HVD ,_.
Gebide_VBI ge_PRO^N on_P beorge_N^D byrnum_N^D werede_VBN^N ,_, secgas_N^N on_P searwum_N^D ,_, hw+a+der_WPRO^N sel_ADV m+age_MDPS +after_P w+alr+ase_N^D wunde_N^A gedygan_VB uncer_PRO^G twega_NUM^G ._. COBEOWUL
GIF in indirect questions is tagged as a complementizer.
Differences from the PPCME2
Verbs are treated slightly differently in the YCOE from the PPCME2. The
main differences are:
MD infinitive
MDI imperative
MDPI present indicative
MDPS present subjunctive
MDP present tense (ambiguous subjunctive/indicative)
MDPH present tense (ambiguous subjunctive/imperative)
MDDI past indicative
MDDS past subjunctive
MDD past tense (ambiguous subjunctive/indicative)
The following verbs are always tagged as modals, whether used with an
infinitive or independently. "Modal" meanings are given first, followed by
independent meanings.
The verb AGAN, however, is only tagged as a modal when it is used with an
infinitive, meaning 'have to' or 'ought to'.
$Ic_PRO^N +te_PRO^D m+ag_MDPI gesecgan_VB +t+at_C +tu_PRO^N +tec_PRO^A sylfne_ADJ^A ne_NEG +tearft_MDPI swi+tor_ADV swencan_VB ._. COCYNEW3
Ic_PRO^N sceal_MDPI hra+de_ADV cunnan_MD <--- independent use of cunnan hw+at_WPRO^A +du_PRO^N us_PRO^D to_P $dugu+dum_N^D gedon_VB wille_MDPS ._. COANDREA
Gif_P him_PRO^D arlice_ADV esne_N^N +tena+d_VBPI ,_, se_D^N +te_C agan_VB sceal_MDPI <--- AGAN on_P +tam_D^D si+dfate_N^D ,_, CORIDDLE
*difference*
Note that "modal" is a lexical category in the YCOE
(apart from AGAN), unlike in the PPCME2, where modals used
independently are tagged as lexical verbs.
AX infinitive
AXI imperative
AXPI present indicative
AXPS present subjunctive
AXP present tense (ambiguous subjunctive/indicative)
AXPH present tense (ambiguous subjunctive/imperative)
AXDI past indicative
AXDS past subjunctive
AXD past tense (ambiguous subjunctive/indicative)
AXG present participle
AXN past participle
The following verbs may be tagged as auxiliaries. Unlike the modals, these verbs are only tagged AX when used with a bare infinitive, or (marginally) with a participle.
W+as_BEDI hira_PRO^G Matheus_NPR^N sum_Q^N ,_, se_D^N mid_P Iudeum_NPR^D ongan_RP+AXDI godspell_N^A +arest_ADV wordum_N^D writan_VB wundorcr+afte_N^D ._. COANDREA
Gewat_AXDI +da_ADV neosian_VB ,_, <--- auxiliary use of gewat sy+t+dan_P niht_N^N becom_VBDI ,_, hean_ADJ^G <--- main verb use of becom huses_N^G ,_, hu_WADV hit_PRO^A Hringdene_NPR^N +after_P beor+tege_N^D gebun_VBN h+afdon_HVDI ._. COBEOWUL
Utan (UTP)
Forms of the verb UTAN, historically derived from WITAN 'to go' and
used to introduce imperative or hortatory clauses ('let us ...',
'come ...'), are tagged UTP. This verb is not used in the subjunctive
or in the past tense.
Uton_UTP nu_ADV^T brucan_VB +tisses_D^G undernmetes_N^G uton_UTP wyrcean_VB him_PRO^D sumne_Q^A fultum_N^A to_P his_PRO$ gelicnysse_N ._.
BE infinitive
BEI imperative
BEPI present indicative
BEPS present subjunctive
BEP present tense (ambiguous subjunctive/indicative)
BEPH present tense (ambiguous subjunctive/imperative)
BEDI past indicative
BEDS past subjunctive
BED past tense (ambiguous subjunctive/indicative)
BAG present participle
BEN past participle

HV infinitive
HVI imperative
HVPI present indicative
HVPS present subjunctive
HVP present tense (ambiguous subjunctive/indicative)
HVPH present tense (ambiguous subjunctive/imperative)
HVDI past indicative
HVDS past subjunctive
HVD past tense (ambiguous subjunctive/indicative)
HAG present participle
HVN past participle
All forms of BEON, WESAN, and (GE)WEOR+DAN are labelled with BE tags regardless of meaning. Forms of WEOR+DAN are often used in passive constructions next to BEON, WESAN. Forms of GEWEOR+DAN are more often used independently; nevertheless, they are always tagged as BE.
Egyptum_NPR^D wear+d_BEDI +t+as_D^G <--- WEOR+DAN d+agweorces_N^G deop_ADJ^N lean_N^N gesceod_VBN COEXODUS
Ic_PRO^N +turh_P Iudas_NPR^A +ar_ADV^T hyhtful_ADJ^N gewear+d_BEDI ,_. ond_CONJ nu_ADV^T <--- GEWEOR+DAN gehyined_VBN eom_BEPI <--- BEON, WESAN COCYNEW2
All forms of HABBAN and GEHABBAN are tagged HV.
*difference*
Note that DO is tagged as a lexical verb in the
YCOE and not given a special tag as in the PPCME2.
VB infinitive
VBI imperative
VBPI present indicative
VBPS present subjunctive
VBP present tense (ambiguous subjunctive/indicative)
VBPH present tense (ambiguous subjunctive/imperative)
VBDI past indicative
VBDS past subjunctive
VBD past tense (ambiguous subjunctive/indicative)
VAG present participle
VBN past participle
All lexical verbs are given tags beginning with VB.
Notice that an infinitive following TO can be tagged as a dative form VB^D (see Inflected infinitives).
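The verb tag tables for MD, AX, BE, HV and VB share one set of form suffixes, with the participle tags as the only irregular labels. The sketch below decodes a verb tag on that basis. It is illustrative only: `describe_verb_tag` is a hypothetical helper, UTP is deliberately left out, and taking the last component of a combined tag such as RP+AXDI or NEG+BEDI is an assumption based on the examples in this manual.

```python
from typing import Dict, Optional, Tuple

FORMS: Dict[str, str] = {
    "": "infinitive",
    "I": "imperative",
    "PI": "present indicative",
    "PS": "present subjunctive",
    "P": "present tense (ambiguous subjunctive/indicative)",
    "PH": "present tense (ambiguous subjunctive/imperative)",
    "DI": "past indicative",
    "DS": "past subjunctive",
    "D": "past tense (ambiguous subjunctive/indicative)",
}

# Participle tags are listed explicitly because VAG, BAG and HAG are irregular.
PARTICIPLES: Dict[str, Tuple[str, str]] = {
    "AXG": ("AX", "present participle"), "AXN": ("AX", "past participle"),
    "BAG": ("BE", "present participle"), "BEN": ("BE", "past participle"),
    "HAG": ("HV", "present participle"), "HVN": ("HV", "past participle"),
    "VAG": ("VB", "present participle"), "VBN": ("VB", "past participle"),
}

CLASSES = ("MD", "AX", "BE", "HV", "VB")


def describe_verb_tag(tag: str) -> Optional[Tuple[str, str]]:
    """Return (verb class, form) for a verb tag, or None for any other tag."""
    # Drop a case suffix (VBN^N) and any prefixed element (RP+, NEG+).
    base = tag.split("^")[0].split("+")[-1]
    if base in PARTICIPLES:
        return PARTICIPLES[base]
    for verb_class in CLASSES:
        suffix = base[len(verb_class):]
        if base.startswith(verb_class) and suffix in FORMS:
            return verb_class, FORMS[suffix]
    return None
```

For example, `describe_verb_tag("RP+AXDI")` returns the AX class with the past indicative form, and a non-verb tag such as N^D returns None.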
Modals used as main verbs
Modal verbs are never tagged as main verbs, except AGAN (see Modal verbs).
Mood
*difference*
Unlike in the PPCME2, in the YCOE unambiguous
subjunctive verb forms are distinguished from unambiguous indicative forms
for all verbs except UTAN.
The following four categories are distinguished:
Mood (indicative or subjunctive) is labelled on verbs based on the form of
the verb itself and not on context. Only unambiguous forms are labelled;
ambiguous forms (e.g., past tense of 3rd sg. weak verbs in -EDE/-ODE) are
unmarked.
The following forms are always ambiguous:
&_CONJ ic_PRO^N worige_VBP
&_CONJ he_PRO^N wunode_VBD flyma_N^N on_P +dam_D^D eastd+ale_N^D
Do_VBI swa_P +tu_PRO^N spr+ace_VBD
&_CONJ weaxe_VBP ge_PRO^N
In the past tense plural, -AN and -UN are taken as variants of -ON and labelled indicative; only -EN is labelled as subjunctive.
In the present plural (of non-preterite-present verbs), any vowel plus -N is labelled subjunctive. Sometimes this is dependent on the tense context, as the present subjunctive and past plural have the same root vowel (e.g., SCINEN, SCINON).
&_CONJ hi_PRO^N wunodan_VBDI +d+ar_ADV^L ._.
Ic_PRO^N bidde_VBP eow_PRO ,_, Leof_ADJ^N ,_, +t+at_C ge_PRO^N cyrron_VBPS to_P minum_PRO$^D huse_N^D ,_, &_CONJ +t+ar_ADV^L wunion_VBPS nihtlanges_ADV^T
&_CONJ hi_PRO^N scinon_VBPS on_P +d+are_D^G heofenan_N^G f+astnysse_N^D    <--- part of sequence of present subjunctives
The imperative/subjunctive ambiguity affects singular imperatives ending in -E and forms of some irregular verbs (DO, GA, BEO).
&_CONJ ga_VBPH of_P +tissum_D^D men_N^D
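The two plural-ending rules given above can be sketched as a pair of small functions. This is only an illustration under stated assumptions: the function names are hypothetical, the caller is assumed to know already that the form is plural, which tense it is, and that the verb is not preterite-present, and forms are given in the corpus transliteration.

```python
VOWELS = "aeiouy"


def past_plural_mood(form):
    """Mood label for a past plural form, judged from its ending alone."""
    form = form.lower()
    if form.endswith("en"):
        return "subjunctive"          # only -EN is labelled subjunctive
    if form.endswith(("on", "an", "un")):
        return "indicative"           # -AN and -UN count as variants of -ON
    return None                       # ending not covered by this rule


def present_plural_mood(form):
    """Mood label for a present plural form of a non-preterite-present verb."""
    form = form.lower()
    if len(form) >= 2 and form[-1] == "n" and form[-2] in VOWELS:
        return "subjunctive"          # any vowel plus -N
    return None                       # e.g. indicative plurals in -A+D
```

The form SCINON shows why the tense context matters: past_plural_mood("scinon") gives "indicative", whereas present_plural_mood("scinon") gives "subjunctive".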
Participles
*difference*
Unlike in the PPCME2, verbal and adjectival use of
the present and past participles is not distinguished; that is, they are
both tagged VAG or VBN (or HAG/HVN, etc).
Only overt case is marked on participles
that are part of the main verb sequence. Therefore the following forms do
not have a case label in this context.
gedrince+d_VBPI to_P dryggum_N^D dreosendne_VAG^A    <--- overt case
welan_N^A ,_.
and_CONJ +teah_ADV +t+as_D^G +tearfan_N^G ne_NEG bi+d_BEPI +turst_N^N aceled_VBN ._.    <--- no overt case
COMETBOE
Licgende_VAG beam_N^N l+asest_ADV    <--- adjectival use
growe+d_VBPI ._.
COEXETER5
A form is considered a participle if it corresponds in its entirety to an actively used Old English verb (reference: Clark Hall's Concise Anglo-Saxon Dictionary, 4th edn.). This rules out:
anboren 'only-born'; handlocen 'linked by hand'; earmsceapen 'unfortunate, miserable'; goldhroden 'gold-adorned'; woruldwunigende 'dwelling on the earth'; manfremmende 'sinning'; wi+derhycgende 'hostile'; umborwesende 'as a child'
unwrecen 'unavenged'; unwunded 'unwounded'; unweaxen 'young'; unoferswi+ded 'invincible'; unlifigende 'dead'; unbyrnende 'without burning'; unfricgende 'unquestioning'; unswiciende 'unswerving, loyal'
getyd 'skilled'; +apled 'shaped like an apple'; gedwolen 'misled, perverse'; hilted 'hilted'
These forms are tagged as adjectives, not
participles, and they thus follow the general rules for the case-marking of adjectives and not that of
participles given here.
Unlike past participles, present participles are frequently used as nouns. The policy in these cases is as uncontroversial as possible; in general, if a form is listed as a noun by Clark Hall, it is tagged as a noun; for example:
godhergend 'worshipper of God'; sweordwigend 'warrior'; wi+derfeohtend 'adversary'; ridend 'rider'; ceasterbuend 'citizen'; godfremmend 'doer of good'
Infinitives
Inflected infinitives
Infinitives with -NE added to the base form are labelled with the plain
infinitive marker VB, HV, etc. plus dative case.
M+al_N^N is_BEPI me_PRO^D to_TO feran_VB ;_.    <--- plain infinitive
COBEOWUL
He_PRO^N bi+d_BEPI +tam_D^D yflum_ADJ^D egeslic_ADJ^N ond_CONJ grimlic_ADJ^N to_TO geseonne_VB^D ,_,    <--- inflected infinitive
COCHRIST3
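As a small sketch of the rule for inflected infinitives, the choice between the plain tag and the tag with dative case can be made from the ending of the form. The function name infinitive_tag is hypothetical, and the sketch assumes the form is already known to be an infinitive.

```python
def infinitive_tag(form, base="VB"):
    """Tag an infinitive: base tag alone, or base tag plus ^D if it ends in -NE.

    `base` is the plain infinitive marker (VB, HV or BE).
    """
    return base + "^D" if form.lower().endswith("ne") else base
```

With the two examples above, infinitive_tag("feran") gives VB and infinitive_tag("geseonne") gives VB^D.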
Infinitive marker TO (TO)
TO used with an infinitive is tagged TO. It is followed by both
plain and inflected infinitives.
Classes of Adverbs
Adverbs are tagged ADV. Five classes of adverbs are distinguished:
Locative adverbs (ADV^L)
Locative adverbs indicate location and are usually used with stative
verbs. Note that the same set of adverbs can also be tagged as contextual directional adverbs when
used with a verb of motion. The following is a list of the most common
locative adverbs; the list is not exhaustive.
+t+ar, be+aftan, bufan, feor, gehende, gehw+ar, her, innan, inne, neah, utan, ute, wi+dinnan, wi+dutan, nahw+ar
Note that forms including HW+AR, such as +AGHW+AR and GEHW+AR, are tagged ADV, not WADV, when they are used as indefinites, as with all wh-indefinites.
Negated forms such as NAHW+AR take the NEG prefix (NEG+ADV^L) in the usual way.
In +T+AR +T+AR sequences, both +T+ARs are tagged ADV^L.
+T+ar_ADV^L comon_VBDI eac_ADV heora_PRO$ magas_N^N
+t+at_C hi_PRO^A man_MAN^N begen_Q^A ofstunge_RP+VBDS +t+ar_ADV^L +d+ar_ADV^L hi_PRO^N on_P gebedum_N^D stodon_VBDI
+Tas_D^N +te_C her_ADV^L nu_ADV^T wepa+d_VBPI woldon_MDDI mid_P eow_PRO blissian_VB
Se_D^N cniht_N^N wear+d_BEDI geancsumod_VBN and_CONJ wi+dinnan_ADV^L ablend_VBN +after_P +t+as_D^G m+adenes_N^G spr+ace_N
Lexical directional adverbs (ADV^DX)
This set of adverbs is inherently directional and includes the following:
+tanon, +tider, gehwanon, gehwider, heonan, hider, hindan, -weard(es)
Any word ending in -WEARD(ES) (apart from prepositional use) is tagged as an adverb. This includes HAMWEARD(ES).
Indefinite wh-forms like GEHWIDER are, as usual, tagged ADV rather than WADV.
Note that +TANON can also be used temporally, in which case it is tagged ADV^T.
and_CONJ hi_PRO^N +tyder_ADV^DX comon_VBDI mid_P mycelre_Q^D sarnyssa_N^D +t+ar_ADV^D heora_PRO$ suna_N^N w+aron_BEDI geh+afte_VBN^N ,_. Hi_PRO^N feordon_VBDI +ta_ADV^T +tanon_ADV^DX fram_P +t+are_D^G scire_N^G bisceope_N^D ,_. He_PRO^N gegaderode_VBD +ta_ADV^T swi+de_ADV gode_ADJ^A wyrhtan_N^A gehwanon_ADV^DX ,_.
Contextual directional adverbs (ADV^D)
The set of adverbs used locatively can also be used
directionally with verbs of motion. In this case the adverb is tagged
ADV^D.
and_CONJ +done_D^A cempan_N^A tihton_VBDI +t+at_C he_PRO^N faran_VB sceolde_MDD feor_ADV^D fram_P +d+are_D^D byrig_N^D ._. +Ta_ADV^T bletsode_VBD Maurus_NPR^N +tone_D^A mann_N^A feorran_ADV^D ,_. and_CONJ heton_VBDI me_PRO gan_VB for+d_RP o+d+t+at_P we_PRO^N becoman_VBDI +t+ar_ADV^D se_D^N cyning_N^N w+as_BEDI ._. and_CONJ hi_PRO^A tomiddes_ADV^D besceofan_VB ._.
Temporal adverbs (ADV^T)
Temporal adverbs are tagged ADV^T. Adverbs meaning primarily
'quickly' but shading off into 'right away' or 'immediately', such as
SNELLICE, RECENE, and +ADRE, are considered members of the other adverbs class and are tagged ADV.
The following are the most common words tagged as temporal adverbs; the
list is not exhaustive.
+afre, +aft(er), +ane(s), +ar (+aror, etc.), +t+arrihte, +ta, +tagyt, +tanon, +tonne, +triwa, a, beforan, ealneg, eft, gefyrn, geo, gyt, gyrsand+ag, heononfor+d, iu, lange, late, nu, nugyt, oft (oftor, etc.), si+d+dan, simble, sona, tod+ag(e), tuwa, n+afre
Note that when +T+ARRIHTE is spelled as two words, it is tagged with the PPCME2 numbering system: +t+ar_ADV^T21 rihte_ADV^T22.
Hi_PRO^N sceoldon_MDDI +ta_ADV^T underhnigan_RP+VB nacodum_ADJ^D swurde_N^D ,_. and_CONJ het_VBDI hine_PRO^A symble_ADV^T beon_BE +atforan_P his_PRO$ gesih+de_N ._. and_CONJ heora_PRO$ modor_N^N w+as_BEDI Martia_NPR^N gecyged_VBN ,_, h+a+dena_ADJ^N +ta_ADV^T gyt_ADV^T ,_.
Other adverbs (ADV)
All other adverbs, including sentential and manner adverbs, are tagged
ADV.
D+aghwamlice_ADV he_PRO^N gefylde_VBD his_PRO$ drihtnes_NPR^G +tenunge_N geornlice_ADV ,_. He_PRO^N lufode_VBD swa_ADV +teah_ADV +done_D^A halgan_ADJ^A w+ar_N^A ,_. Nis_NEG+BEPI na_NEG+ADV godes_NPR^G wunung_N^N on_P +dam_D^D gr+agum_ADJ^D stanum_N^D ,_, ne_NEG+CONJ on_P +arenum_ADJ^D wecgum_N^D ,_.
Negative adverbs
Negative adverbs like NA, NAHW+AR, N+AFRE are tagged NEG+ADV
following general principles for negative elements.
ac_CONJ se_D^N +almihtiga_ADJ^N God_NPR^N eow_PRO n+afre_NEG+ADV^T ne_NEG forl+at_VBPI ,_, o+d_P +t+at_C ge_PRO^N gelogode_VBN^N beon_BEPS ._. ac_CONJ he_PRO^N ne_NEG leofode_VBD na_NEG+ADV +ta_ADV^T ,_. ne_NEG+CONJ +tu_PRO^N ne_NEG +atstand_VBI nahwar_NEG+ADV^L on_P +disum_D^D earde_N^D ,_.
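The adverb classes and the NEG prefix can be read off a tag mechanically. The sketch below is illustrative only: the function name describe_adverb_tag is hypothetical, and tags carrying the PPCME2 part numbers (such as ADV^T21) are deliberately not handled.

```python
# Extensions and class names as listed in the adverb sections above.
ADVERB_CLASSES = {
    "L": "locative",
    "DX": "lexical directional",
    "D": "contextual directional",
    "T": "temporal",
    "": "other",
}


def describe_adverb_tag(tag):
    """Return (adverb class, is_negated) for an ADV tag, or None otherwise."""
    negated = tag.startswith("NEG+")
    core = tag[len("NEG+"):] if negated else tag
    pos, _, extension = core.partition("^")
    if pos != "ADV" or extension not in ADVERB_CLASSES:
        return None
    return ADVERB_CLASSES[extension], negated
```

For instance, describe_adverb_tag("NEG+ADV^L"), the tag of NAHW+AR, gives ("locative", True), and the bare tag ADV gives ("other", False).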
Adverbial quantifiers
Quantifiers used adverbially, whether indeclinable (MA, LYT) or case forms
(MICCLUM, EALLES) are always tagged as quantifiers (Q). The inflected forms are also
labelled for case.
Adverbial nouns
The origin of many adverbs is in the oblique cases of nouns (HWILUM, GEARA,
UNWILLUM, etc.), and it is not clear at what point these cease to be nouns
and become adverbs (if there is indeed any such "point"). Largely because
of the difficulty of making such a division, all unmodified, single word
nouns used adverbially are tagged as adverbs with the appropriate extension
(T,L,DX,D) indicating function. This also applies to HAM which is tagged as
a directional (or occasionally locative) adverb.
Note that this does not apply to quantifiers and demonstratives used
adverbially since there is no difficulty in identifying these categories.
Adverb vs. preposition
Many words function as both adverbs (ADV) and prepositions (P) with either
a clausal complement (subordinating
conjunctions) or an NP complement (prepositions).
Hi_PRO^N hyne_PRO^A +ta_ADV^T +atb+aron_VBDI to_P brimes_N^G faro+de_N^D ,_, sw+ase_ADJ^N gesi+tas_N^N ,_, swa_P he_PRO^N selfa_ADJ^N b+ad_VBDI ,_, +tenden_P wordum_N^D weold_VBDI wine_N^N Scyldinga_NPR^G ;_.
COBEOWUL
Swa_ADV mec_PRO^A gelome_ADV    <--- SWA adverb
la+dgeteonan_N^N +treatedon_VBDI +tearle_ADV ._.
Ic_PRO^N him_PRO^D +tenode_VBD deoran_ADJ^D sweorde_N^D ,_, swa_P hit_PRO^N    <--- SWA preposition
gedefe_ADJ^N w+as_BEDI ._.
COBEOWUL
He_PRO^N w+as_BEDI leof_ADJ^N gode_NPR^D and_CONJ lifde_VBD her_ADV^L wintra_N^G hundnigontig_NUM +ar_P he_PRO^N be_P wife_N^D    <--- +AR conjunction
her_ADV^L $+turh_P gebedscipe_N^A bearn_N^A astrynde_VBD ;_.
him_PRO^D +ta_ADV cenned_VBN wear+d_BEDI Cainan_NPR^N +arest_ADV^T eafora_N^N on_P    <--- +AR adverb
e+dle_N^D ._.
COGENESI
NEAH, GEHENDE, FEOR (adverb)
Although NEAH, GEHENDE, and FEOR act in some ways like prepositions in that
they appear to take dative complements, they are also modified by such
adverbs as SWA, SWI+DE, etc. in a non-prepositional way. We have therefore
tagged these three words as adverbs when they do not appear as part of an
NP or are not overtly inflected (in which case they are tagged as adjectives).
The default tagging for cases where the expected inflection is zero is to
take them as adverbs unless they occur within an NP. This includes the
copular case. NEAH and GEHENDE are labelled as locative adverbs (apart from
the use of NEAH to mean nearly), while FEOR may be locative or
directional.
and_CONJ +tar_ADV^L +anig_Q^N +tingc_N^N neah_ADV^D    <--- directional
ne_NEG cume_VBPS
ac_CONJ eac_ADV ealle_Q^N nytenu_N^N swy+de_ADV neah_ADV forwurdon_VBDI ._.
&_CONJ +ta_D^N Beormas_NPR^N spr+acon_VBDI neah_ADV an_NUM^A ge+teode_N^A
N+as_NEG+ADV hie_PRO^N +d+are_D^G fylle_N^G gefean_N^A h+afdon_HVDI ,_, manford+adlan_N^N ,_, +t+at_C hie_PRO^N me_PRO^A +tegon_VBDI ,_, symbel_N^A ymbs+aton_VBDI s+agrunde_N^D neah_ADV^L ;_.
COBEOWUL
$Hafast_HVPI +tu_PRO^N gefered_VBN +t+at_C +de_PRO^A feor_ADV^L ond_CONJ neah_ADV^L ealne_Q^A wideferh+t_N^A weras_N^N ehtiga+d_VBPI ,_, efne_ADV swa_ADV side_ADV swa_P s+a_N^N $bebuge+d_VBPI ,_, windgeard_N^N ,_, weallas_N^A ._.
COBEOWUL
Is_BEP +tam_D^D dome_N^D neah_ADV^L +t+at_C    <--- locative with no overt inflection
we_PRO^N gelice_ADV sceolon_MDPI leanum_N^D hleotan_VB ,_,
COCHRIST2
GELICE
GELICE is normally an adverb (or an inflected adjective), but it occurs in
the subordinating construction "GELICE &" three times in Orosius, where
GELICE is tagged as a preposition to make the
subordinating nature of the construction clear.
+AREST
+AREST is tagged ADJ when used as an ordinal number and ADV^T when used as
a temporal adverb.
Leoht_N^N w+as_BEDI +arest_ADV^T +turh_P    <--- +arest ADV
drihtnes_NPR^G word_N^A d+ag_NPR^N genemned_VBN ,_, wlitebeorhte_ADJ^N $gesceaft_N^N ._.
Wel_ADV licode_VBD frean_NPR^D +at_P frym+de_N^D for+tb+aro_ADJ^N tid_N^N ,_, d+ag_N^N +aresta_ADJ^N ;_.    <--- +arest ADJ
COGENESI
OFER, TO, FOR
OFER, TO, and FOR are labelled ADV when they mean 'too' or 'very' in
combination with adjectives or adverbs.
ofer_ADV f+at_ADJ^N
to_ADV lange_ADV^T
to_ADV god_ADJ^N
for_ADV wel_ADV
for_ADV oft_ADV^T
+TA
+TA can be ambiguous between an adverb, preposition and a determiner. In difficult cases the ambiguity is
resolved as follows.
+T+AR
+T+AR is treated as a locative or directional adverb (tagged
ADV^L and ADV^D respectively), even when it introduces a
subordinate clause (LINK TO SYNTAX). Existential THERE is not
distinguished.
Eala_INTJ hu_WADV mycel_Q^N god_N^N is_BEPI and_CONJ hwylc_WADJ^N wynsumnys_N^N +d+ar_ADV^L +d+ar_ADV^L gebro+dru_N^N beo+d_BEPI on_P annysse_N ._. +Ta_ADV^T com_VBDI sum_Q^N wudewe_N^N ,_, +te_C w+as_BEDI anes_NUM^G martyres_N^G laf_N^N ,_, on_P +t+are_D^D ylcan_ADJ^D nihte_N^D ,_, +t+ar_ADV^L he_PRO^N l+ag_VBDI forwundod_VBN ,_.
TOD+AG(E), GIESTRAND+AG etc.
TOD+AG(E), GIESTRAND+AG etc. are tagged ADV^T when written as one
word. When TO D+AG(E) is written as two words, it is tagged as a prepositional phrase.
tod+ag_ADV^T +tu_PRO^N bist_BEPI mid_P me_PRO on_P neorxnawange_N^D nu_ADV^T tod+ag_ADV^T he_PRO^N modega+d_VBPI ,_. &_CONJ giet_ADV^T tod+age_ADV^T is_BEPI ,_, for_P Romana_NPR^G bismere_N^D ._. swa_ADV swa_P Crist_NPR^N gyrstand+ag_ADV^T me_PRO cydde_VBD be_P +te_PRO
gewat_VBDI him_PRO^D ham_ADV^D +tonon_ADV^DX goldwine_N^N gumena_N^G ._. COBEOWUL
Prepositions with NP complements
Prepositions are tagged P.
ofer_P ealle_Q^A gesceaft_N^A
on_P +t+are_D^D upplican_ADJ^D +a+delan_ADJ^D ceastre_N^D
+durh_P +d+at_D^A halige_ADJ^A triow_N^A
Prepositions with R-pronouns
Prepositions cliticized to R-pronouns are labelled ADV+P.
+t+arinne_ADV+P +t+arbinnan_ADV+P +t+aron_ADV+P +t+ar+at_ADV+P
When separated they are tagged: +t+ar_ADV^L inne_P.
Prepositions with demonstratives (FOR+TI, FOR+TAN, etc.)
A preposition may be followed by a demonstrative (FOR +TI, FOR +TAN, FOR
+TAT, IN +TAT, WI+T +TAN, etc.) either absolutely, or followed by a
clause. In all cases the demonstrative is tagged D. If the
preposition and demonstrative are cliticized, the unit is tagged P
if it introduces a clause, ADV if it is used as a sentence adverb
(FOR+TI), and P+D if it is used absolutely.
for+ti_ADV ic_PRO^N cw+a+d_VBDI Godes_NPR^G word_N^A ,_, for+tan_P +te_C he_PRO^N on_P his_PRO$ godspelle_N^D cw+a+d_VBDI ,_, And_CONJ for+ti_ADV cw+a+t_VBDI se_D^N stemn_N^N clypigende_VAG to_P Petre_NPR^D butan_P he_PRO^N nyde_N^D sceolde_MDD ,_, for+dan_P +te_C he_PRO^N wiste_VBD hw+at_WPRO^N him_PRO^D gewitegod_VBN w+as_BEDI ,_, and_CONJ nes_NEG+BEDI se_D^N mann_N^N on_P +t+are_D^D scire_N^D +te_C hi_PRO^A gesawe_VBDS +ar+tan_P+D^I ._. +t+ar_ADV^L +d+ar_ADV^L se_D^N god_N^N Baal_NPR^N +ar_ADV^T w+as_BEDI gewur+dod_VBN wolice_ADV o+d+t+at_P+D^A
Prepositions and particles
When one of the list of adverbial
particles precedes a preposition immediately, it is tagged as a
particle, even in cases such as IN TO or UP ON which might be interpreted
as cases of split prepositions.
adune_RP to_P Sebastianes_NPR^G fotum_N^D
ut_RP on_P s+a_N
ut_RP to_P anum_NUM^D felda_N^D
up_RP to_P +t+are_D^D st+agre_N^D
Prepositions with clausal complements
Subordinating conjunctions are treated as prepositions taking a clausal
complement and tagged P.
+t+ar_ADV^L gecy+ded_VBN wear+d_BEDI +t+at_C halig_ADJ^N god_NPR^N helpe_N^A gefremede_VBD ,_, +da_P wear+d_BEDI gehyred_VBN heofoncyninges_N^G    <--- P+clause
stefn_N^N wr+atlic_ADJ^N under_P wolcnum_N^D ,_,    <--- P+NP
wordhleo+dres_N^G sweg_N^N m+ares_ADJ^G +teodnes_NPR^G ._.
COANDREA
Note that GIF is tagged P when introducing an adverbial clause, but C when introducing an indirect question.
God_NPR^A sceal_MDPI mon_MAN^N +arest_ADV^T hergan_VB f+agre_ADV ,_, f+ader_NPR^A userne_PRO$^A ,_, for+ton_P +te_C he_PRO^N us_PRO^D +at_P    <--- P+C+clause
frym+te_N^D geteode_VBD lif_N^A ond_CONJ l+anne_ADJ^A willan_N^A :_.
COEXETER5
Modified conjunctions (SWA SWA, +TA +TA, etc.)
Modifying adverbs such as SWA, EALL, +TA, etc., which commonly appear
before prepositions introducing adverbial clauses, can also occur written
together with the preposition, SWASWA, EALLSWA, +TA+TA. In these cases the
whole unit is labelled P.
Swa_ADV swa_P d+agred_N^N todr+af+d_RP+VBPI +ta_D^A dimlican_ADJ^A +tystra_N^A ,_, and_CONJ manna_N^G eagan_N^A onlyht_RP+VBPI +te_C blinde_ADJ^N w+aron_BEDI on_P niht_N^A ,_, swa_ADV adr+afde_VBD +tin_PRO$^N lar_N^N +ta_D^A geleafleaste_N^A fram_P me_PRO ,_. And_CONJ him_PRO^D eallswa_ADV getimode_VBD swaswa_P +dam_D^D o+drum_ADJ^D flocce_N^D ,_, +t+at_C hi_PRO^N wurdon_BEDI forb+arnde_VBN^N mid_P brastligendum_VBN^D lige_N^D heofonlices_ADJ^G fyres_N^G f+arlice_ADV ealle_Q^N ._. +Ta_ADV^T +ta_P se_D^N sunu_N^N +t+at_D^A geseah_VBDI ,_, +ta_ADV^T gesohte_VBD he_PRO^N +t+as_D^G preostes_N^G fet_N^A ,_. and_CONJ +ta_D^N witan_N^N heton_VBDI hine_PRO^A beheafdian_VB ,_, +ta+ta_P he_PRO^N ne_NEG mihte_MDD his_PRO$ mand+ada_N^A betellan_VB ._.
SWA (preposition)
SWA introduces various kinds of adverbial and comparative clauses. It is
labelled as a preposition in all cases except as the second SWA in free relatives of the SWA
HW- SWA type where it is treated as the complementizer. SWA is also
used as an adverb.
and_CONJ +tu_PRO^N bist_BEPI swa_ADV hal_ADJ^N swa_P ic_PRO^N ._. So+dlice_ADV +alc_Q^N libbende_VAG nyten_N^N ,_, swa_ADV swa_P Adam_NPR^N hit_PRO^A gecygde_VBD ,_, swa_ADV is_BEPI his_PRO$ nama_N^N ._. &_CONJ beheledon_VBDI heora_PRO$ f+aderes_N^G gecynd_N^A ,_, swa_P +d+at_C hi_PRO^N ne_NEG gesawon_VBDI his_PRO$ n+acednysse_N ._. Abram_NPR^N +da_ADV^T ferde_VBD of_P Aran_NPR ,_, swa_ADV swa_P God_NPR^N him_PRO^D bead_VBDI ,_.
+TONNE
+TONNE meaning 'when' or introducing comparative clauses is tagged P.
and_CONJ wolde_MDD beon_BE fur+dor_ADJ^N on_P o+drum_ADJ^D earde_N^D +tonne_P he_PRO^N on_P his_PRO$ agenum_ADJ^D w+are_BEDS Agnes_NPR^N him_PRO^D andwyrde_VBD ,_, Se_D^N +almihtiga_ADJ^N hera+d_VBPI swi+dor_ADV manna_N^G mod_N^A +tonne_P heora_PRO$ mycclan_Q ylde_N ,_.
WEARD
In the sequence "PREPOSITION NP WEARD", WEARD is tagged P. When
the preposition and WEARD are written together and precede the NP, the
whole unit is tagged P.
wi+d_P Rome_NPR weard_P to_P mynstre_N^D weard_P toweard_P +t+am_D^D feo_N^D mynstre_N^D weard_P
+TY/+TE L+AS (+TE)
+TY/+TE L+AS (+TE) 'unless' introducing a subordinate clause is
treated as follows.
+ty_D^I l+as_P +te_C ...
and_CONJ clypa_VBI to_P +tam_D^D godum_N^D ,_, +te_D^I l+as_P +de_C +tu_PRO^N +din_PRO$^A lif_N^A forl+ate_VBPS on_P iugo+de_N ._.
+TEAH+TE, O+T+TE etc.
One-word combinations of a subordinating conjunction and +TE
introducing a subordinate clause, such as +TEAH+TE 'although' and O+T+TE
'until', are separated manually and their two constituents are tagged in the
usual way. A comment is left to indicate the separation.
$+teah_P $+te_C {TEXT:+teah+te}_CODE
$o+t_P $+te_C {TEXT:o+t+te}_CODE
Notice that this policy does not apply to +T+ATTE (see +T+AT, +T+ATTE, +TE).
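Because the separation is recorded in a {TEXT:...}_CODE comment, the original written form can be recovered from the tagged text. The following is a rough sketch, not a tool shipped with the corpus: the function names are hypothetical, and it assumes that every run of $-marked parts belonging to one written word is immediately followed by its comment. A $-marked word with no following comment is simply passed through.

```python
import re

TEXT_COMMENT = re.compile(r"^\{TEXT:(.+)\}_CODE$")


def word_of(token):
    """Strip a leading $ and the _TAG from a token such as '$+teah_P'."""
    return token.lstrip("$").rsplit("_", 1)[0]


def restore_original(tokens):
    """Return the words of a tagged line, rejoining manually split words."""
    words, pending = [], []      # pending: $-marked parts awaiting a comment
    for token in tokens:
        comment = TEXT_COMMENT.match(token)
        if comment:
            if pending:
                words.append(comment.group(1))   # use the recorded spelling
                pending = []
        elif token.startswith("$"):
            pending.append(token)
        else:
            words.extend(word_of(p) for p in pending)  # no comment followed
            pending = []
            words.append(word_of(token))
    words.extend(word_of(p) for p in pending)
    return words
```

Applied to the tokens "$+teah_P $+de_C {TEXT:+teah+de}_CODE he_PRO^N syngode_VBD", this returns the three words +teah+de, he and syngode.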
O+T+T+AT
In addition to acting as a preposition meaning 'until', O+T+T+AT can be used absolutely, meaning
'until then', in which case it is tagged P+D^A.
and_CONJ behwurfon_VBDI hire_PRO$ lic_N^A o+t+t+at_P heo_PRO^N bebyrged_VBN w+as_BEDI
Worhton_VBDI +ta_ADV^T anne_NUM^A gangtun_N^A ,_, +t+ar_ADV^L +d+ar_ADV^L se_D^N god_N^N Baal_NPR^N +ar_ADV^T w+as_BEDI gewur+dod_VBN wolice_ADV o+d+t+at_P+D^A ._.
TO MIDDES, BE TWEONUM, BE SU+TAN, etc.
When TOMIDDES, BETWEONUM, BESU+TAN, etc. are written as single words, they
are tagged as prepositions when they take a complement and as adverbs when used absolutely. When written separately
they are tagged literally according to their constituent parts, whether used
absolutely or taking a complement.
tomiddes_P +dam_D^D streame_N^D
and_CONJ hi_PRO^A tomiddes_ADV^D besceofan_VB
to_P middes_N^G
be_P tweonum_NUM^D
him_PRO^D betweonum_P
to_P middes_N^G +tam_D^D ise_N^D
be_P su+tan_ADV +tam_D^D mu+tan_N^D
on_P middan_ADJ
to_P foran_P
SAM ... SAM
SAM ... SAM meaning 'whether ... or' is tagged P.
hy_PRO^N gedo+d_VBPI +t+at_C o+ter_ADJ^N bi+d_BEPI oferfroren_RP+VBN ,_, sam_P hit_PRO^N sy_BEPS sumor_N^N sam_P winter_N^N
BUTON
BUTON is always tagged P, even when it means 'but' and seems
to function as a coordinating
conjunction. BUTON can also function as a focus particle, in which case it is tagged FP.
Seo_D^N Asia_NPR^N ,_, on_P +alce_Q^A healfe_N^A heo_PRO^N is_BEPI befangen_VBN mid_P sealtum_ADJ^D w+atre_N^D buton_P on_P easthealfe_N ;_. +Ta_ADV^T beag_VBDI +t+at_D^N land_N^N +t+ar_ADV^L eastryhte_ADV^D ,_, o+t+te_CONJ seo_D^N s+a_N^N in_RP on_P +d+at_D^A lond_N^A ,_, he_PRO^N nysse_NEG+VBD hw+a+der_WPRO^N buton_P he_PRO^N wisse_VBD +d+at_C he_PRO^N +d+ar_ADV^L bad_VBDI westanwindes_N^G &_CONJ hwon_Q^I nor+tan_ADV^D
GELICE
GELICE is normally an adverb (or an inflected adjective), but it occurs in
the subordinating construction "GELICE &" three times in Orosius, where
GELICE is tagged P to make the subordinating nature of the
construction clear.
for_P +ton_D^I +te_C elpendes_N^G hyd_N^N wile_MDP drincan_VB w+atan_N ,_, gelice_P &_CONJ spynge_N^N de+d_VBPI ._. +t+at_C hie_PRO^A an_NUM^N cyning_N^N swa_ADV ie+delice_ADV forneah_ADV buton_P +alcon_Q^D gewinne_N^D on_P his_PRO$ geweald_N^A be+tridian_VB sceolde_MDD ,_, gelice_P &_CONJ hie_PRO^N him_PRO^D +teowiende_VAG w+aron_BEDI ,_,
+T+AT, +T+ATTE, +TE
+T+AT, +T+ATTE and +TE introducing any kind of subordinate clause are
tagged C.
Swa_P +tu_PRO^N ,_, god_NPR^N of_P gode_NPR^D gearo_ADV acenned_VBN ,_, sunu_NPR^N so+tan_ADJ^G f+ader_NPR^G ,_, swegles_N^G in_P wuldre_N^D butan_P anginne_N^D +afre_ADV^T w+are_BEDS ,_, swa_ADV +tec_PRO^A nu_ADV^T for_P +tearfum_N^D +tin_PRO$^N agen_ADJ^N geweorc_N^N bide+d_VBPI +turh_P byldo_N^A ,_, +t+at_C +tu_PRO^N +ta_D^A beorhtan_ADJ^A us_PRO^D sunnan_N^A onsende_VBPS ,_, ond_CONJ +te_C sylf_N^N cyme_VBPS +t+at_C +du_PRO^N inleohte_VBPS +ta_D^A +te_C longe_ADV^T +ar_ADV^T ,_, +trosme_N^D be+teahte_VBN^N ond_CONJ in_P +teostrum_N^D her_ADV^L ,_, s+aton_VBDI sinneahtes_N^G ._.
COCHRIST1
+T+AT introducing a relative clause, however, is taken as the relative pronoun (and thus tagged as a determiner) unless this is impossible for reasons of number/gender/case, in which case it is tagged C.
Unlike with +TEAH+TE and O+T+TE, it is not clear
that +T+ATTE is best analyzed as +T+AT plus the complementizer +TE, since it
is sometimes used as a determiner and appears often to introduce adverbial
clauses. It is therefore not split but rather tagged as
a unit. This policy may change.
SWA in free relative clauses
SWA is tagged C in free relatives of
the SWA HW- SWA type.
Swa_ADV hwa_WPRO^N swa_C agyt_VBPI +d+as_D^G mannes_N^G blod_N^A ,_, his_PRO$ blod_N^N by+d_BEPI agoten_VBN ;_. on_P swa_ADV hwylcum_WADJ^D d+age_N^D swa_C +du_PRO^N etst_VBPI of_P +dam_D^D treowe_N^D ,_, +du_PRO^N scealt_MDPI dea+de_N^D sweltan_VB ._.
GIF in indirect questions
When GIF introduces indirect questions, it is tagged C.
nu_ADV^T ic_PRO^N sceal_MDPI geseon_VB gif_C Crist_NPR^N +de_PRO geh+al+d_VBPI and_CONJ het_VBDI his_PRO$ cnapan_N +da_D^A hwile_N^A hawian_VB to_P +d+are_D s+a_N ,_, gif_C +anig_Q^N mist_N^N arise_VBDS of_P +dam_D^D mycclum_Q^D brymme_N^D ._.
The following are tagged CONJ, or NEG+CONJ if negative,
when used as conjunctions:
+ag+ter, +te, ac, ge, na+ter, ne, ond, o+t+te, swa, &
When there is more than one conjunction, (e.g., +AG+TER GE...GE), all are
tagged CONJ.
&_CONJ leoht_N^N w+aar+d_BEDI geworht_VBN ._. sceawa_VBI hw+a+der_WQ hyt_PRO^N sy_BEPS +dines_PRO$^G suna_N^G +te_CONJ ne_NEG sy_BEPS ._. hw+ar_WADV^L m+ag_MDPI ic_PRO^N wysran_ADJ^A findan_VB +tonne_P +tu_PRO^N eart_BEPI ,_, o+t+te_CONJ fur+ton_ADV +tinne_PRO$^A gelican_N^A ?_. &_CONJ ge_PRO^N beo+d_BEPI +donne_ADV^T englum_N^D gelice_ADJ^N ,_, witende_VAG +ag+der_CONJ ge_CONJ god_N^A ge_CONJ yfel_N^A ._. Quintianus_NPR^N +ta_ADV^T cw+a+d_VBDI +t+at_C heo_PRO^N gecure_VBDS o+der_ADJ^A +d+ara_D^G ,_, swa_CONJ heo_PRO^N mid_P fordemdum_VBN^D dyslice_ADV forferde_VBD ,_, swa_CONJ heo_PRO^N +tam_D^D godum_N^D geoffrode_VBD ,_, swa_ADV swa_P +a+delboren_ADJ^N and_CONJ wis_ADJ^N ._.
+AG+TER and NA+TER are also tagged as quantifiers.
SWA is also tagged as preposition, adverb or complementizer.
NE is also tagged as sentential negation (NEG).
The negative particle NE is tagged NEG. Contractions of NE and
verb forms, adverbs and quantifiers are tagged NEG+-.
When NE is used as a conjunction, it is tagged NEG+CONJ. In clauses with only one NE where NE could be a conjunction or negation, it is tagged as negation (NEG) if it immediately precedes the verb and as a conjunction (NEG+CONJ) if it does not.
Although NA sometimes seems to function as a second negative particle, it is always tagged as a negative adverb (NEG+ADV).
For+d+am_ADV hiora_PRO^G n+anig_NEG+Q^N n+as_NEG+BEDI +ta_ADV^T gieta_ADV^T ,_. ne_NEG+CONJ hi_PRO^A ne_NEG gesawon_VBDI sundbuende_N^N ,_. ne_NEG+CONJ ymbutan_P hi_PRO^A awer_ADV^L ne_NEG herdon_VBDI ._.
COMETBOE
Notice that forms starting with UN- are not tagged NEG+-.
Nalles_NEG+Q^G wolcnu_N^N +da_ADV^T giet_ADV^T ofer_P rumne_ADJ^A grund_N^A regnas_N^A b+aron_VBDI ,_, wann_ADJ^N mid_P winde_N^D ,_.
COGENESI
gif_P he_PRO^N wyrsa_ADJ^N ne_NEG bi+d_BEPI ,_, ne_NEG wene_VBP ic_PRO^N his_PRO^G na_NEG+ADV beteran_ADJ^G ._.
COMETBOE
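The rule for a clause containing a single NE can be sketched as a check on the tag of the following word. This is an illustration only: the function names are hypothetical, and the set of verb tag prefixes (VB, BE, HV, MD) is an assumption drawn from the tags used in this section.

```python
# Tag prefixes treated as verbs for this check (an assumption of the sketch).
VERB_PREFIXES = ("VB", "BE", "HV", "MD")


def is_verb_tag(tag):
    """True if the tag, ignoring case and prefixes like RP+, is a verb tag."""
    base = tag.split("^")[0].split("+")[-1]
    return base.startswith(VERB_PREFIXES)


def tag_single_ne(next_tag):
    """Tag for the only NE in a clause, given the tag of the following word."""
    return "NEG" if is_verb_tag(next_tag) else "NEG+CONJ"
```

So NE directly before bi+d_BEPI is NEG, while NE before a pronoun such as hi_PRO^A is NEG+CONJ.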
Adverbial particles are tagged RP. The following is an exhaustive
list of all words tagged as particles. Note that many of these are tagged
as prepositions when they take a complement
NP or clause. Preceding another preposition, however, they are tagged as
particles (see Prepositions and particles).
adun(e), +after, aweg, (of)dune, fore, for+d, fram, geond, in, mid, ni+der, of, ofer, ongean, on, onweg, to, +turh, under, up, ut, wi+d, wi+der, ymb(e).
When a particle is cliticized to the beginning of a verb, the unit is
tagged RP+-.
Fyrst_N^N for+d_RP gewat_VBDI ._.
COBEOWUL
+tanon_ADV^D up_RP hra+de_ADV Wedera_NPR^G leode_N^N on_P wang_N^A stigon_VBDI ,_.
COBEOWUL
folc_N^N to_RP s+agon_VBDI ,_, hatan_ADJ^D heolfre_N^D ._.
COBEOWUL
da_ADV^T him_PRO^D Hro+tgar_NPR^N gewat_VBDI mid_P his_PRO$ h+ale+ta_N^G gedryht_N^A ,_, eodur_N^N Scyldinga_NPR^G ,_, ut_RP of_P healle_N^D ;_.
COBEOWUL
Heht_VBDI +da_ADV^T eorla_N^G hleo_N^N eahta_NUM^A mearas_N^A f+atedhleore_ADJ^A on_P flet_N^A teon_VB ,_, $in_RP under_P eoderas_N^A ._.
COBEOWUL
Outside of foreign language sequences, foreign
names (PAULINUS, etc.) are not tagged FW, but NPR.
Latin liturgical terms (PATER NOSTER, TE DEUM, etc.) are tagged
FW, except when they follow English inflectional patterns, in
which case they are tagged N.
ANA, the weak form of AN, is tagged as a focus particle, following Mitchell
(1985:
+tu_PRO^N ana_FP canst_MDPI
ealra_Q^G gehygdo_N^A ,_, meotud_NPR^N
mancynnes_N^G ,_, mod_N^A in_P hre+dre_N^D ._.
COANDREA
BUTAN/BUTE is tagged as a focus particle in the NE...BUTAN construction and
in conjunction with numbers when it means only.
W+as_BEDI +ta_ADV^T lencten_N^N agan_VBN butan_FP VI_NUM nihtum_N^D
+ar_P sumeres_N^G cyme_N^D on_P Maias_NPR^G $kalend_N^A ._.
COCYNEW
Interjections (INTJ)
In general the INTJ tag is used only if a word has no other use than as an
interjection. When words with other functions as well are used as
interjections, they are still tagged with their primary POS tag, and not
with INTJ. Thus HW+AT is tagged WPRO (without case) when used as an
interjection. Adverbs which are also used as interjections (EFNE, +TONNE, HURU,
etc.) are always tagged as adverbs since it is too difficult to
consistently distinguish adverbial from interjection use. Finally GE,
although it has other functions as a pronoun and conjunction, is tagged
INTJ in interjection function.
The following words are tagged INTJ.
alleluia, amen, ge/gea/gyse, eala, la, nese, wa/wala/walawa,
wella
and_CONJ cw+a+d_VBDI to_P +tam_D^D cnihtum_N^D mid_P cenum_ADJ^D
geleafan_N^D ,_, Eala_INTJ ge_PRO^N Godes_NPR^G cempan_N^N ,_, ge_PRO^N
becomon_VBDI to_P sige_N^D ,_.
Foreign words (FW)
Everything (words, symbols, numbers, etc.) except punctuation in foreign
language sequences is labelled FW.
he_PRO^N cunne_MDPS pater_FW noster_FW
Mid_P +tam_D^D paternostre_N^D
Unknown words (XX)
Unknown or problematic words can be tagged XX. This tag is rarely used.
Splitting and joining parts of words
Words that are always treated as separate parts
There are two ways in which it may be indicated that we consider a single
written sequence as two words.
$cymst_VBPI $tu_PRO^N {TEXT:cymstu}_CODE
$flitst_VBPI $+du_PRO^N {TEXT:flits+du}_CODE
God_NPR^N nolde_NEG+MDD ofslean_RP+VB +tone_D^A
scyldigan_ADJ^A Dauid_NPR^A ,_, $+teah_P $+de_C {TEXT:+teah+de}_CODE
he_PRO^N syngode_VBD
Sy_BEPS wuldor_N^N and_CONJ lof_N^N +dam_D^D welwillendan_ADJ^D Gode_NPR^D
,_, $se_D^N $+de_C {TEXT:se+de}_CODE wur+da+d_VBPI his_PRO$ halgan_N mid_P
wuldre_N^D on_P ecnysse_N ._.
and_CONJ +ta_D^N h+a+denan_ADJ^N gelyfdon_VBDI on_P +ta_D^A leasan_ADJ^A
godas_N^A ,_, $+ta_D^N $+de_C {TEXT:+ta+de}_CODE n+aron_NEG+BEDI
godas_N^N ac_CONJ gramlice_ADJ^N deofle_N^N ._.
and_CONJ ferde_VBD $him_PRO^D $sylf_ADJ^N {TEXT:himsylf}_CODE aweg_RP
sorhful_ADJ^N on_P mode_N^D
noldon_NEG+MDDI +t+at_D^A <--- NEG+MD
geryne_N^A rihte_ADV cy+dan_VB ,_,
ne_NEG+CONJ hire_PRO^D andsware_N^A +anige_Q^A <--- NEG+CONJ
secgan_VB ,_, torngeni+dlan_N^N ,_, +t+as_D^G
hio_PRO^N him_PRO^D to_P sohte_VBD ,_.
COCYNEW2
Note that quantifiers starting with
NAT-, which derive historically from NAT ... I know not ..., are not
considered negative.
Nolde_NEG+MDD ic_PRO^N sweord_N^A beran_VB ,_,
w+apen_N^A to_P wyrme_N^D ,_, gif_P ic_PRO^N
wiste_VBD hu_WADV wi+d_P +dam_D^D agl+acean_N^D
$elles_ADV meahte_MDD gylpe_N^D wi+dgripan_RP+VB ,_, <-- RP+VB
swa_P ic_PRO^N gio_ADV^T $wi+d_P
Grendle_NPR^D dyde_VBD ._.
COBEOWUL
and_CONJ nes_NEG+BEDI se_D^N mann_N^N on_P +t+are_D^D scire_N^D +te_C
hi_PRO^A gesawe_VBDS +ar+tan_P+D^I ._.
+t+ar_ADV^L +d+ar_ADV^L se_D^N god_N^N Baal_NPR^N +ar_ADV^T
w+as_BEDI gewur+dod_VBN wolice_ADV o+d+t+at_P+D^A
se_D^N +te_C +turhseah_RP+VBDI swa_ADV +tone_D^A
preost_N^A for+don_P+D^I gesealdne_VBN^A deofle_N^D
Words that are sometimes treated as separate parts
Unlike in the PPCME2, in the York Corpus
the possibly complex morphological structure of forms other than those
discussed in the previous section is not marked. Phrases such as ON
SUNDRUM, TO MIDDES, FOR +TAM +TE are tagged literally when written apart
and taken as a whole when written as one word.
to_P middes_N^G +tam_D^D ise_N^D
tomiddes_P +tam_D^D mu+tan_N^D
However, when an orthographically independent word has no meaning outside a
particular phrase (NATES in NATES HWON), or when the literal tagging of the
parts is misleading (tagging +T+AR in +T+AR RIHTE as locative when the
"word" +T+ARRIHTE is temporal), the PPCME2 numbering system is used to
indicate that the parts belong together; the first number indicates the
number of parts and the second number which part the tagged word is.
nates_NEG+ADV21 hwon_NEG+ADV22    'not at all' (also spelt NATESHWON)
+t+ar_ADV^T21 rihte_ADV^T22    'straightaway' (also spelt +T+ARRIHTE)
But note that the original phrase from which NATESHWON derives, NA TO +T+AS
HWON, is tagged according to its constituent parts.
na_NEG+ADV to_P +t+as_D^G HWON_Q^I
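The two-digit suffix makes it possible to put numbered parts back together. The sketch below is a hypothetical helper, not part of the corpus tools; it assumes that the parts of one word are adjacent, appear in order, and number at most nine, and that no ordinary tag ends in two digits.

```python
import re

# word _ tag, then <number of parts><index of this part>, e.g. ADV^T21
PART = re.compile(r"^(?P<word>.+)_(?P<tag>.+?)(?P<total>[2-9])(?P<index>[1-9])$")


def join_numbered(tokens):
    """Merge tokens tagged TAGn1 ... TAGnn into single (word, tag) pairs."""
    merged, parts = [], []
    for token in tokens:
        match = PART.match(token)
        if match:
            parts.append(match.group("word"))
            if match.group("index") == match.group("total"):  # last part
                merged.append(("".join(parts), match.group("tag")))
                parts = []
        else:
            word, tag = token.rsplit("_", 1)
            merged.append((word, tag))
    return merged
```

For the tokens +t+ar_ADV^T21 and rihte_ADV^T22 this gives the single pair ("+t+arrihte", "ADV^T"), which matches the one-word spelling mentioned above.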
Compounds
In general noun-noun compounds are written as single words in edited Old
English texts. Sometimes however this convention is not followed and the
two parts are orthographically separated. In "true" compounds, the first
part of the compound is not inflected (WINTER SETL, SU+T RIMAN, NOR+T
S+A), and it therefore is not labelled for case; the case of the whole
compound is indicated on the second element.
winter_N setl_N^A
+tam_D^D su+t_N riman_N^D
+tan_D^I arcebiscop_N rice_N^I
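Since only the second element of a "true" compound carries a case label, the case of the whole compound is read from its last part. A minimal sketch follows; the helper names split_token and compound_case are hypothetical, and the parts of the compound are assumed to have been identified already.

```python
def split_token(token):
    """Split 'word_TAG^CASE' into (word, tag, case); case is None if absent."""
    word, tag = token.rsplit("_", 1)
    pos, _, case = tag.partition("^")
    return word, pos, case or None


def compound_case(parts):
    """Case of an orthographically split compound: that of its last element."""
    return split_token(parts[-1])[2]
```

For example, compound_case(["winter_N", "setl_N^A"]) gives A, while split_token("su+t_N") shows that the first element has no case label.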
We also treat as compounds, however, the names of
places, whether or not the first part is inflected (e.g., non-inflected
ELIG MYNSTER; inflected EGYPTA LOND, ROME BURH). The first part of such
compounds is tagged NPR without case (even if the case is fairly
obvious), while the second part is tagged N plus case, according
to the usual rules.
Elig_NPR mynstre_N^D
Egypta_NPR lond_N^A
ROME_NPR byrig_N^D
Also treated as compounds are the names of
peoples like EAST ENGLE, etc. In this case both parts are tagged as
proper nouns, but again only the second is labelled for case. Note that,
unlike with the names of places, phrases such as ONGLE CYNNE are not
treated as compounds when written separately.
East_NPR Engle_NPR^N
Mercna_NPR^G cynne_N^D
Scotta_NPR^G cynnes_N^G