- Pronouns (PRO, PRO$)
- Reflexives (HERSELF, etc.)
- Possessive pronouns
- Pronominal cases in which PRO is not used
- Existential THERE (EX)
- Common nouns (N, N$, NS, NS$, $)
- Singular, plural, and collective nouns
- Units of measure (DAY, YEAR, POUND, etc.)
- Possessives and genitives
- The $ tag
- Compass points
- Treatment of individual words
- HALF and SIDE
- Proper nouns (NPR, NPR$, NPRS, NPRS$)
- Names of people
- Names of places
- Unique objects
- Named days, months, and periods of time
- Named events
- Names of languages
Pronouns (PRO, PRO$)
PRO Pronoun
PRO$ Pronoun, possessive
All pronouns are labelled PRO with the exception of pronominal
ONE and indefinite ME/MAN.
Reflexives (HERSELF, etc.)
Reflexive forms (HERSELF, etc.) are tagged PRO+N or PRO$+N
when cliticized. SELF is always tagged as a singular noun, whatever its form.
herself_PRO+N (by default)
Possessive pronouns
Possessive pronouns are tagged PRO$ whether or not they modify a noun.
the_D lyon_N was_BED nat_NEG myne_PRO$
and_CONJ therefore_ADV+P ye_PRO shall_MD loose_VB
youres_PRO$ !_. '_'
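The examples in this manual write each token as word_TAG. As a minimal sketch (the helper name `parse_tagged` is hypothetical, not part of any corpus tooling), such a line can be split into (word, tag) pairs by cutting each token at its last underscore, after which the PRO$ forms can be picked out:

```python
def parse_tagged(line):
    """Split a line of word_TAG tokens into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # Cut at the LAST underscore, so the tag is whatever follows it.
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


pairs = parse_tagged("the_D lyon_N was_BED nat_NEG myne_PRO$")
# Collect the words tagged as possessive pronouns.
possessives = [word for word, tag in pairs if tag == "PRO$"]
```

Cutting at the last underscore keeps punctuation tokens such as `!_.` intact, since the tag itself never contains an underscore in these examples.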
Pronominal cases in which PRO is not used
The tag PRO is not used in the following two cases:
- the pronominal use of ONE (see Section ONE)
- ME, MAN when it means ONE (see Section MAN)
Existential THERE (EX)
Existential THERE is tagged EX. When THERE is ambiguous between a
locative and an existential reading, the default is existential.
whe+ter_WQ +tere_EX were_BED mo_QR of_P his_PRO$ predecessours_NS
in_P paradys_N o+ter_CONJ in_P helle_N
and_CONJ +tere_EX were_BED i-seie_VAN wonder_ADV false_ADJ
si+gtes_NS and_CONJ fals_ADJ tokenes_NS
Common nouns (N, N$, NS, NS$, $)
N Noun, singular and collective
N$ Noun, possessive/genitive
NS Noun, plural
NS$ Noun, plural, possessive/genitive
$ Possessive clitic HIS or 'S if separated from word
Singular, plural, and collective nouns
Singular and collective nouns (HORS, FOLK, PEOPLE etc.) are tagged
N. In early texts, before the universalization of plural -S, it
can be quite difficult to distinguish reliably in all cases between
singular and plural. Therefore, for the period M1, we have tried to follow
the translation accompanying the edition used when one is available, or
else a separate translation.
Units of measure (DAY, YEAR, POUND, etc.)
Units of measure after numbers (TEN YEAR, etc.) are labelled as
singular or plural based on overt marking as follows: forms in
-s, -a, or -en are marked as plural, all others as singular.
three_NUM hondred_NUM wynter_N
ueale_Q hund_NUM wintra_NS
ix_NUM c_NUM pound_N
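The marking rule for units of measure is purely formal, so it can be sketched as a suffix check. The function name `unit_of_measure_tag` is hypothetical, and the check is only an illustration: it looks at spelling alone and would need an annotator's judgment for stems that happen to end in one of these letters.

```python
# Endings that count as overt plural marking on a unit of measure after a number.
PLURAL_ENDINGS = ("s", "a", "en")


def unit_of_measure_tag(form):
    """Return NS if the form ends in -s, -a, or -en, otherwise N."""
    return "NS" if form.lower().endswith(PLURAL_ENDINGS) else "N"


# The three forms from the examples above.
tags = {form: unit_of_measure_tag(form) for form in ("wynter", "wintra", "pound")}
```

Applied to the forms in the examples, this gives N for `wynter` and `pound` and NS for `wintra`, matching the tags shown.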
Possessives and genitives
All common nouns used as possessives are tagged N$, NS$. As with
the plural, genitive marking in early texts predates universal -S and thus
in these cases N$, NS$ indicates the function GENITIVE/POSSESSIVE
rather than any particular form. In general only nouns in relationship with
other nouns are marked as genitive/possessive. The two exceptions to
this are:
- when the head noun in a genitive/possessive NP is empty. In this case
other words, usually quantifiers but potentially other categories as
well, can be tagged with $.
- in essentially superlative adjectival and adverbial expressions like
ALRE FIRST/LAST/MOST (FIRST/LAST/MOST OF ALL), ALRE BEST (BEST OF ALL),
etc., ALRE is tagged Q$. See Section GENITIVE/POSSESSIVE.
+te_D mannes_N$ shrifte_N the man's shrift
+te_D sowle_N$ fode_N the soul's food
his_PRO$ sinne_N$ sore_N sorrow of his sin
+te_D apostles_NS$ mu+des_NS the apostles' mouths
+ter_PRO$ apostlene_NS$ lore_N the apostles' teaching
kinges_NS$ sunes_NS kings' sons
alre_Q kinge_NS$ king_N king of all kings
here_PRO$ beire_Q$ friend_N friend of them both
o+dres_OTHER$ pine_N (an)other's pain
alre_Q$ mast_QS most of all
alre_Q$ earst_ADV first of all
alra_Q$ swi+dest_ADVS quickest of all
alre_Q$ best_ADJS best of all
The $ tag
The tag $ is used for HIS in the JOHN HIS BOOK construction, as
well as for the possessive clitic 'S. The possessive clitic really
postdates the texts in this corpus, although it occasionally appears
in the edited texts.
Peter_NPR his_$ peny_N
+Te_D kyng_N his_$ wyf_N
in_P a_D man_N 's_$ saule_N
a_D man_N 's_$ thoghte_N
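As a rough sketch of how the separated $ token might be used downstream, the following picks out the possessor and the possessed noun on either side of each $-tagged token. The function name `split_possessives` is hypothetical, and the code assumes, as in the examples above, that the possessed noun immediately follows the $ token; intervening modifiers would need extra handling.

```python
def split_possessives(line):
    """Return (possessor, marker, possessed) triples around each $-tagged token."""
    # rpartition gives (word, "_", tag); [::2] keeps just (word, tag).
    tokens = [tok.rpartition("_")[::2] for tok in line.split()]
    triples = []
    for i, (word, tag) in enumerate(tokens):
        # A $ token sits between the possessor and the possessed noun.
        if tag == "$" and 0 < i < len(tokens) - 1:
            triples.append((tokens[i - 1][0], word, tokens[i + 1][0]))
    return triples


his_example = split_possessives("Peter_NPR his_$ peny_N")
clitic_example = split_possessives("in_P a_D man_N 's_$ saule_N")
```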
Compass points
Compass points are tagged N, both when used alone and in
combination with another noun. Only when the adjectival suffix -ERN
is present is the form tagged ADJ (e.g., NORTHERN, SOUTHERN).
if_P we_PRO gone_VBP toward_P +te_D north_N
Thomas_NPR Grey_NPR ,_, a_D knyte_N of_P +te_D north_N
Fro_P _CODE Cathay_NPR _CODE go_VBP
men_NS toward_P the_D est_N be_P many_Q iorneyes_NS
and_CONJ ano+tere_D+OTHER fram_P +te_D North_N
into_P +te_D South_N ,_, +tat_C was_BED callede_VAN
+de_D nor+d_N half_N
+te_D nor+t_N hille_N
cf. ADJECTIVAL USE
all_Q +te_D host_N +tat_C cam_VBD with_P +te_D
king_N were_BED robbid_VAN be_P northen_ADJ men_NS
Treatment of individual words
In early texts, the construction with KIND is:
sumes_Q kennes_N$ fisc_N
where SOME KIND is the genitive complement of FISH (i.e., FISH
OF SOME KIND). Later this is reanalyzed, with KIND as the head and
FISH the complement: SOME KIND(S) OF FISH.
some_Q kinds_NS fish_N
+Des_D fower_NUM kinnes_N$ teares_NS <--- early texts
eches_Q kinnes_N$ chapman_N+NS
that_D ylke_ADJ kynde_N compassyone_N
any_Q kynne_N +ting_N
In early texts MANNER may take a genitive.
ech_Q manyere_N lykinges_N$
+teose_D twa_NUM manere_N meonestrales_N$
Later, when bare genitives are generally no longer used in this fashion,
MANNER often continues to appear without OF or any genitive marking. At this
stage the complement noun (LIKING) is simply labelled N.
each_Q manner_N liking_N
eny_Q maner_N wyse_N
a_D maner_N fals_ADJ drede_N
HALF and SIDE
HALF and SIDE also routinely take bare NP complements. The complement
NP in these cases is tagged simply as N(S) (unless it clearly
shows genitive marking).
euery_Q side_N +tat_D Gentil_ADJ Erl_N
+tis_D half_N +ta_D muntes_NS
+tis_D half_N Rome_NPR
Proper nouns (NPR, NPR$, NPRS, NPRS$)
NPR Proper noun
NPR$ Proper noun, possessive
NPRS Proper noun, plural
NPRS$ Proper noun, plural, possessive
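The common-noun and proper-noun tag tables follow the same pattern: a base (N or NPR), an optional S for plural, and an optional $ for possessive. A small sketch of that decomposition follows; the function name `noun_tag_features` and the returned dictionary layout are assumptions made for illustration only.

```python
def noun_tag_features(tag):
    """Decompose N/NS/NPR/NPRS (plus optional $) into features; None for other tags."""
    possessive = tag.endswith("$")
    base = tag[:-1] if possessive else tag
    # Anything else (PRO$, Q$, the bare $ clitic tag, etc.) is not a noun tag.
    if base not in ("N", "NS", "NPR", "NPRS"):
        return None
    return {
        "proper": base.startswith("NPR"),
        "plural": base.endswith("S"),
        "possessive": possessive,
    }


features = noun_tag_features("NPRS$")
```

Note that the bare `$` tag and tags such as `PRO$` or `Q$` return None here, since they mark possession on something other than a noun.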
Names of people
- Noun-noun pairs, like KING ARTHUR, EARL THOMAS, are treated as
compound nouns, and so both parts are proper.
Following THE and possessives, these are always treated as appositives,
although this is almost certainly the wrong analysis in some cases.
my_PRO$ lorde_N Arthure_NPR
the_D kynge_N Royns_NPR of_P Northe_NPR Walis_NPR
the_D grete_ADJ Lady_N Lyle_NPR of_P Avilion_NPR
+te_D gentil_ADJ Erl_N Thomas_NPR
Offices on their own (THE KING, THE ARCHBISHOP, THE EARL) are not proper.
archebisshop_N of_P Caunterbury_NPR
- In adjective-noun pairs, like ALMIGHTY GOD, the adjective is
not considered part of the name, as long as the head noun is, by itself, a
proper noun (or part of a noun-noun compound).
god_NPR almihtin_ADJ <--- GOD, CHURCH etc. are NPR when alone
- French and other foreign names are treated in toto as proper nouns,
so all parts are NPR, including LE and DE or DU.
Petir_NPR de_NPR Luna_NPR
Melyot_NPR de_NPR Logyrs_NPR
Sagramour_NPR le_NPR Desyrus_NPR
- In English names, only the actual name is tagged NPR; any NP or
PP epithets are treated separately.
seint_NPR iohan_NPR baptiste_N
Iohannes_NPR +de_D godspellere_N
Daui+d_NPR +de_D profiete_N
Iosepe_NPR +de_D smi+de_N
Gy_NPR of_P Marchia_NPR
seint_NPR Patrik_NPR of_P Irlond_NPR
However, when a specific epithet which belongs to a specific person is used
alone to refer to that person, then it is tagged NPR.
the_D Baptist_NPR (when referring to John)
the_D Conqueror_NPR (when referring to William, etc.)
the_D Ironside_NPR (when referring to Edmund)
- The names of peoples and groups are proper nouns. These are marked as
singular or plural based on overt plural marking. The rules for noun-noun
and adjective-noun pairs are the same as above.
his_PRO$ kyngdom_N of_P West_NPR Saxons_NPRS
- Names and epithets of God (GOD, LORD, CHRIST, CREATOR, HEALER,
SAVIOUR, etc.) and the devil (DEVIL, SATAN, FIEND, UNWIHT, WURSE, etc.)
are always proper. This includes the names of the members of the Trinity:
FATHER, SON, and HOLY GHOST.
Lord_NPR Lord_NPR God_NPR Almyghty_ADJ
God_NPR Lord_NPR Iesu_NPR
Crist_NPR oure_PRO$ Lord_NPR Jhesu_NPR Crist_NPR
ure_PRO$ helende_NPR Oure_PRO$ Lorde_NPR Godd_NPR
Oure_PRO$ Lorde_NPR hali_NPR gast_NPR
Names of places
- As with names of people, noun-noun pairs are always considered compounds
and both parts are labelled NPR.
As with THE KYNGE ROYNS, noun-noun place names preceded by a determiner are
treated as appositives.
the_D castell_N Nygurmous_NPR
+te_D flum_N Iordan_NPR
+te_D brynke_N of_P +te_D water_N Ponte_NPR
+te_D Castell_N Aungel_NPR
When the "name" part is not a noun, it is tagged NPR anyway.
the_D Castell_N Terrable_NPR
the_D Sege_N Perelous_NPR
- In adjective-noun pairs, if the head noun is a proper name in its own
right, the adjective is not part of the name.
If the head noun is not a proper name on its own, the adjective is part of
the name.
the_D rede_NPR see_NPR
THE X OF PLACE/PERSON is tagged as follows:
the_D Castell_N of_P Four_NPR Stonys_NPR
the_D cite`_N of_P Camelot_NPR
+te_D citee_N of_P Acres_NPR
+te_D covent_N of_P Coventre_NPR
+te_D cherch_N of_P Chestir_NPR
+te_D Abbay_N of_P Kyng_NPR Edward_NPR
CHRISTENDOM is tagged NPR when it is being used locatively, but
N when it means CHRISTIANITY or CHRISTIAN FAITH.
+Dre_NUM +ting_NS ben_BEP +tat_C elch_Q man_N habben_HV mot_MD ._,
+te_C wile_MD his_PRO$ cristendom_N leden_VB ._.
+De_D rihte_ADJ bileue_N setten_VBP +te_D twolue_NUM apostles_NS on_P
write_N ;_, ar_P hie_PRO ferden_VBD in_RP to_P al_Q middeneard_NPR
to_TO bodien_VB cristendome_N ._.
Hie_PRO is_BEP anginn_N of_P alle_Q cristendome_NPR ,_.
Unique objects
For names given to special things, and nouns denoting things of which
there is only one, the rules for noun-noun and adjective-noun pairs are the
same as above. If the head noun is not a proper name on its own, then the
adjective is also labelled NPR.
+te_D chirche_NPR <--- when referring to the entity, not buildings
+De_D hali_NPR gast_NPR
the_D elde_NPR testament_NPR
the_D Rounde_NPR Table_NPR
Note that for HOLY WRIT, HOLY BOOK, and THE OLD/NEW TESTAMENT, the
head nouns are not proper on their own, therefore the adjectives are
tagged NPR. This differs from HOLY SCRIPTURE and HOLY BIBLE
because SCRIPTURE and BIBLE are proper on their own.
Book titles in general are not treated as proper names, as this would hide
their internal syntax. Only a small number of books (as above) which can be
seen as having names rather than titles are treated this way. In addition,
certain common Latin canticles and prayers as well as the creed (CREDO) are
tagged NPR, not FW.
Te_NPR Deum_NPR Laudamus_NPR
Named days, months, and periods of time
- Days of the week
- Holidays and holy days
+te_D Ascencioun_NPR day_NPR
Seint_NPR Edward_NPR$ day_NPR
But note that in adjective-noun pairs in which the noun is in itself
proper, the adjective is tagged ADJ.
Named events
the_D Resurreccion_NPR and_CONJ the_D Passion_NPR
In phrases like THE FEAST OF X in which X is a named event, only the event
is tagged NPR.
the_D feste_N of_P Pentecoste_NPR
+te_D fest_N of_P Ascencion_NPR
In phrases like THE FEAST OF X in which X is not a named event, no part
is a proper noun.
+te_D feste_N of_P +te_D camel_N
+te_D day_N of_P doom_N
NOTE: this clearly gives the wrong reading in some cases in which both
nouns are common on their own, but the denotation of the phrase is clearly
proper (e.g. THE WAR OF THE ROSES). This problem currently remains
unresolved.
Names of languages
The names of languages are proper nouns:
the_D langage_N of_P English_NPR
When used adjectivally, however, they are tagged ADJ.