"Labels" are the all-upper-case tags inserted by the linguists who prepared the corpus (e.g., "IP", "CONJ", "N"). "Words" are the mostly lower-case original words of the text (e.g., "so", "hit"). Every node in the sentence tree has a label, and the leaf nodes also have words. CorpusSearch can search on labels or on words. In practice, most searches look for labels only.
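The label/word distinction can be illustrated with a short Python sketch. This is not part of CorpusSearch; the function names `parse` and `labels_and_words` are hypothetical, and the input is a simplified bracketed tree without the node numbers that appear in CorpusSearch output. It assumes every leaf has the form "(LABEL word)".

```python
import re

def parse(text):
    """Parse a bracketed tree into nested (label, children_or_word) tuples."""
    # Tokens are "(", ")", or any run of non-space, non-parenthesis characters.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        if tokens[pos] != "(":
            # Leaf node: a label followed by a single word.
            word = tokens[pos]
            pos += 1
            assert tokens[pos] == ")"
            pos += 1
            return (label, word)
        # Internal node: a label followed by one or more child nodes.
        children = []
        while tokens[pos] == "(":
            children.append(node())
        pos += 1  # skip the closing ")"
        return (label, children)

    return node()

def labels_and_words(tree):
    """Collect every label (all nodes) and every word (leaf nodes only)."""
    label, rest = tree
    if isinstance(rest, str):
        return [label], [rest]
    labels, words = [label], []
    for child in rest:
        child_labels, child_words = labels_and_words(child)
        labels += child_labels
        words += child_words
    return labels, words

tree = parse("(NP-SBJ (PRO$ youre) (N herte))")
labels, words = labels_and_words(tree)
print(labels)
print(words)
```

In this sketch the internal node contributes only a label, while each leaf contributes both a label and a word, which mirrors the description above.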
CorpusSearch uses case-sensitive, character-by-character string matching to match search-function arguments to strings found in the input. Spelling variants and upper-case/lower-case variants must therefore be listed explicitly, usually in an argument list whose alternatives are separated by "|". For instance, this query searches for a complementizer whose associated text is "that" or "That":
(C iDominates that|That)
and finds sentences such as this:
/~*
and he shalle do yow remedy, that youre herte shal be pleasyd. '
(CMMALORY,3.47)
*~/
/*
12 CP-ADV: 13 C that
*/
( (12 CP-ADV (13 C that)
             (14 IP-SUB (15 NP-SBJ (16 PRO$ youre) (17 N herte))
                        (18 MD shal)
                        (19 BE be)
                        (20 VAN pleasyd)))
  (ID CMMALORY,3.47))
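The effect of case-sensitive matching against an argument list can be sketched in Python. This is a rough illustration, not CorpusSearch's implementation: the helper `matches` is hypothetical, it handles only "|"-separated alternatives compared as exact strings, and the list of (label, word) leaves is made up for the example.

```python
def matches(arg_list, text):
    """Return True if text equals one alternative in arg_list exactly.

    Alternatives are separated by "|". The comparison is case-sensitive
    and character-by-character, so "That" and "that" are distinct.
    """
    return text in arg_list.split("|")

# Invented (label, word) leaves, for illustration only.
leaves = [("C", "that"), ("C", "That"), ("C", "THAT"), ("D", "that")]

# Rough analogue of the query (C iDominates that|That), applied to leaves.
hits = [
    (label, word)
    for label, word in leaves
    if matches("C", label) and matches("that|That", word)
]
print(hits)
```

Under these assumptions the all-capitals "THAT" is not matched, because that spelling is not in the argument list; to find it, the list would have to be extended to that|That|THAT.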