The examples use a diphone synthesiser (MBROLA), which allows us to manipulate duration and f0 in a straightforward fashion.
Sound files are in AIFF format.
The first file is a re-synthesised version of the utterance "Paul knowingly found a sentence", produced by a speaker of British English, though not of the same variety as our database speaker. The re-synthesis uses MBROLA diphones with natural durations and a stylised f0. The subsequent examples keep this approximately natural f0 but impose predicted, rather than actual, durations.
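As a rough illustration of why this manipulation is straightforward, the sketch below swaps durations in an MBROLA-style phone list while leaving the f0 targets untouched. In MBROLA's .pho input each line gives a phone label, a duration in milliseconds, and optional pairs of pitch targets (position as a percentage of the phone's duration, f0 in Hz), so changing a duration rescales the contour in time without altering the f0 values. The function names, phone labels, durations, and f0 values here are hypothetical and serve only as an example; they are not the data or the code used for the sound files on this page.

```python
from typing import List, Sequence, Tuple

# (position in % of the phone's duration, f0 in Hz)
PitchTargets = Sequence[Tuple[float, float]]
# (phone label, duration in ms, pitch targets)
Phone = Tuple[str, float, PitchTargets]


def with_durations(phones: Sequence[Phone],
                   new_durations: Sequence[float]) -> List[Phone]:
    """Return a copy of `phones` with durations replaced, f0 targets kept."""
    if len(phones) != len(new_durations):
        raise ValueError("need exactly one duration per phone")
    return [(label, dur, targets)
            for (label, _old, targets), dur in zip(phones, new_durations)]


def to_pho(phones: Sequence[Phone]) -> str:
    """Format phones as .pho text: label, duration, then pitch-target pairs."""
    lines = []
    for label, dur, targets in phones:
        fields = [label, str(round(dur))]
        for pos, f0 in targets:
            fields += [str(round(pos)), str(round(f0))]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


# Hypothetical values for the word "Paul" ("_" marks silence).
natural: List[Phone] = [
    ("_", 100, []),
    ("p", 90, []),
    ("O:", 180, [(0, 130), (100, 110)]),
    ("l", 80, [(100, 100)]),
    ("_", 100, []),
]
# Hypothetical model output: one predicted duration (ms) per phone.
predicted = [100, 75, 150, 95, 100]

pho_text = to_pho(with_durations(natural, predicted))
print(pho_text)
```

The resulting text can be saved to a file and given to MBROLA together with a diphone database. Because the pitch targets are positioned relative to each phone, the same stylised f0 can be reused unchanged across every duration model.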
If we model the attributes at the syllable node in our structure (STRENGTH, WEIGHT), this is the result:
Adding in rhyme node attributes (STRENGTH, WEIGHT, CHECKED, VOI), since the rhyme is the head of the syllable, we get:
Adding in nucleus node attributes (STRENGTH, WEIGHT, CHECKED, VOI, LONG), since the nucleus is the head of the rhyme, we get:
Adding in coda node attributes (VOI) to complete the head of the syllable, we get:
Adding in onset node attributes (STRENGTH) to complete the syllable, we get:
Finally, the output of the model with full segmental information: