Project Description
An integrated prosodic approach to device-independent, natural-sounding speech synthesis
A research project funded under the EPSRC Speech and Language programme
Administrative Details
Overview
Current text-to-speech systems, both concatenative and formant-based, achieve good intelligibility, but their speech often sounds unnatural because rhythm, intonation and the fine phonetic detail that reflects coarticulatory patterns are inadequately modelled. As a consequence, listening to such speech requires greater cognitive effort, which can cause problems when synthetic speech is used in noisy conditions or over poor communication channels.
This collaborative project between Linguistics departments in Cambridge, London and York aims to construct a model of computational phonology that integrates and extends modern metrical approaches to phonetic interpretation, and to apply this model to the generation of high-quality synthetic speech. The three focal areas of research are intonation, morphological structure and systematic segmental variation. Integrating these is a temporal model that provides a linguistic structure, or 'data object', upon which phonetic interpretation is executed and which delivers control information for synthesis.
Initially, the project aims to cover a limited range of phenomena in one British English accent, but the complete model should be independent of language and accent. For signal generation, the project will start with time-domain modification of natural speech signals, supplemented by formant-based synthesis models, although compatibility with concatenative methods will be maintained. Progress will be evaluated using perceptual tests for naturalness, intelligibility and communicative success.
Objectives
Cambridge
Cambridge's contribution to ProSynth is to model acoustic-phonetic fine detail and its control in the overall structure of the synthesizer, and to assess the intelligibility and naturalness of the synthesis.
University College London
UCL's contribution to ProSynth is to provide a spoken corpus of recordings, the software infrastructure and the modelling of intonation.
York
York's contribution to the project will be in the field of timing. York has developed a model of timing which uses syllable structure as one of its determining factors, and generates natural-sounding rhythms for British English.
E-Mail Addresses
Postal addresses
Telephone
To phone us from outside the UK, remove the initial zero and replace it with +44.
Last changed: 13 Mar 2000