ProSynth

Project Description

An integrated prosodic approach to device-independent, natural-sounding speech synthesis

A research project funded under the EPSRC Speech and Language programme

Administrative Details

Grant Period: October 1997 - March 2000
Grant Award: £268,000
Grant Numbers: GR/L53069 (Cambridge)
               GR/L51829 (York)
               GR/L52109 (UCL)
Investigators: Sarah Hawkins (University of Cambridge)
               Jill House (University College London)
               Mark Huckvale (University College London)
               John Local (University of York)
               Richard Ogden (University of York)

Overview

Current text-to-speech systems, both concatenative and formant-based, achieve good intelligibility, but their speech often sounds unnatural because rhythm, intonation and the fine phonetic detail that reflects coarticulatory patterns are inadequately modelled. As a consequence, listening to such speech demands greater cognitive effort, which can cause problems when synthetic speech is used in noisy conditions or over poor communication channels.

This collaborative project between Linguistics departments in Cambridge, London and York aims to construct a model of computational phonology that integrates and extends modern metrical approaches to phonetic interpretation and to apply this model to the generation of high-quality speech synthesis. The three focal areas of research are intonation, morphological structure and systematic segmental variation. Integrating these is a temporal model that provides a linguistic structure or 'data object' upon which phonetic interpretation is executed and which delivers control information for synthesis.

Initially, the project aims to cover a limited range of phenomena in one accent of British English, but the complete model should be applicable independently of language and accent. For signal generation, the project will start with time-domain modification of natural speech signals, supplemented by formant-based synthesis models, although compatibility with concatenative methods will be maintained. Progress will be evaluated using perceptual tests of naturalness, intelligibility and communicative success.

Objectives

  • demonstration of selected parts of a text-to-speech system constructed on linguistically-motivated, declarative computational principles
  • development of a system-independent description of the linguistic structures involved
  • perceptual evaluations using criteria of naturalness and robustness

Cambridge

Cambridge's contribution to ProSynth is to model acoustic-phonetic fine detail and its control in the overall structure of the synthesizer, and to assess the intelligibility and naturalness of the synthesis.

University College London

UCL's contribution to ProSynth is to provide a spoken corpus of recordings, the software infrastructure and the modelling of intonation.

York

York's contribution to the project will be in the field of timing. York has developed a model of timing which uses syllable structure as one of its determining factors, and generates natural-sounding rhythms for British English.

Postal addresses

  • Department of Linguistics,
    University of Cambridge,
    Sidgwick Avenue,
    CAMBRIDGE. CB3 9DA

  • Department of Phonetics and Linguistics,
    University College London,
    Gower Street,
    LONDON. WC1E 6BT

  • Department of Language and Linguistic Science,
    University of York,
    Heslington,
    YORK. YO10 5DD

Telephone

To phone us from outside the UK, drop the initial zero of the area code and dial +44 in its place; for example, (01223) 335052 becomes +44 1223 335052.
  • Sarah Hawkins: (01223) 335052
  • Jill House: (020) 7679 3167
  • Mark Huckvale: (020) 7679 5002
  • John Local: (01904) 432658
  • Richard Ogden: (01904) 432672
  • Paul Carter: (01904) 432660
  • Jana Dankovicova: (020) 7679 4173
  • Sebastian Heid: (01223) 335050
  • Rachael-Anne Knight: (020) 7679 7414


Last changed: 13 Mar 2000