Vocal Tract Modelling
        
        This page gives an overview of the projects run at York on physical
        modelling of the vocal tract, building on the original PhD work of
        Jack Mullen, with supporting web pages, examples and downloadable
        content.

The 2-D Digital Waveguide Mesh Vocal Tract
        
        A real-time dynamic simulation of the vocal tract implemented using a
        2-D digital waveguide mesh, offering a comparison with, and an
        improvement over, the more traditional 1-D Kelly-Lochbaum model.

Key Publications:
Mullen, J., Howard, D.M., and Murphy, D.T., "Real-Time Dynamic Articulations in the 2D Waveguide Mesh Vocal Tract Model", IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 2, pp. 577-585, 2007, [DOI].
Mullen, J., Howard, D.M., and Murphy, D.T., "Waveguide Physical Modeling of Vocal Tract Acoustics: Flexible Formant Bandwidth Control From Increased Model Dimensionality", IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 3, pp. 964-971, 2006, [DOI].
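
As a rough illustration of the 2-D digital waveguide mesh idea described above, the sketch below advances a small rectilinear mesh using the widely used junction-pressure update, in which each junction's next value is half the sum of its four neighbours minus its own previous value. It is a minimal sketch under stated assumptions, not the implementation from the publications listed above: the mesh size, the source and receiver positions, the impulse excitation and the zero-pressure boundaries are all hypothetical choices, and the area-function mapping, glottal source and lip radiation that a vocal tract model needs are omitted.

```python
def step_mesh(p_now, p_prev):
    """Advance a 2-D rectilinear waveguide mesh by one time step.

    Junction-pressure form of the mesh update:
        p[n+1](i, j) = 0.5 * (sum of the 4 neighbours at time n) - p[n-1](i, j)
    Boundary junctions are held at zero pressure (a simplifying assumption).
    """
    rows, cols = len(p_now), len(p_now[0])
    p_next = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbours = (p_now[i - 1][j] + p_now[i + 1][j]
                          + p_now[i][j - 1] + p_now[i][j + 1])
            p_next[i][j] = 0.5 * neighbours - p_prev[i][j]
    return p_next


def simulate(rows=9, cols=21, steps=50, src=(4, 2), rcv=(4, 18)):
    """Excite the mesh with a unit impulse at src and record pressure at rcv.

    All default values are illustrative only.
    """
    p_prev = [[0.0] * cols for _ in range(rows)]
    p_now = [[0.0] * cols for _ in range(rows)]
    p_now[src[0]][src[1]] = 1.0  # unit pressure impulse at the source junction
    output = []
    for _ in range(steps):
        output.append(p_now[rcv[0]][rcv[1]])
        p_now, p_prev = step_mesh(p_now, p_prev), p_now
    return output
```

Calling `simulate()` returns the pressure at the receiver junction for each time step; nothing arrives until the wavefront has crossed the junctions between source and receiver. A long, narrow mesh is used only to suggest a tube-like geometry.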
The Dynamic Digital Waveguide Mesh
          
          An implementation of the digital waveguide mesh that enables dynamic
          variation, based on Mullen et al. (2007), as listed above, and now used
          for other applications, including a first attempt at articulatory
          vocal tract synthesis.

Key Publications:
Murphy, D.T., Shelley, S., and Ternström, S., "The Dynamically Varying Digital Waveguide Mesh", Proc. of the 19th Int. Congress on Acoustics, Madrid, Spain, September 2-7, 2007 [Invited Paper].
Murphy, D.T., Kelloniemi, A., Mullen, J., and Shelley, S., "Acoustic Modeling using the Digital Waveguide Mesh", IEEE Signal Processing Magazine, vol. 24, no. 2, pp. 55-66, March 2007 [Invited Paper], [DOI].
3-D Vocal Tract Models based on MRI
          
          The 2-D digital waveguide mesh vocal tract was developed into a
          comparable 3-D model as part of Matt Speed's PhD work. Initially this
          implementation was tested using 3-D acrylic tube models, and then
          using 3-D geometries obtained from vocal tract MRI measurements of
          professional singers. The results were verified using acoustic
          measurements and recordings obtained under comparable conditions.

Key Publications:
Speed, M., Murphy, D.T., Howard, D.M., "Modeling the Vocal Tract Transfer Function using a 3D Digital Waveguide Mesh", IEEE Transactions on Audio, Speech, and Language Processing, vol. 22, no. 2, pp. 453-464, Feb. 2014, [DOI].
Speed, M., Murphy, D.T., Howard, D.M., "Three-Dimensional Digital Waveguide Mesh Simulation of Cylindrical Vocal Tract Analogs", IEEE Transactions on Audio, Speech and Language Processing, vol. 21, no. 2, pp. 449-455, Feb. 2013, [DOI].
Articulatory Vocal Tract Synthesis in SuperCollider
          
          2-D vocal tract articulation was first attempted in Murphy, Shelley
          and Ternström (2007), as listed above, where we used the APEX system
          to generate cross-sectional area function information for input into
          our 2-D dynamic digital waveguide mesh model of the vocal tract. APEX
          has since been updated for implementation in SuperCollider and, in the
          paper listed below, is used to control a traditional 1-D Kelly-Lochbaum
          tube model. The goal is to implement our 2-D model in SuperCollider, as
          this framework proves to be a useful control and synthesis paradigm.

Key Publications:
Murphy, D. T., Mátyás, J., and Ternström, S., "Articulatory vocal tract synthesis in Supercollider", Proc. of the 18th Int. Conference on Digital Audio Effects (DAFx-15), pp. 307-313, Trondheim, Norway, Nov. 30-Dec. 3, 2015.
Supporting SuperCollider scripts and source code available here.
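
To make the link between a cross-sectional area function and a 1-D Kelly-Lochbaum tube model more concrete, the sketch below computes the reflection coefficient at each junction between adjacent tube sections and applies the corresponding pressure-wave scattering at a single junction. It is a minimal sketch assuming one common sign convention for pressure waves; the function names and the example area values are hypothetical and are not taken from APEX or from the SuperCollider code referenced above.

```python
def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent tube sections.

    For pressure waves travelling from section k into section k+1:
        r_k = (A_k - A_{k+1}) / (A_k + A_{k+1})
    """
    return [(a - b) / (a + b) for a, b in zip(areas, areas[1:])]


def scatter(r, f_in, b_in):
    """Scatter pressure waves at one junction with reflection coefficient r.

    f_in: forward-travelling wave arriving from the left-hand section
    b_in: backward-travelling wave arriving from the right-hand section
    Returns (f_out, b_out): the waves leaving towards the right and the left.
    """
    f_out = (1.0 + r) * f_in - r * b_in
    b_out = r * f_in + (1.0 - r) * b_in
    return f_out, b_out


# Hypothetical area function (arbitrary units), listed from glottis to lips.
example_areas = [2.0, 2.0, 1.0, 0.5, 1.5, 3.0]
coeffs = reflection_coefficients(example_areas)
```

A constriction (area decreasing along the tube) gives a positive coefficient and an expansion gives a negative one; equal neighbouring areas give zero, so waves pass through unchanged. In a full model each section would also include a delay, with separate terminations at the glottis and lips.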
3-D Dynamic Vocal Tract Models based on MRI
          
          This project builds on the work of Speed et al., which explored static
          3-D vocal tract models derived from MRI data, and Mullen et al., which
          developed a dynamic 2-D vocal tract model. It explores the next
          logical step: a dynamic 3-D vocal tract model. The approach to working
          with the MRI data and developing the vocal tract models is outlined in
          the main paper below (with accompanying data), where it is compared
          against these previous methods and benchmarked, in part, against a
          detailed FEM model. Articulatory control methods for such models have
          also been investigated.

Key Publications:
Gully, A. J., Daffern, H., and Murphy, D. T., "Diphthong Synthesis Using the Dynamic 3D Digital Waveguide Mesh", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 2, pp. 243-255, Feb. 2018, [DOI].
Supporting materials for this article: Dataset and MATLAB Scripts [DOI].
Gully, A. J., Yoshimura, T., Murphy, D. T., Hashimoto, K., Nankaku, Y., and Tokuda, K., "Articulatory Text-to-Speech Synthesis Using the Digital Waveguide Mesh Driven by a Deep Neural Network", Proc. of Interspeech 2017, pp. 234-238, Stockholm, Sweden, Aug. 20-24, 2017, [DOI].