Audio Signal Processing Research - Results and Demos #2

NOTE: Some browsers and operating systems still have compatibility issues when playing media files. Sample sounds here are embedded using a dedicated player, but direct links to the mp3 files are also provided in case the player bar does not appear or fails to run.

Results page #2 - Source Separation Using Filters

Later work with Mark Every extended the separation work to more than two instruments/voices and removed the assumption of a strictly harmonic model. Further, it used a filtering approach rather than sinusoidal resynthesis. This is a particularly noteworthy change, since filtering allows the process to be entirely non-destructive. It is possible to separate a mixture of three instruments, for example, into four output channels - one for each of the original sources, and a fourth 'residual' channel which will contain all of the energy which is not consistent with any of the identified instruments. In practice, if harmonic or near-harmonic models are used, the residual would perhaps be expected to contain features such as the attack of a note, or breath noise, or bow noise, etc., as well as any artefacts associated with limitations of the identification and separation processes.
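The non-destructive property described above can be illustrated with a toy sketch. This is not the published algorithm: the naive DFT, the binary harmonic masks, the equal sharing of overlapping bins, and all function names are assumptions chosen for brevity. The point it shows is that when the per-source filters and the residual filter sum to one at every frequency, the separated channels plus the residual add back up to the original mixture.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (adequate for short illustrative signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def harmonic_mask(n, f0_bin, n_harmonics):
    """1.0 at bins that are integer multiples of f0_bin, plus their mirror bins."""
    mask = [0.0] * n
    for h in range(1, n_harmonics + 1):
        k = h * f0_bin
        if k < n // 2:
            mask[k] = 1.0
            mask[n - k] = 1.0  # keep the mask symmetric so the output stays real
    return mask


def separate(x, f0_bins, n_harmonics=3):
    """Split x into one channel per fundamental plus a residual channel."""
    n = len(x)
    spectrum = dft(x)
    masks = [harmonic_mask(n, f0, n_harmonics) for f0 in f0_bins]
    # Assumption: a bin claimed by several notes is shared equally between them.
    claims = [sum(m[k] for m in masks) for k in range(n)]
    masks = [[m[k] / claims[k] if claims[k] else 0.0 for k in range(n)]
             for m in masks]
    # The residual filter takes whatever the source filters leave behind.
    residual_mask = [1.0 - sum(m[k] for m in masks) for k in range(n)]
    sources = [idft([m[k] * spectrum[k] for k in range(n)]) for m in masks]
    residual = idft([residual_mask[k] * spectrum[k] for k in range(n)])
    return sources, residual


N = 64


def tone(bin_index, amplitude):
    """A sinusoid that falls exactly on one DFT bin of an N-sample frame."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / N) for t in range(N)]


note_a = [a + b for a, b in zip(tone(4, 1.0), tone(8, 0.5))]  # fundamental + 2nd harmonic
note_b = tone(5, 1.0)
extra = tone(7, 0.2)  # matches neither note, so it should land in the residual
mixture = [a + b + c for a, b, c in zip(note_a, note_b, extra)]

sources, residual = separate(mixture, f0_bins=[4, 5])
rebuilt = [sum(s[t] for s in sources) + residual[t] for t in range(N)]
max_error = max(abs(r - m) for r, m in zip(rebuilt, mixture))
```

In this toy case `max_error` is at the level of floating-point rounding, and the component that matches neither note ends up in `residual`. A practical system would work frame by frame on real audio and would need a far more careful treatment of overlapping partials.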

This extreme example illustrates the basic process for a mix of seven simultaneous violin notes.

The spectrogram emphasises the complexity of this (horrible sounding!) mixture. Clearly, the partials of the individual instruments are densely packed throughout the spectrum, and still have significant amplitude near the Nyquist limit of 22,050 Hz. (sound file).
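To give a rough sense of how densely the partials are packed, the following sketch counts the harmonics of each of the seven notes that fit at or below the Nyquist limit. It assumes a 44,100 Hz sample rate (implied by the 22,050 Hz limit), equal temperament with A4 = 440 Hz, and strictly harmonic partials; real violin tuning and partials will deviate slightly from these idealised values.

```python
import math

SAMPLE_RATE_HZ = 44_100            # assumed; implied by the 22,050 Hz Nyquist limit
NYQUIST_HZ = SAMPLE_RATE_HZ / 2

# MIDI note numbers for the seven notes in the mixture (A4 = 69).
NOTES = {"F5": 77, "Ab5": 80, "A5": 81, "B5": 83, "Db6": 85, "E6": 88, "Gb6": 90}


def fundamental_hz(midi_note):
    """Equal-tempered frequency, assuming A4 = 440 Hz."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12)


def harmonics_up_to_nyquist(f0_hz):
    """Number of integer multiples of f0_hz at or below the Nyquist frequency."""
    return math.floor(NYQUIST_HZ / f0_hz)


counts = {name: harmonics_up_to_nyquist(fundamental_hz(midi))
          for name, midi in NOTES.items()}
total_partials = sum(counts.values())

for name, count in counts.items():
    print(f"{name}: {fundamental_hz(NOTES[name]):8.2f} Hz, {count} harmonics")
print("total:", total_partials)
```

Under these assumptions the seven notes contribute 153 harmonic partials between them, many of which fall close to one another in frequency.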

Despite the many overlaps and interactions between the partials, it is still possible to identify and isolate the seven instruments as below...

Spectrogram of seven simultaneous


Time waveform of a mixture of seven simultaneous violin notes.

Separated violin note F5 (sound file).

Time waveforms of seven separate violins, identified and extracted from the mono original.

Separated violin note Ab5 (sound file).

Separated violin note A5 (sound file).

Separated violin note B5 (sound file).

Separated violin note Db6 (sound file).

Separated violin note E6 (sound file).

Separated violin note Gb6 (sound file).

Residual after separation of all violin notes. (*quiet* sound file)

Remix of all seven separated violin notes, for direct comparison with the original. (sound file)

This example is contrived in the sense that the notes are simultaneous - there is no need to establish the precise timings or to assign the notes to a particular instrument, as is necessary when handling a more complex polyphonic melody. Nevertheless it represents the first 'proof of principle' that it *is* possible to separate quite so many instruments from a complicated mono mixture.

More realistically, the example below is an extract from 'African Breeze' (from the 1985 film 'The Jewel of the Nile'), performed by Hugh Masekela with Jonathan Butler. Here, one track of the original song has been separated into three parts, the first of which corresponds to Hugh Masekela's flugelhorn solo. Having effectively demixed the source into a new three-track master, we now have additional creative control over the content.

Demix of a mono source into three

In the case illustrated above, the single original track has been separated into three individual channels, and then the flugelhorn (top track) has been doubled in strength before remixing the tracks to provide a new result.

Original excerpt from 'African Breeze' (sound file).

Remixed version after separation, doubling the strength of the flugelhorn, and remixing (sound file).

Remixed version after separation, halving the strength of the flugelhorn, and remixing (sound file).
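The remixing step can be sketched as a weighted sum of the separated tracks. This is a minimal illustration with made-up sample values, not the processing chain used for the demos. It assumes that "doubling the strength" means an amplitude gain of 2 (about +6 dB) and "halving" a gain of 0.5; the function and variable names are hypothetical.

```python
def remix(tracks, gains):
    """Sum equal-length tracks sample by sample after applying one gain per track."""
    if len(tracks) != len(gains):
        raise ValueError("need exactly one gain per track")
    length = len(tracks[0])
    if any(len(track) != length for track in tracks):
        raise ValueError("all tracks must have the same length")
    return [sum(gain * track[i] for gain, track in zip(gains, tracks))
            for i in range(length)]


# Placeholder samples standing in for the three separated channels.
flugelhorn = [0.5, -0.25, 0.125, 0.0]
track_two = [0.1, 0.1, -0.1, -0.1]
track_three = [0.0, 0.2, 0.0, -0.2]
tracks = [flugelhorn, track_two, track_three]

unchanged = remix(tracks, [1.0, 1.0, 1.0])  # unity gains give the plain sum
louder = remix(tracks, [2.0, 1.0, 1.0])     # flugelhorn doubled
quieter = remix(tracks, [0.5, 1.0, 1.0])    # flugelhorn halved
```

With unity gains the remix is simply the sum of the separated tracks, so any change in the result comes only from the gain applied to the chosen channel.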

Further publications (PDF format) and demonstration files are available, as below...

Every, M.R. and Szymanski, J.E.,
'A Spectral-Filtering Approach To Music Signal Separation',
Proceedings of the 7th International Conference on Digital Audio Effects (DAFx'04), Naples, Italy, pp. 197-200 (5-8 October 2004).

Associated audio demonstrations

Every, M.R.,
'Separating Harmonic And Inharmonic Note Content From Real Mono Recordings',
Proceedings of the Digital Music Research Network Summer Conference 2005, Glasgow, U.K., pp. 9-13 (23-24 July 2005).


Every, M.R. and Szymanski, J.E.,
'Separation of overlapping impulsive sounds by bandwise noise interpolation'
Proceedings of the 8th International Conference on Digital Audio Effects (DAFx'05), Madrid, Spain, pp. 194-197 (20-22 September 2005).

Associated audio demonstrations

Every, M.R. and Szymanski, J.E.,
'Separation Of Synchronous Pitched Notes By Spectral Filtering Of Harmonics'
IEEE Transactions on Audio, Speech, and Language Processing, 14(5), pp. 1845-1856 (September 2006).

Associated audio demonstrations

Every, M.R.,
'Separation of musical sources and structure from single-channel polyphonic recordings'
Ph.D. thesis, Department of Electronics, University of York, UK (2006).

Associated audio demonstrations

Every, M.R.,
'Discriminating Between Pitched Sources in Music Audio'
IEEE Transactions on Audio, Speech, and Language Processing, 16(2), pp. 267-277 (February 2008).
