Zongyu Yin, Federico Reuben, Susan Stepney, Tom Collins.
Deep learning's shallow gains: a comparative evaluation of algorithms for automatic music generation

Machine Learning, 112:1785-1822, 2023


Deep learning methods are recognised as state-of-the-art for many applications of machine learning. Recently, deep learning methods have emerged as a solution to the task of automatic music generation (AMG) using symbolic tokens in a target style, but their superiority over non-deep learning methods has not been demonstrated. Here, we conduct a listening study to comparatively evaluate several music generation systems along six musical dimensions: stylistic success, aesthetic pleasure, repetition or self-reference, melody, harmony, and rhythm. A range of models, both deep learning algorithms and other methods, are used to generate 30-s excerpts in the style of Classical string quartets and classical piano improvisations. Fifty participants with relatively high musical knowledge rate unlabelled samples of computer-generated and human-composed excerpts for the six musical dimensions. We use non-parametric Bayesian hypothesis testing to interpret the results, allowing the possibility of finding meaningful non-differences between systems' performance. We find that the strongest deep learning method, a reimplemented version of Music Transformer, has equivalent performance to a non-deep learning method, MAIA Markov, demonstrating that to date, deep learning does not outperform other methods for AMG. We also find there still remains a significant gap between any algorithmic method and human-composed excerpts.

  author = "Zongyu Yin and Federico Reuben and Susan Stepney and Tom Collins",
  title = "Deep learning's shallow gains: 
    a comparative evaluation of algorithms for automatic music generation", 
  doi = "10.1007/s10994-023-06309-w",
  volume = 112,
  pages = "1785--1822",
  year = 2023,
  journal = "Machine Learning"