Audio properties of perceived boundaries in music
Data mining tasks such as music indexing, information retrieval, and similarity search require an understanding of how listeners process music internally. Many algorithms for automatically analyzing the structure of recorded music assume that a large change in one or another musical feature suggests a section boundary. However, this assumption has not been tested: while our understanding of how listeners segment melodies has advanced greatly in recent decades, little is known about how this process works with more complex, full-textured pieces of music, or how stable this process is across genres. Knowing how these factors affect the perception of boundaries will help researchers judge the viability of algorithmic approaches with different corpora of music. We present a statistical analysis of a large corpus of recordings whose formal structure was annotated by expert listeners. We find that the acoustic properties of boundaries in these recordings corroborate findings of previous perceptual experiments. Nearly all boundaries correspond to peaks in novelty functions, which measure the rate of change of a musical feature at a particular time scale. Moreover, most of these boundaries match peaks in novelty for several features at several time scales. We observe that the boundary-novelty relationship can vary with listener, time scale, genre, and musical feature. Finally, we show that a boundary profile derived from a collection of novelty functions correlates with the estimated salience of boundaries indicated by listeners.
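To make the notion of a novelty function concrete, the following is a minimal Python sketch, not the method used in the article. It assumes a simple definition: novelty at a frame is the Euclidean distance between the mean feature vector in a window just before the frame and the mean in a window just after it, with the window length standing in for the time scale. The function names `novelty` and `peaks`, the `half_window` parameter, and the toy feature sequence are all hypothetical.

```python
from math import sqrt


def novelty(features, half_window):
    """Assumed definition: distance between the mean feature vector
    over the half_window frames before t and the half_window frames
    starting at t. Frames too close to either end are left at 0.0."""
    n = len(features)
    dim = len(features[0])
    curve = [0.0] * n
    for t in range(half_window, n - half_window + 1):
        before = features[t - half_window:t]
        after = features[t:t + half_window]
        mean_before = [sum(f[d] for f in before) / half_window for d in range(dim)]
        mean_after = [sum(f[d] for f in after) / half_window for d in range(dim)]
        curve[t] = sqrt(sum((a - b) ** 2 for a, b in zip(mean_after, mean_before)))
    return curve


def peaks(curve):
    """Indices of local maxima (strictly above the previous frame,
    at least as high as the next one)."""
    return [
        t
        for t in range(1, len(curve) - 1)
        if curve[t] > curve[t - 1] and curve[t] >= curve[t + 1]
    ]


# Toy one-dimensional feature with a single abrupt change at frame 20.
features = [[0.0]] * 20 + [[1.0]] * 20

# Compute novelty at two hypothetical time scales and locate the peaks.
peaks_short = peaks(novelty(features, 4))
peaks_long = peaks(novelty(features, 8))
```

In this toy case both time scales place their only novelty peak at the frame where the feature changes, which mirrors in miniature the idea of a boundary matching novelty peaks at several time scales.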
IEEE Transactions on Multimedia
Smith, J. B. L., Chuan, C.-H., & Chew, E. (2014). Audio Properties of Perceived Boundaries in Music. IEEE Transactions on Multimedia, 16(5), 1219–1228. https://doi.org/10.1109/TMM.2014.2310706