Paper title
A Perceptual Measure for Evaluating the Resynthesis of Automatic Music Transcriptions
Paper authors
Paper abstract
This study focuses on the perception of music performances when contextual factors, such as room acoustics and the instrument, change. We propose to distinguish the concept of "performance" from that of "interpretation", which expresses the "artistic intention". To assess this distinction, we carried out an experimental evaluation in which 91 subjects were invited to listen to various audio recordings created by resynthesizing MIDI data obtained through Automatic Music Transcription (AMT) systems and a sensorized acoustic piano. During the resynthesis, we simulated different contexts and asked listeners to evaluate how much the interpretation changes when the context changes. Results show that: (1) the MIDI format alone cannot completely capture the artistic intention of a music performance; (2) the usual objective evaluation measures based on MIDI data show low correlations with the average subjective evaluation. To bridge this gap, we propose a novel measure that is meaningfully correlated with the outcome of the tests. In addition, we investigate multimodal machine learning by providing a new score-informed AMT method and propose an approximation algorithm for the $p$-dispersion problem.