Paper Title

GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music

Authors

Qiuqiang Kong, Bochen Li, Jitong Chen, Yuxuan Wang

Abstract

Symbolic music datasets are important for music information retrieval and musical analysis. However, there is a lack of large-scale symbolic datasets for classical piano music. In this article, we create the GiantMIDI-Piano (GP) dataset, which contains 38,700,838 transcribed notes and 10,855 unique solo piano works composed by 2,786 composers. We extract the names of musical works and their composers from the International Music Score Library Project (IMSLP), then search for and download the corresponding audio recordings from the internet. We further create a curated subset containing 7,236 works by 1,787 composers by requiring that the title of each downloaded audio recording contain the composer's surname. We apply a convolutional neural network to detect solo piano works. Then, we transcribe those solo piano recordings into Musical Instrument Digital Interface (MIDI) files using a high-resolution piano transcription system. Each transcribed MIDI file contains the onset, offset, pitch, and velocity attributes of piano notes and pedals. GiantMIDI-Piano includes 90% live performance MIDI files and 10% sequence input MIDI files. We analyse the statistics of GiantMIDI-Piano and report pitch class, interval, trichord, and tetrachord frequencies for six composers from different eras to show that GiantMIDI-Piano can be used for musical analysis. We evaluate the quality of GiantMIDI-Piano in terms of solo piano detection F1 scores, metadata accuracy, and transcription error rates. We release the source code for acquiring the GiantMIDI-Piano dataset at https://github.com/bytedance/GiantMIDI-Piano
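To make the pitch class and interval statistics mentioned in the abstract concrete, the following is a minimal sketch of how such frequencies could be counted from a sequence of MIDI pitch numbers. It is not the authors' code: the function names, the toy input, and the choice to measure intervals between consecutive notes in semitones are all assumptions made for illustration, and the paper's exact definitions may differ.

```python
from collections import Counter

# Names of the 12 pitch classes; MIDI pitch 60 is middle C.
PITCH_CLASS_NAMES = ["C", "C#", "D", "D#", "E", "F",
                     "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_histogram(pitches):
    """Count how often each of the 12 pitch classes occurs.

    `pitches` is a list of MIDI pitch numbers (0-127). Octave
    information is discarded by taking the pitch modulo 12.
    """
    counts = Counter(PITCH_CLASS_NAMES[p % 12] for p in pitches)
    # Return every pitch class, including those with zero occurrences.
    return {name: counts.get(name, 0) for name in PITCH_CLASS_NAMES}


def interval_histogram(pitches):
    """Count signed semitone intervals between consecutive notes.

    Assumption: notes are already ordered by onset time, and an
    interval is simply the pitch difference to the next note.
    """
    return Counter(b - a for a, b in zip(pitches, pitches[1:]))


# Hypothetical toy input: a C major arpeggio going up and back down.
example_pitches = [60, 64, 67, 72, 67, 64, 60]

pc_hist = pitch_class_histogram(example_pitches)
iv_hist = interval_histogram(example_pitches)
```

In practice the pitch list would come from the note events of a transcribed MIDI file; reading those events requires a MIDI parsing library, which is left out here to keep the sketch dependency-free.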
