Paper Title


Setting the rhythm scene: deep learning-based drum loop generation from arbitrary language cues

Paper Authors

Tripodi, Ignacio J.

Paper Abstract


Generative artificial intelligence models can be a valuable aid to music composition and live performance, both to assist the professional musician and to help democratize the music creation process for hobbyists. Here we present a novel method that, given an English word or phrase, generates two bars of a 4-piece drum pattern that embodies the "mood" of the given language cue, or that could be used for an audiovisual scene described by that cue. We envision this tool as a composition aid for electronic music and audiovisual soundtrack production, or as an improvisation tool for live performance. To produce the training samples for this model, besides manually annotating the "scene" or "mood" terms, we designed a novel method to extract the consensus drum track of any song. This consists of a 2-bar, 4-piece drum pattern that represents the song's main percussive motif and can be imported into any music loop device or live-looping software. These two key components (drum pattern generation from a generalizable input, and consensus percussion extraction) present a novel approach to computer-aided composition and provide a stepping stone toward more comprehensive rhythm generation.
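The abstract does not specify the data representation or the consensus-extraction algorithm, so the sketch below is an illustration only, not the authors' method. It assumes a common encoding for a 2-bar, 4-piece drum pattern (a 4 × 32 binary grid on a sixteenth-note raster), a hypothetical majority vote over a song's 2-bar segments as the "consensus" step, and a simple conversion to timed onsets of the kind a loop device or live-looping tool could import. The piece names, grid resolution, and voting threshold are all assumptions.

```python
import numpy as np

PIECES = ["kick", "snare", "closed_hat", "open_hat"]  # assumed 4-piece kit
STEPS = 2 * 16                                        # 2 bars on a sixteenth-note grid

def consensus_pattern(segments: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Majority-vote a stack of 2-bar segments into one consensus pattern.

    segments: (n_segments, 4, 32) binary onset grids, one per 2-bar window of the
              song (onset detection and quantization are assumed to happen upstream).
    Returns a (4, 32) binary grid: a cell is 1 if that piece hits at that step
    in at least `threshold` of the segments.
    """
    return (segments.mean(axis=0) >= threshold).astype(int)

def pattern_to_events(pattern: np.ndarray, bpm: float = 120.0):
    """Turn a (4, 32) grid into (time_in_seconds, piece) onsets, the kind of
    event list a loop device or live-looping tool could import."""
    step_s = 60.0 / bpm / 4.0  # duration of one sixteenth note at the given tempo
    return sorted((step * step_s, PIECES[i])
                  for i, row in enumerate(pattern)
                  for step, hit in enumerate(row) if hit)

# Toy demo: three noisy 2-bar segments of a basic rock groove stand in for
# the per-segment grids a real extraction pipeline would produce.
rng = np.random.default_rng(0)
base = np.zeros((len(PIECES), STEPS), dtype=int)
base[0, ::8] = 1      # kick on beats 1 and 3
base[1, 4::8] = 1     # snare on beats 2 and 4
base[2, ::2] = 1      # closed hi-hat eighth notes
segments = np.stack([np.clip(base + (rng.random(base.shape) < 0.1), 0, 1)
                     for _ in range(3)])

loop = consensus_pattern(segments)
for t, piece in pattern_to_events(loop, bpm=100):
    print(f"{t:6.3f}s  {piece}")
```

Under these assumptions, the same (4, 32) grid could serve both as the target produced by consensus extraction during training and as the output format of the text-conditioned generator; how the language cue is actually embedded and mapped to a pattern is not described in this abstract.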
