Fish sounds: towards the evaluation of marine acoustic biodiversity through data-driven audio source separation

Paper title

Fish sounds: towards the evaluation of marine acoustic biodiversity through data-driven audio source separation

Authors

Michele Mancusi, Nicola Zonca, Emanuele Rodolà, Silvia Zuffi

Abstract

The marine ecosystem is changing at an alarming rate, exhibiting biodiversity loss and the migration of tropical species to temperate basins. Monitoring underwater environments and their inhabitants is of fundamental importance for understanding the evolution of these systems and implementing safeguard policies. However, assessing and tracking biodiversity is often a complex task, especially in large and uncontrolled environments such as the oceans. One of the most popular and effective methods for monitoring marine biodiversity is passive acoustic monitoring (PAM), which employs hydrophones to capture underwater sound. Many aquatic animals produce sounds characteristic of their own species; these signals travel efficiently underwater and can be detected even at great distances. Furthermore, modern technologies are becoming more convenient and precise, allowing for very accurate and careful data acquisition. To date, audio captured with PAM devices is frequently processed manually by marine biologists and interpreted with traditional signal processing techniques to detect animal vocalizations. This is a challenging task, as PAM recordings often span long periods of time. Moreover, one of the causes of biodiversity loss is sound pollution; in data obtained from regions with loud anthropogenic noise, it is hard to separate artificial sounds from fish sounds manually. Nowadays, machine learning and, in particular, deep learning represent the state of the art for processing audio signals. Specifically, sound separation networks are able to identify and separate human voices and musical instruments. In this work, we show that the same techniques can be successfully used to automatically extract fish vocalizations from PAM recordings, opening up the possibility of biodiversity monitoring at a large scale.
