"Seeing Sound": Audio Classification with the Wigner-Ville Distribution and Convolutional Neural Networks

Paper title

"Seeing Sound": Audio Classification with the Wigner-Ville Distribution and Convolutional Neural Networks

Authors

Antonios Marios Christonasis, Stef van Eijndhoven, Peter Duin

Abstract

With big data becoming increasingly available, IoT hardware becoming widely adopted, and AI capabilities becoming more powerful, organizations are continuously investing in sensing. Data from sensor networks are currently combined with sensor fusion and AI algorithms to drive innovation in fields such as self-driving cars. Data from these sensors can be used in numerous use cases, including alerts in urban safety systems for events such as gunshots and explosions. Moreover, diverse types of sensors, such as sound sensors, can be used in low-light conditions or at locations where a camera is not available. This paper investigates the potential of sound-sensor data in an urban context. Technically, we propose a novel approach to classifying sound data using the Wigner-Ville distribution and Convolutional Neural Networks, and we report on the performance of the approach on open-source datasets. The concept and work presented are based on my doctoral thesis, which was performed as part of the Engineering Doctorate program in Data Science at the University of Eindhoven, in collaboration with the Dutch National Police. Additional work on real-world datasets was performed during the thesis but is not presented here due to confidentiality.
