Paper title
Improving spatial cues for hearables using a parameterized binaural CDR estimator
Paper authors
Paper abstract
We investigate a speech enhancement method based on the binaural coherence-to-diffuse power ratio (CDR), which preserves auditory spatial cues for maskers and a broadside target. Conventional CDR estimators typically rely on a mathematical coherence model of the desired signal and/or diffuse noise field in their formulation, which may influence their accuracy in natural environments. This work proposes a new robust and parameterized directional binaural CDR estimator. The estimator is calculated in the time-frequency domain and is based on a geometrical interpretation of the spatial coherence function between the binaural microphone signals. The binaural performance of the new CDR estimator is compared with three state-of-the-art CDR estimators in cocktail-party-like environments and has shown improvements in terms of several objective speech quality metrics such as PESQ and SRMR. We also discuss the benefits of the parameterizable CDR estimator for varying sound environments and briefly reflect on several informal subjective evaluations using a low-latency real-time framework.
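To make the notion of a coherence-based CDR estimate concrete, here is a minimal sketch of a conventional, model-based estimator of the kind the abstract contrasts with. It is not the geometrical estimator proposed in the paper, whose formulation is not given here. The sketch assumes equal signal power at both microphones, a broadside target with coherence 1, and a free-field diffuse noise coherence of the form sin(x)/x; with a head between the microphones the true diffuse coherence would differ. The function names, the smoothing constant `alpha`, and the 0.17 m microphone spacing are all hypothetical choices for illustration.

```python
import math


def diffuse_coherence(freq_hz, mic_distance_m, speed_of_sound=343.0):
    """Free-field diffuse (spherically isotropic) coherence model: sin(x)/x.

    Assumption: no head shadow between the two microphones.
    """
    x = 2.0 * math.pi * freq_hz * mic_distance_m / speed_of_sound
    return 1.0 if x == 0.0 else math.sin(x) / x


def smooth_coherence(left_bins, right_bins, alpha=0.9):
    """Short-time complex coherence of one frequency bin over frames.

    left_bins / right_bins hold the complex STFT coefficients of that bin,
    one per frame. Auto- and cross-power spectra are averaged recursively
    with the (assumed) forgetting factor alpha.
    """
    p_ll = p_rr = 0.0
    p_lr = 0j
    coherence = []
    for x_l, x_r in zip(left_bins, right_bins):
        p_ll = alpha * p_ll + (1.0 - alpha) * abs(x_l) ** 2
        p_rr = alpha * p_rr + (1.0 - alpha) * abs(x_r) ** 2
        p_lr = alpha * p_lr + (1.0 - alpha) * x_l * x_r.conjugate()
        denom = math.sqrt(p_ll * p_rr)
        coherence.append(p_lr / denom if denom > 0.0 else 0j)
    return coherence


def cdr_from_coherence(gamma_x, gamma_n, gamma_s=1.0 + 0j, eps=1e-12):
    """Invert the mixing model gamma_x = (CDR*gamma_s + gamma_n) / (CDR + 1).

    gamma_s = 1 corresponds to the assumed broadside target (zero
    inter-microphone delay). The real part is taken and clipped at zero
    because measured coherence rarely satisfies the model exactly.
    """
    diff = gamma_x - gamma_s
    if abs(diff) < eps:
        return float("inf")  # fully coherent bin: no diffuse component
    return max(0.0, ((gamma_n - gamma_x) / diff).real)
```

In a full enhancement chain, a per-bin CDR value like this would typically be mapped to a real-valued spectral gain applied identically to the left and right signals, which is one common way of leaving interaural cues unchanged; that step is omitted here.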