Paper Title

Psychophysiological Arousal in Young Children Who Stutter: An Interpretable AI Approach

Authors

Harshit Sharma, Yi Xiao, Victoria Tumanova, Asif Salekin

Abstract

This first-of-its-kind study identifies and visualizes second-by-second pattern differences in the physiological arousal of preschool-age children who stutter (CWS) and children who do not stutter (CWNS) while they speak perceptually fluently in two challenging conditions: speaking in stressful situations and narration. The first condition may affect children's speech through high arousal; the second places linguistic, cognitive, and communicative demands on speakers. We collected physiological parameter data from 70 children in the two target conditions. First, we adopt a novel modality-wise multiple-instance-learning (MI-MIL) approach to classify CWS vs. CWNS in the different conditions. The evaluation of this classifier addresses four critical research questions that align with the interests of state-of-the-art speech science studies. Next, we leverage SHAP interpretations of the classifier to visualize the salient, fine-grained, temporal physiological parameters unique to CWS at both the population (group) level and the personalized level. Group-level identification of distinct patterns would enhance our understanding of stuttering etiology and development, while personalized-level identification would enable remote, continuous, real-time assessment of the physiological arousal of children who stutter, which may lead to personalized, just-in-time interventions that improve speech fluency. The presented MI-MIL approach is novel, generalizable to different domains, and executable in real time. Finally, comprehensive evaluations on multiple datasets, covering the presented framework and several baselines, identified notable insights into the physiological arousal of CWS during speech production.
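To make the idea of modality-wise multiple-instance learning more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each physiological modality is a "bag" of per-second feature vectors, that each bag is pooled separately with attention-based pooling (a common MIL choice that is swapped in here, since the abstract does not specify the pooling), and that the pooled vectors are concatenated for a binary CWS vs. CWNS score. The modality names `eda` and `heart_rate`, the function names, the feature sizes, and the random untrained weights are all hypothetical.

```python
import numpy as np

def attention_pool(instances, v, w):
    """Pool a bag of instances (T x d) into one d-vector.

    Returns the pooled vector and the per-instance attention weights,
    which sum to 1 and indicate how much each second contributes.
    """
    scores = np.tanh(instances @ v) @ w      # shape (T,)
    scores = scores - scores.max()           # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ instances, weights

def mi_mil_forward(bags, params):
    """Pool each modality separately, then classify the concatenation."""
    pooled, attention = [], {}
    for name, instances in bags.items():
        emb, weights = attention_pool(instances, params[name]["v"], params[name]["w"])
        pooled.append(emb)
        attention[name] = weights
    z = np.concatenate(pooled)
    logit = z @ params["clf_w"] + params["clf_b"]
    prob = 1.0 / (1.0 + np.exp(-logit))      # sigmoid: score for the CWS class
    return prob, attention

# Hypothetical input: 20 seconds of data for two assumed modalities.
rng = np.random.default_rng(0)
bags = {
    "eda": rng.normal(size=(20, 4)),         # assumed 4 features per second
    "heart_rate": rng.normal(size=(20, 3)),  # assumed 3 features per second
}
# Random, untrained parameters; a real model would learn these from data.
params = {
    name: {"v": rng.normal(size=(x.shape[1], 8)), "w": rng.normal(size=8)}
    for name, x in bags.items()
}
params["clf_w"] = rng.normal(size=sum(x.shape[1] for x in bags.values()))
params["clf_b"] = 0.0

prob, attention = mi_mil_forward(bags, params)
```

In a sketch like this, the per-second attention weights give one coarse view of which moments drive a prediction; the paper itself relies on SHAP interpretations of its classifier for that purpose.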
