Paper Title
Interpretable Machine Learning -- A Brief History, State-of-the-Art and Challenges
Paper Authors
Paper Abstract
We present a brief history of the field of interpretable machine learning (IML), give an overview of state-of-the-art interpretation methods, and discuss challenges. Research in IML has boomed in recent years. Young as the field is, it has roots more than 200 years old in regression modeling and, starting in the 1960s, in rule-based machine learning. Recently, many new IML methods have been proposed, many of them model-agnostic, along with interpretation techniques specific to deep learning and tree-based ensembles. IML methods either directly analyze model components, study sensitivity to input perturbations, or analyze local or global surrogate approximations of the ML model. The field approaches a state of readiness and stability, with many methods not only proposed in research but also implemented in open-source software. However, many important challenges remain for IML, such as dealing with dependent features, causal interpretation, and uncertainty estimation, which need to be resolved for its successful application to scientific problems. A further challenge is the lack of a rigorous definition of interpretability that is accepted by the community. To address these challenges and advance the field, we urge the community to recall the roots of interpretable, data-driven modeling in statistics and (rule-based) ML, and also to consider other areas such as sensitivity analysis, causal inference, and the social sciences.
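To make the abstract's three-way grouping of IML methods concrete, here is a minimal sketch (not from the paper) of two of the families: a perturbation-based sensitivity method (permutation feature importance) and a global surrogate model. The use of scikit-learn, the diabetes dataset, and a random forest as the black-box model are illustrative assumptions, not choices made by the authors.

```python
# Minimal sketch of two IML method families, assuming scikit-learn is available.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical black-box model to be interpreted.
X, y = load_diabetes(return_X_y=True)
black_box = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# 1) Sensitivity to input perturbations: permute one feature at a time and
#    measure how much the model's error increases (permutation importance).
rng = np.random.default_rng(0)
baseline = mean_squared_error(y, black_box.predict(X))
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    importance = mean_squared_error(y, black_box.predict(X_perm)) - baseline
    print(f"feature {j}: importance {importance:.1f}")

# 2) Global surrogate: fit an interpretable model to the black box's
#    predictions and check how faithfully it mimics them.
surrogate = DecisionTreeRegressor(max_depth=3).fit(X, black_box.predict(X))
fidelity = r2_score(black_box.predict(X), surrogate.predict(X))
print(f"surrogate fidelity R^2: {fidelity:.2f}")
```

Both sketches are model-agnostic in the sense the abstract uses the term: they query the black box only through its predictions, so the random forest could be swapped for any other model with a `predict` method.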