Paper title
The Minimum Description Length Principle for Pattern Mining: A Survey
Paper authors
Paper abstract
This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the selection of patterns constitutes a major challenge. The MDL principle, a model selection method grounded in information theory, has been applied to pattern mining with the aim of obtaining compact, high-quality sets of patterns. After outlining relevant concepts from information theory and coding, as well as work on the theory behind MDL and similar principles, we review MDL-based methods for mining various types of data and patterns. Finally, we open a discussion on some issues regarding these methods, and highlight currently active related data analysis problems.