Title
Dimensionality Reduction using Elastic Measures
Authors
Abstract
With the recent surge in big data analytics for hyper-dimensional data, there is renewed interest in dimensionality reduction techniques for machine learning applications. For these methods to deliver performance gains and improve understanding of the underlying data, a proper metric needs to be identified. This step is often overlooked, and metrics are typically chosen without considering the underlying geometry of the data. In this paper, we present a method for incorporating elastic metrics into t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP). We apply our method to functional data, which is uniquely characterized by rotations, parameterization, and scale. If these properties are ignored, the result can be incorrect analysis and poor classification performance. Using our method, we demonstrate improved performance on shape identification tasks for three benchmark data sets (MPEG-7, Car data set, and Plane data set of Thankoor), where we achieve F1 scores of 0.77, 0.95, and 1.00, respectively.
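To make the idea of a geometry-aware metric concrete, the following is a minimal sketch, not the authors' implementation. It assumes the elastic distance is built on the square-root velocity function (SRVF) representation of a curve, which the abstract does not spell out, and it handles only translation and scale. The alignment over rotations and reparameterizations that a full elastic metric requires is deliberately omitted. All function names (`ellipse`, `srvf`, `preshape_distance`, `distance_matrix`) are hypothetical.

```python
import math


def ellipse(a, b, n=200, shift=(0.0, 0.0)):
    """Sample an ellipse with semi-axes a, b at n points (illustrative helper)."""
    return [
        (shift[0] + a * math.cos(2 * math.pi * k / (n - 1)),
         shift[1] + b * math.sin(2 * math.pi * k / (n - 1)))
        for k in range(n)
    ]


def srvf(curve):
    """Unit-norm SRVF q = f' / sqrt(|f'|) of a sampled planar curve.

    Differentiation removes translation; dividing by the L2 norm
    (the square root of the curve length) removes scale.
    """
    dt = 1.0 / (len(curve) - 1)
    q = []
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
        speed = math.hypot(vx, vy)
        s = math.sqrt(speed) if speed > 0 else 1.0  # zero velocity gives q = 0
        q.append((vx / s, vy / s))
    norm = math.sqrt(sum(a * a + b * b for a, b in q) * dt)
    return [(a / norm, b / norm) for a, b in q], dt


def preshape_distance(c1, c2):
    """Arc length between two unit-norm SRVFs on the L2 unit sphere.

    Assumption: both curves have the same number of samples.
    No optimization over rotation or reparameterization is performed.
    """
    q1, dt = srvf(c1)
    q2, _ = srvf(c2)
    inner = sum(a * c + b * d for (a, b), (c, d) in zip(q1, q2)) * dt
    return math.acos(max(-1.0, min(1.0, inner)))


def distance_matrix(curves):
    """Symmetric matrix of pairwise distances between curves."""
    n = len(curves)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = preshape_distance(curves[i], curves[j])
    return D


curves = [
    ellipse(1.0, 1.0),                     # unit circle
    ellipse(3.0, 3.0, shift=(5.0, -2.0)),  # same shape, scaled and translated
    ellipse(2.0, 1.0),                     # a genuinely different shape
]
D = distance_matrix(curves)
```

In this sketch the two circles end up at nearly zero distance despite differing in size and position, while the ellipse is separated from both. A matrix such as `D` could in principle be passed to a t-SNE or UMAP implementation that accepts precomputed distances (for example, scikit-learn's `TSNE` with `metric="precomputed"` and `init="random"`). Whether the paper integrates the metric this way is an assumption.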