Biased Hypothesis Formation From Projection Pursuit

Paper Title

Biased Hypothesis Formation From Projection Pursuit

Authors

John Patterson, Chris Avery, Tyler Grear, Donald J. Jacobs

Abstract

The effect of bias on hypothesis formation is characterized for an automated data-driven projection pursuit neural network to extract and select features for binary classification of data streams. This intelligent exploratory process partitions a complete vector state space into disjoint subspaces to create working hypotheses quantified by similarities and differences observed between two groups of labeled data streams. Data streams are typically time sequenced, and may exhibit complex spatio-temporal patterns. For example, given atomic trajectories from molecular dynamics simulation, the machine's task is to quantify dynamical mechanisms that promote function by comparing protein mutants, some known to function while others are nonfunctional. Utilizing synthetic two-dimensional molecules that mimic the dynamics of functional and nonfunctional proteins, biases are identified and controlled in both the machine learning model and selected training data under different contexts. The refinement of a working hypothesis converges to a statistically robust multivariate perception of the data based on a context-dependent perspective. Including diverse perspectives during data exploration enhances interpretability of the multivariate characterization of similarities and differences.
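
As a rough illustration of the general idea behind projection pursuit for two labeled groups, the sketch below searches random unit directions for the one along which the groups are best separated. It is not the authors' neural network: the random-search strategy, the separation index (mean gap over pooled standard deviation), the function names, and the synthetic Gaussian data are all assumptions made for this example.

```python
import numpy as np

def projection_index(z_a, z_b):
    """Separation of two 1-D projected samples: |mean gap| / pooled std (assumed index)."""
    pooled = np.sqrt(0.5 * (z_a.var() + z_b.var()))
    return abs(z_a.mean() - z_b.mean()) / (pooled + 1e-12)

def pursue_direction(group_a, group_b, n_trials=2000, seed=0):
    """Random-search projection pursuit: return the best unit direction and its index."""
    rng = np.random.default_rng(seed)
    dim = group_a.shape[1]
    best_w, best_score = None, -np.inf
    for _ in range(n_trials):
        w = rng.normal(size=dim)
        w /= np.linalg.norm(w)  # keep candidate directions on the unit sphere
        score = projection_index(group_a @ w, group_b @ w)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score

# Hypothetical stand-ins for "functional" and "nonfunctional" samples:
# the two groups differ only along coordinate 0.
rng = np.random.default_rng(1)
functional = rng.normal(size=(500, 5))
nonfunctional = rng.normal(size=(500, 5))
nonfunctional[:, 0] += 3.0

w, score = pursue_direction(functional, nonfunctional)
print("direction:", np.round(w, 2), "index:", round(float(score), 2))
```

On this toy data the selected direction should load mostly on coordinate 0, the only axis along which the groups differ. Changing the index or the training samples changes which direction wins, which is one simple way to see how model and data choices can bias the resulting hypothesis.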
