Paper title
The continuous categorical: a novel simplex-valued exponential family
Paper authors
Paper abstract
Simplex-valued data appear throughout statistics and machine learning, for example in the context of transfer learning and compression of deep networks. Existing models for this class of data rely on the Dirichlet distribution or other related loss functions; here we show that these standard choices suffer systematically from a number of limitations, including bias and numerical issues that frustrate the use of flexible network models upstream of these distributions. We resolve these limitations by introducing a novel exponential family of distributions for modeling simplex-valued data: the continuous categorical, which arises as a nontrivial multivariate generalization of the recently discovered continuous Bernoulli. Unlike the Dirichlet and other typical choices, the continuous categorical results in a well-behaved probabilistic loss function that produces unbiased estimators, while preserving the mathematical simplicity of the Dirichlet. As well as exploring its theoretical properties, we introduce sampling methods for this distribution that are amenable to the reparameterization trick, and evaluate their performance. Lastly, we demonstrate that the continuous categorical outperforms standard choices empirically, across a simulation study, an applied example on multi-party elections, and a neural network compression task.
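
The abstract presents the continuous categorical as a multivariate generalization of the continuous Bernoulli. As a minimal illustration of that univariate special case only, the sketch below evaluates the continuous Bernoulli log-density on [0, 1], assuming the standard form p(x; lam) = C(lam) * lam^x * (1 - lam)^(1 - x) with C(lam) = 2 * artanh(1 - 2*lam) / (1 - 2*lam) for lam != 1/2 and C(1/2) = 2. The function names and the tolerance around lam = 1/2 are hypothetical choices made for this example and are not taken from the paper.

```python
import math


def cb_log_norm_const(lam: float) -> float:
    """Log normalizing constant log C(lam) of the continuous Bernoulli, lam in (0, 1)."""
    # Near lam = 1/2 the closed form is 0/0; its limit there is C = 2.
    # The 1e-6 tolerance is an illustrative choice, not a tuned value.
    if abs(lam - 0.5) < 1e-6:
        return math.log(2.0)
    u = 1.0 - 2.0 * lam
    return math.log(2.0 * math.atanh(u) / u)


def cb_log_pdf(x: float, lam: float) -> float:
    """Log-density of the continuous Bernoulli at x in [0, 1]."""
    return cb_log_norm_const(lam) + x * math.log(lam) + (1.0 - x) * math.log(1.0 - lam)


def integrate_pdf(lam: float, n: int = 10_000) -> float:
    """Midpoint-rule integral of the density over [0, 1]; should be close to 1."""
    h = 1.0 / n
    return sum(math.exp(cb_log_pdf((i + 0.5) * h, lam)) for i in range(n)) * h
```

Calling `integrate_pdf` for a few values of `lam` is a quick way to check that the normalizing constant is consistent with the density. The continuous categorical replaces the scalar x with a point on the simplex, and its normalizing constant is more involved than this univariate case.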