论文标题
最低协变量不平衡问题的网络流量方法
Network Flow Methods for the Minimum Covariates Imbalance Problem
论文作者
论文摘要
平衡协变量的问题是在观察性研究中出现的,在观察性研究中,一个对照样本和另一组与对照组的分离,是治疗样本。每个样品在任一组中都有几个观察到的标称协变量。每个协变量分区的值或类别将处理和控制样品控制为许多子集,称为\ textit {latver},其中每个级别的样本共享相同的协变量值。我们在这里解决了选择对照组的子集的问题,以便在最大程度上平衡治疗组和对照组所选子集之间的水平的大小,即最小失衡问题。 在这里证明,在两个协变量上的最小不平衡问题通过网络流技术有效地解决。我们提出了问题的整数编程公式,其中约束矩阵完全不可测量,这意味着对问题的线性编程放宽具有所有基本解决方案,尤其是最佳解决方案,即积分。该整数编程公式与最低成本网络流问题有关,该问题可在$ O(n \ cdot(n' + n \ log n))$ steps中解决,对于$ n $,治疗组的大小和$ n'$的大小是对照组的大小。进一步设计了一种更有效的算法,该算法是基于替代性的最大流量,即两局部最小不足问题的公式,该问题以$ O(N'^{3/2} \ log^2n)$步骤运行。
The problem of balancing covariates arises in observational studies where one is given a group of control samples and another group, disjoint from the control group, of treatment samples. Each sample, in either group, has several observed nominal covariates. The values, or categories, of each covariate partition the treatment and control samples to a number of subsets referred to as \textit{levels} where the samples at every level share the same covariate value. We address here a problem of selecting a subset of the control group so as to balance, to the best extent possible, the sizes of the levels between the treatment group and the selected subset of control group, the min-imbalance problem. It is proved here that the min-imbalance problem, on two covariates, is solved efficiently with network flow techniques. We present an integer programming formulation of the problem where the constraint matrix is totally unimodular, implying that the linear programming relaxation to the problem has all basic solutions, and in particular the optimal solution, integral. This integer programming formulation is linked to a minimum cost network flow problem which is solvable in $O(n\cdot (n' + n\log n))$ steps, for $n$ the size of the treatment group and $n'$ the size of the control group. A more efficient algorithm is further devised based on an alternative, maximum flow, formulation of the two-covariate min-imbalance problem, that runs in $O(n'^{3/2}\log^2n)$ steps.