c-lasso -- a Python package for constrained sparse and robust regression and classification

Paper Title

c-lasso -- a Python package for constrained sparse and robust regression and classification

Authors

Léo Simpson, Patrick L. Combettes, Christian L. Müller

Abstract

We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. The underlying statistical forward model is assumed to be of the following form: \[ y = X\beta + \sigma\epsilon \qquad \textrm{subject to} \qquad C\beta = 0 \] Here, $X \in \mathbb{R}^{n\times d}$ is a given design matrix and the vector $y \in \mathbb{R}^{n}$ is a continuous or binary response vector. The matrix $C$ is a general constraint matrix. The vector $\beta \in \mathbb{R}^{d}$ contains the unknown coefficients and $\sigma$ is an unknown scale. Prominent use cases are (sparse) log-contrast regression with compositional data $X$, requiring the constraint $1_d^T \beta = 0$ (Aitchison and Bacon-Shone 1984), and the Generalized Lasso, which is a special case of the described problem (see, e.g., James, Paulson, and Rusmevichientong 2020, Example 3). The c-lasso package provides estimators for inferring unknown coefficients and scale (i.e., perspective M-estimators (Combettes and Müller 2020a)) of the form \[ \min_{\beta \in \mathbb{R}^d, \sigma \in \mathbb{R}_{0}} f\left(X\beta - y, \sigma \right) + \lambda \left\lVert \beta \right\rVert_1 \qquad \textrm{subject to} \qquad C\beta = 0 \] for several convex loss functions $f(\cdot,\cdot)$. This includes the constrained Lasso, the constrained scaled Lasso, and sparse Huber M-estimators with linear equality constraints.
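
To make the optimization problem concrete, the following is a minimal NumPy sketch of the simplest case named in the abstract, the constrained Lasso, applied to the zero-sum constraint $1_d^T \beta = 0$. It is an illustration only and does not use or reproduce the c-lasso API: the function names `soft_threshold` and `constrained_lasso_admm`, the ADMM splitting, the factor 1/2 in the least-squares loss, the fixed scale $\sigma$, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def constrained_lasso_admm(X, y, C, lam, rho=1.0, n_iter=2000, tol=1e-10):
    """Illustrative ADMM solver (hypothetical helper, not the c-lasso code) for

        min_beta 0.5 * ||X beta - y||^2 + lam * ||beta||_1   s.t.   C beta = 0.

    The variable is split as beta = z: beta carries the quadratic loss and the
    equality constraint, z carries the l1 penalty.
    Returns (beta, z); beta satisfies C beta = 0, z is the sparse copy.
    """
    d = X.shape[1]
    k = C.shape[0]
    # KKT matrix of the equality-constrained quadratic beta-update.
    kkt = np.block([[X.T @ X + rho * np.eye(d), C.T],
                    [C, np.zeros((k, k))]])
    Xty = X.T @ y
    beta = np.zeros(d)
    z = np.zeros(d)
    u = np.zeros(d)  # scaled dual variable for the coupling beta = z
    for _ in range(n_iter):
        # beta-update: solve the KKT system, keep the primal part.
        rhs = np.concatenate([Xty + rho * (z - u), np.zeros(k)])
        beta = np.linalg.solve(kkt, rhs)[:d]
        # z-update: proximal step on the l1 penalty.
        z_old = z
        z = soft_threshold(beta + u, lam / rho)
        # Dual update.
        u = u + beta - z
        if np.linalg.norm(beta - z) < tol and np.linalg.norm(z - z_old) < tol:
            break
    return beta, z


# Synthetic example (assumed data): two active coefficients that sum to zero.
rng = np.random.default_rng(0)
n, d = 50, 8
X = rng.standard_normal((n, d))
beta_true = np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)
C = np.ones((1, d))  # zero-sum constraint 1_d^T beta = 0

beta_hat, z_hat = constrained_lasso_admm(X, y, C, lam=1.0)
print("sparse estimate:", np.round(z_hat, 3))
print("max |C beta|:", np.abs(C @ beta_hat).max())
```

In this sketch the constraint is enforced exactly in every `beta` update through the KKT system, while sparsity comes from the soft-thresholding step on `z`; the two copies agree at convergence. The scaled Lasso and Huber variants mentioned in the abstract additionally estimate $\sigma$ or change the loss $f$, which this example does not attempt.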
