Dealing with missing data under stratified sampling designs where strata are study domains

Paper title

Dealing with missing data under stratified sampling designs where strata are study domains

Authors

Rodríguez, Carlos; Nieto-Barajas, Luis; Pérez-Pérez, Carlos

Abstract

A quick count seeks to estimate the voting trends of an election and communicate them to the public on election night. In a quick count, sampling follows a stratified design over polling stations. Voting information arrives gradually, often with no guarantee of obtaining the complete sample, or even any information from every stratum. Nevertheless, accurate interval estimates must be produced from partial information. This becomes more challenging when the strata are also study domains. To produce partial estimates, two strategies are proposed: 1) a Bayesian model that uses a dynamic post-stratification strategy and a single imputation process defined after a thorough analysis of historical voting information, together with a credibility level correction to address the underestimation of the variance; 2) a frequentist alternative that combines standard multiple imputation ideas with classical sampling techniques to obtain estimates under a missing information framework. Both solutions are illustrated and compared using information from the 2021 quick count, whose aim was to estimate the composition of the Chamber of Deputies in Mexico.
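The abstract does not spell out how the frequentist strategy merges multiple imputation with classical sampling. As a rough illustration only, the sketch below assumes the textbook ingredients: a stratified estimator of a mean with finite population correction, and Rubin's rules for pooling results across imputed data sets. It is not the authors' procedure; the function names `stratified_mean` and `rubin_combine` and all numbers are hypothetical.

```python
from statistics import mean, variance


def stratified_mean(strata):
    """Stratified estimate of a population mean and its estimated variance.

    `strata` is a list of (N_h, sample_values) pairs, where N_h is the
    number of units in stratum h (for instance, polling stations).
    Each stratum needs at least two sampled values.
    """
    total = sum(n_pop for n_pop, _ in strata)
    est = 0.0
    var = 0.0
    for n_pop, ys in strata:
        w = n_pop / total            # stratum weight W_h = N_h / N
        n = len(ys)
        est += w * mean(ys)
        # W_h^2 * (1 - n_h / N_h) * s_h^2 / n_h
        var += w ** 2 * (1 - n / n_pop) * variance(ys) / n
    return est, var


def rubin_combine(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance
    return q_bar, u_bar + (1 + 1 / m) * b


# Hypothetical usage: three completed (imputed) versions of one sample.
# The second stratum's values were missing and imputed differently each time.
completed = [
    [(100, [0.4, 0.6]), (300, [0.2, 0.4])],
    [(100, [0.4, 0.6]), (300, [0.3, 0.5])],
    [(100, [0.4, 0.6]), (300, [0.1, 0.5])],
]
results = [stratified_mean(s) for s in completed]
pooled_est, pooled_var = rubin_combine(
    [e for e, _ in results], [v for _, v in results]
)
print(pooled_est, pooled_var)
```

The pooled variance adds a between-imputation term to the average sampling variance, which is how this kind of scheme accounts for the uncertainty introduced by the missing information.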
