Paper title
CausalSim: A Causal Framework for Unbiased Trace-Driven Simulation
Paper authors
Paper abstract
We present CausalSim, a causal framework for unbiased trace-driven simulation. Current trace-driven simulators assume that the interventions being simulated (e.g., a new algorithm) would not affect the validity of the traces. However, real-world traces are often biased by the choices algorithms make during trace collection, and hence replaying traces under an intervention may lead to incorrect results. CausalSim addresses this challenge by learning a causal model of the system dynamics together with latent factors that capture the underlying system conditions during trace collection. It learns these models using an initial randomized control trial (RCT) under a fixed set of algorithms, and then applies them to remove biases from trace data when simulating new algorithms. Key to CausalSim is mapping unbiased trace-driven simulation to a tensor completion problem with extremely sparse observations. By exploiting a basic distributional invariance property present in RCT data, CausalSim enables a novel tensor completion method despite the sparsity of observations. Our extensive evaluation of CausalSim on both real and synthetic datasets, including more than ten months of real data from the Puffer video streaming system, shows that it improves simulation accuracy, reducing errors by 53% and 61% on average compared to expert-designed and supervised learning baselines, respectively. Moreover, CausalSim provides markedly different insights about ABR algorithms compared to the biased baseline simulator, which we validate with a real deployment.
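The bias described in the abstract can be illustrated with a small synthetic sketch. It is not CausalSim's method: the mechanism `observe`, the `EFFICIENCY` values, the policy names, and all function names are hypothetical, and the mechanism is assumed to be known here, whereas CausalSim learns it from RCT data. The sketch only shows two things under those assumptions: random assignment keeps the latent condition's distribution the same across arms, and replaying a trace unchanged under a different action gives a biased result.

```python
import random
from statistics import mean

# Hypothetical mechanism (an assumption for illustration, not from the paper):
# the throughput recorded in a trace depends on a latent network capacity u
# and on the action the algorithm took during collection.
EFFICIENCY = {"small": 0.5, "large": 2.0 / 3.0}
POLICY_ACTION = {"A": "small", "B": "large"}


def observe(u, action):
    """Throughput that would be recorded for latent capacity u under an action."""
    return u * EFFICIENCY[action]


def run_rct(n_sessions, seed=0):
    """Assign each session to policy A or B uniformly at random."""
    rng = random.Random(seed)
    sessions = []
    for _ in range(n_sessions):
        u = rng.uniform(1.0, 5.0)        # latent condition, never observed directly
        policy = rng.choice(["A", "B"])  # random assignment, independent of u
        action = POLICY_ACTION[policy]
        sessions.append({"policy": policy, "action": action,
                         "latent": u, "trace": observe(u, action)})
    return sessions


def naive_replay(session, new_action):
    """Standard trace replay: assume the trace is unaffected by the action."""
    return session["trace"]


def counterfactual_replay(session, new_action):
    """Recover the latent factor, then re-apply the (assumed known) mechanism."""
    u_hat = session["trace"] / EFFICIENCY[session["action"]]
    return observe(u_hat, new_action)


def evaluate(sessions):
    """Simulate policy B's action on traces collected under policy A."""
    arm_a = [s for s in sessions if s["policy"] == "A"]
    arm_b = [s for s in sessions if s["policy"] == "B"]
    truth = [observe(s["latent"], "large") for s in arm_a]
    naive = [naive_replay(s, "large") for s in arm_a]
    causal = [counterfactual_replay(s, "large") for s in arm_a]
    return {
        "latent_mean_A": mean(s["latent"] for s in arm_a),
        "latent_mean_B": mean(s["latent"] for s in arm_b),
        "trace_mean_A": mean(s["trace"] for s in arm_a),
        "trace_mean_B": mean(s["trace"] for s in arm_b),
        "naive_error": mean(abs(n - t) for n, t in zip(naive, truth)),
        "counterfactual_error": mean(abs(c - t) for c, t in zip(causal, truth)),
    }


if __name__ == "__main__":
    for name, value in evaluate(run_rct(10000)).items():
        print(f"{name}: {value:.4f}")
```

In this toy setting the latent means of the two arms are close while the recorded trace means differ, which is the kind of distributional invariance the abstract says RCT data provides. The counterfactual replay is exact only because the mechanism is given; with real traces it would have to be estimated.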