Paper title

Churn prediction in online gambling

Paper authors

Florian Merchie, Damien Ernst

Paper abstract

In business retention, churn prevention has always been a major concern. This work contributes to this domain by formalizing the problem of churn prediction in the context of online gambling as a binary classification task. We also propose an algorithmic answer to this problem based on recurrent neural networks. This algorithm is tested on online gambling data that take the form of time series, which recurrent neural networks can process efficiently. To evaluate the performance of the trained models, standard machine learning metrics were used, such as accuracy, precision, and recall. For this problem in particular, the conducted experiments show that the choice of a specific architecture depends on which metric is given the greatest importance. Architectures using nBRC favour precision, those using LSTM give better recall, while GRU-based architectures achieve higher accuracy and balance the other two metrics. Moreover, further experiments showed that training the networks on only the most recent time-series histories degrades the quality of the results. We also study the performance at later times $t^{\prime} > t$ of models learned at a specific instant $t$. The results show that the performance of models learned at time $t$ remains good at the following instants $t^{\prime} > t$, suggesting that there is no need to refresh the models at a high rate. However, the performance of the models was subject to noticeable variance due to one-off events impacting the data.
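The abstract evaluates churn classifiers with accuracy, precision, and recall. As a minimal sketch of how these three metrics relate on a binary churn task (the label convention, data, and function names below are illustrative assumptions, not taken from the paper):

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = churner)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def evaluate(y_true, y_pred):
    """Accuracy, precision, and recall, as used to compare the RNN architectures."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0  # favoured by nBRC architectures
    recall = tp / (tp + fn) if tp + fn else 0.0     # favoured by LSTM architectures
    return {"accuracy": accuracy, "precision": precision, "recall": recall}

# Hypothetical toy labels: 1 marks a player who churned, 0 one who stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(evaluate(y_true, y_pred))  # → {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75}
```

A model that predicts "no churn" for everyone can still score high accuracy when churners are rare, which is why the paper reports precision and recall alongside it.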
