Generative Adversarial Networks Applied to Synthetic Financial Scenarios Generation

Paper title

Generative Adversarial Networks Applied to Synthetic Financial Scenarios Generation

Authors

Matteo Rizzato, Julien Wallart, Christophe Geissler, Nicolas Morizet, Noureddine Boumlaik

Abstract

The finance industry is producing an increasing number of datasets that investment professionals can consider to be influential on the price of financial assets. These datasets were initially mainly limited to exchange data, namely price, capitalization and volume. Their coverage has now considerably expanded to include, for example, macroeconomic data, supply and demand of commodities, balance sheet data and, more recently, extra-financial data such as ESG scores. This broadening of the factors retained as influential constitutes a serious challenge for statistical modeling. Indeed, the instability of the correlations between these factors makes it practically impossible to identify the joint laws needed to construct scenarios. Fortunately, spectacular advances in the field of Deep Learning in recent years have given rise to GANs. GANs are a type of generative machine learning model that produces new data samples with the same characteristics as the training data distribution in an unsupervised way, avoiding data assumptions and human-induced biases. In this work, we explore the use of GANs for synthetic financial scenario generation. This pilot study is the result of a collaboration between Fujitsu and Advestis, and it will be followed by a thorough exploration of the use cases that can benefit from the proposed solution. We propose a GAN-based algorithm that allows the replication of multivariate data representing several properties (including, but not limited to, price, market capitalization, ESG score, controversy score, ...) of a set of stocks. This approach differs from examples in the financial literature, which are mainly focused on the reproduction of temporal asset price scenarios. We also propose several metrics to evaluate the quality of the data generated by the GANs. This approach is well suited to the generation of scenarios, the time direction simply arising as a subsequent (possibly conditioned) generation of data points drawn from the learned distribution. Our method will make it possible to simulate high-dimensional scenarios (compared to the $\lesssim 10$ features currently employed in most recent use cases), where network complexity is reduced thanks to careful feature engineering and selection. Complete results will be presented in a forthcoming study.
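
For readers unfamiliar with GANs, the adversarial training the abstract refers to can be summarized by the standard minimax objective of the original GAN formulation. This is given only as background: the abstract does not state which GAN variant or loss the authors use, and the symbols below ($G$ for the generator, $D$ for the discriminator, $p_{\mathrm{data}}$ for the training data distribution, $p_z$ for the latent noise prior) are notation assumed here, not taken from the paper.

$$
\min_{G} \max_{D} \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
$$

Under this reading, $x$ would be one multivariate data point (for example, a vector of price, market capitalization, ESG score and controversy score for a stock), and a synthetic scenario would be obtained by sampling $z \sim p_z$ and evaluating $G(z)$.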
