Paper Title

Crude oil price forecasting incorporating news text

Authors

Bai, Yun; Li, Xixi; Yu, Hao; Jia, Suling

Abstract

Sparse and short news headlines can be arbitrary, noisy, and ambiguous, making it difficult for classic topic model LDA (latent Dirichlet allocation) designed for accommodating long text to discover knowledge from them. Nonetheless, some of the existing research about text-based crude oil forecasting employs LDA to explore topics from news headlines, resulting in a mismatch between the short text and the topic model and further affecting the forecasting performance. Exploiting advanced and appropriate methods to construct high-quality features from news headlines becomes crucial in crude oil forecasting. To tackle this issue, this paper introduces two novel indicators of topic and sentiment for the short and sparse text data. Empirical experiments show that AdaBoost.RT with our proposed text indicators, with a more comprehensive view and characterization of the short and sparse text data, outperforms the other benchmarks. Another significant merit is that our method also yields good forecasting performance when applied to other futures commodities.
