Paper Title
FinBERT: A Pretrained Language Model for Financial Communications
Paper Authors
Paper Abstract
Contextual pretrained language models, such as BERT (Devlin et al., 2019), have made significant breakthroughs in various NLP tasks by training on large-scale unlabeled text resources. The financial sector has also accumulated a large amount of financial communication text, yet no pretrained finance-specific language model has been available. In this work, we address this need by pretraining a financial-domain-specific BERT model, FinBERT, on a large corpus of financial communications. Experiments on three financial sentiment classification tasks confirm the advantage of FinBERT over the generic-domain BERT model. The code and pretrained models are available at https://github.com/yya518/FinBERT. We hope they will be useful to practitioners and researchers working on financial NLP tasks.
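Since FinBERT follows the standard BERT architecture, a released checkpoint can be loaded with ordinary tooling. Below is a minimal sketch of running financial sentiment classification with the Hugging Face transformers library; the model identifier "yiyanghkust/finbert-tone" and the example sentence are assumptions for illustration, not details taken from the paper, so substitute whichever checkpoint is distributed through the repository above.

    # Minimal sketch: financial sentiment classification with a FinBERT-style
    # checkpoint via Hugging Face transformers. The hub identifier below is an
    # assumption; replace it with the checkpoint from the paper's repository.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_name = "yiyanghkust/finbert-tone"  # assumed hub identifier
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)

    # A hypothetical financial-communication sentence.
    sentence = "Quarterly revenue grew 20%, beating analyst expectations."
    inputs = tokenizer(sentence, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # Map the highest-scoring logit back to its label name. The label set
    # depends on how the released checkpoint was fine-tuned.
    predicted = model.config.id2label[logits.argmax(dim=-1).item()]
    print(predicted)

Because the checkpoint is a standard sequence-classification head on top of BERT, the same loading pattern works for any of the three sentiment tasks once the corresponding fine-tuned weights are available.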