Paper Title
FinBERT: A Pretrained Language Model for Financial Communications
Paper Authors
Paper Abstract
Contextual pretrained language models, such as BERT (Devlin et al., 2019), have made significant breakthroughs in various NLP tasks by training on large-scale unlabeled text resources. The financial sector has also accumulated a large amount of financial communication text, yet no pretrained finance-specific language model has been available. In this work, we address this need by pretraining a financial-domain-specific BERT model, FinBERT, on a large corpus of financial communications. Experiments on three financial sentiment classification tasks confirm the advantage of FinBERT over the generic-domain BERT model. The code and pretrained models are available at https://github.com/yya518/FinBERT. We hope they will be useful to practitioners and researchers working on financial NLP tasks.
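Since FinBERT follows the standard BERT architecture, a released checkpoint can be loaded with ordinary tooling. Below is a minimal sketch of running financial sentiment classification with the Hugging Face transformers library; the model identifier "yiyanghkust/finbert-tone" and the example sentence are assumptions for illustration, not details taken from the paper, so substitute whichever checkpoint is distributed through the repository above.

    # Minimal sketch: financial sentiment classification with a FinBERT-style
    # checkpoint via Hugging Face transformers. The hub identifier below is an
    # assumption; replace it with the checkpoint from the paper's repository.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_name = "yiyanghkust/finbert-tone"  # assumed hub identifier
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)

    # A hypothetical financial-communication sentence.
    sentence = "Quarterly revenue grew 20%, beating analyst expectations."
    inputs = tokenizer(sentence, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # Map the highest-scoring logit back to its label name. The label set
    # depends on how the released checkpoint was fine-tuned.
    predicted = model.config.id2label[logits.argmax(dim=-1).item()]
    print(predicted)

Because the checkpoint is a standard sequence-classification head on top of BERT, the same loading pattern works for any of the three sentiment tasks once the corresponding fine-tuned weights are available.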