Paper Title

Billions of Parameters Are Worth More Than In-domain Training Data: A Case Study in the Legal Case Entailment Task

Authors

Guilherme Moraes Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Roberto Lotufo, Rodrigo Nogueira

Abstract

Recent work has shown that language models scaled to billions of parameters, such as GPT-3, perform remarkably well in zero-shot and few-shot scenarios. In this work, we experiment with zero-shot models on the legal case entailment task of the COLIEE 2022 competition. Our experiments show that scaling the number of parameters in a language model improves the F1 score of our previous zero-shot result by more than 6 points, suggesting that stronger zero-shot capability may be a characteristic of larger models, at least for this task. Our 3B-parameter zero-shot model outperforms all models, including ensembles, on the COLIEE 2021 test set and also achieves the best single-model performance in the COLIEE 2022 competition, second only to the ensemble composed of the 3B model itself and a smaller version of the same model. Despite the challenges posed by large language models, mainly due to latency constraints in real-time applications, we provide a demonstration of our zero-shot monoT5-3b model being used in production as a search engine, including for legal documents. The code for our submission and the demo of our system are available at https://github.com/neuralmind-ai/coliee and https://neuralsearchx.neuralmind.ai, respectively.
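
For context, monoT5 casts relevance as sequence-to-sequence classification: the model is prompted with a query-document pair and scored by the probability it assigns to the token "true" versus "false" at the first decoding step, which lets entailment be treated as a ranking problem over candidate paragraphs. Below is a minimal sketch of that zero-shot scoring recipe. It assumes the public castorini/monot5-3b-msmarco-10k checkpoint on Hugging Face and the standard monoT5 prompt; the authors' exact checkpoint, preprocessing, and answer-selection rules for COLIEE are not specified in this abstract, so the details are illustrative rather than a reproduction of their pipeline.

```python
# Minimal sketch of monoT5-style zero-shot relevance scoring, applied to
# reranking candidate paragraphs for legal case entailment.
# Assumptions (not confirmed by the abstract): the public checkpoint
# "castorini/monot5-3b-msmarco-10k" and the standard monoT5 prompt.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

MODEL_NAME = "castorini/monot5-3b-msmarco-10k"  # assumed checkpoint
tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME).eval()

def relevance_score(query: str, document: str) -> float:
    """Probability that the model labels the pair 'true' (relevant)."""
    prompt = f"Query: {query} Document: {document} Relevant:"
    inputs = tokenizer(prompt, return_tensors="pt",
                       truncation=True, max_length=512)
    # T5 decoding starts from the pad token; only the first step is needed.
    start = torch.full((1, 1), model.config.decoder_start_token_id)
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=start).logits[0, -1]
    true_id = tokenizer.convert_tokens_to_ids("▁true")
    false_id = tokenizer.convert_tokens_to_ids("▁false")
    # Softmax restricted to {true, false}: score = P("true" | pair).
    return torch.softmax(logits[[true_id, false_id]], dim=0)[0].item()

# Entailment as ranking: score every candidate paragraph against a
# fragment of the base case and keep the highest-scoring ones.
fragment = "fragment of the base case ..."            # placeholder text
candidates = ["candidate paragraph A ...",
              "candidate paragraph B ..."]             # placeholder texts
ranked = sorted(candidates, key=lambda c: relevance_score(fragment, c),
                reverse=True)
print(ranked[0])
```

The 3B checkpoint requires a large GPU, which is also the latency concern the abstract raises for real-time use; for a quick local test, substituting the smaller castorini/monot5-base-msmarco checkpoint leaves the sketch unchanged.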
