Paper Title

In Defense of Cross-Encoders for Zero-Shot Retrieval

Authors

Guilherme Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

Abstract

Bi-encoders and cross-encoders are widely used in many state-of-the-art retrieval pipelines. In this work, we study the generalization ability of these two types of architectures across a wide range of parameter counts, in both in-domain and out-of-domain scenarios. We find that the number of parameters and the early query-document interactions of cross-encoders play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that cross-encoders largely outperform bi-encoders of similar size on several tasks. On the BEIR benchmark, our largest cross-encoder surpasses a state-of-the-art bi-encoder by more than 4 average points. Finally, we show that using bi-encoders as first-stage retrievers provides no gains on out-of-domain tasks compared to a simpler retriever such as BM25. The code is available at https://github.com/guilhermemr04/scaling-zero-shot-retrieval.git
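To make the architectural contrast in the abstract concrete, the sketch below scores a query against two documents with each architecture: the bi-encoder embeds query and documents independently and compares the resulting vectors, while the cross-encoder reads each query-document pair jointly, which is what enables the early query-document interactions discussed above. This is a minimal sketch assuming the sentence-transformers library; the checkpoint names are small public models used as illustrative placeholders, not the models evaluated in the paper.

```python
# Minimal sketch contrasting bi-encoder and cross-encoder scoring.
# Assumes the sentence-transformers library; model names are illustrative
# placeholders, not the checkpoints studied in the paper.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

query = "what causes rainbows"
docs = [
    "Rainbows are caused by refraction and dispersion of sunlight in water droplets.",
    "BM25 is a classic lexical ranking function used in search engines.",
]

# Bi-encoder: query and documents are encoded independently into dense
# vectors; relevance is a similarity between the two embeddings.
bi_encoder = SentenceTransformer("multi-qa-MiniLM-L6-cos-v1")
q_emb = bi_encoder.encode(query, convert_to_tensor=True)
d_emb = bi_encoder.encode(docs, convert_to_tensor=True)
bi_scores = util.cos_sim(q_emb, d_emb)[0]

# Cross-encoder: each (query, document) pair is fed jointly through the
# model, so query and document tokens attend to each other at every layer.
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
cross_scores = cross_encoder.predict([(query, d) for d in docs])

print("bi-encoder scores:   ", bi_scores.tolist())
print("cross-encoder scores:", cross_scores.tolist())
```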
