Paper Title

pNLP-Mixer: an Efficient all-MLP Architecture for Language

Paper Authors

Francesco Fusco, Damian Pascual, Peter Staar, Diego Antognini

Paper Abstract

Large pre-trained language models based on transformer architecture have drastically changed the natural language processing (NLP) landscape. However, deploying those models for on-device applications in constrained devices such as smart watches is completely impractical due to their size and inference cost. As an alternative to transformer-based architectures, recent work on efficient NLP has shown that weight-efficient models can attain competitive performance for simple tasks, such as slot filling and intent classification, with model sizes in the order of the megabyte. This work introduces the pNLP-Mixer architecture, an embedding-free MLP-Mixer model for on-device NLP that achieves high weight-efficiency thanks to a novel projection layer. We evaluate a pNLP-Mixer model of only one megabyte in size on two multi-lingual semantic parsing datasets, MTOP and multiATIS. Our quantized model achieves 99.4% and 97.8% the performance of mBERT on MTOP and multi-ATIS, while using 170x fewer parameters. Our model consistently beats the state-of-the-art of tiny models (pQRNN), which is twice as large, by a margin up to 7.8% on MTOP.
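The abstract describes an MLP-Mixer backbone fed by an embedding-free projection layer. As a rough illustration of what a mixer layer does, below is a minimal sketch of a generic MLP-Mixer block in PyTorch: a token-mixing MLP followed by a channel-mixing MLP, each with a residual connection. This is not the paper's implementation; the novel projection layer is omitted, and all class names, dimensions, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of a generic MLP-Mixer block (not the pNLP-Mixer reference code).
# The embedding-free projection layer from the paper is assumed to produce the
# (batch, num_tokens, hidden_dim) input and is not reproduced here.
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, num_tokens: int, hidden_dim: int,
                 token_mlp_dim: int, channel_mlp_dim: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(hidden_dim)
        # Token-mixing MLP: mixes information across the sequence dimension.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_tokens, token_mlp_dim),
            nn.GELU(),
            nn.Linear(token_mlp_dim, num_tokens),
        )
        self.norm2 = nn.LayerNorm(hidden_dim)
        # Channel-mixing MLP: mixes information across the feature dimension.
        self.channel_mlp = nn.Sequential(
            nn.Linear(hidden_dim, channel_mlp_dim),
            nn.GELU(),
            nn.Linear(channel_mlp_dim, hidden_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_tokens, hidden_dim)
        y = self.norm1(x).transpose(1, 2)       # (batch, hidden_dim, num_tokens)
        y = self.token_mlp(y).transpose(1, 2)   # mix across tokens
        x = x + y                               # residual connection
        x = x + self.channel_mlp(self.norm2(x)) # mix across channels, residual
        return x

# Example usage with illustrative sizes (64 tokens, 256 features per token).
block = MixerBlock(num_tokens=64, hidden_dim=256, token_mlp_dim=256, channel_mlp_dim=512)
out = block(torch.randn(2, 64, 256))
```

Because every weight matrix in such a block has a fixed, input-length-independent size and there is no vocabulary-sized embedding table, the overall parameter count can stay in the megabyte range, which is the weight-efficiency argument the abstract makes.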
