论文标题

阿拉伯世界自然语言处理的全景调查

A Panoramic Survey of Natural Language Processing in the Arab World

论文作者

Darwish, Kareem, Habash, Nizar, Abbas, Mourad, Al-Khalifa, Hend, Al-Natsheh, Huseein T., El-Beltagy, Samhaa R., Bouamor, Houda, Bouzoubaa, Karim, Cavalli-Sforza, Violetta, El-Hajj, Wassim, Jarrar, Mustafa, Mubarak, Hamdy

论文摘要

自然语言一词是指没有故意的人类计划和设计的任何符号通信系统(口语,签名或书面)。这将阿拉伯语和日语等自然语言与人工结构的语言(例如Esperanto或Python)区分开来。自然语言处理(NLP)是人工智能(AI)的子场,致力于建模自然语言以构建语音识别和综合,机器翻译,光学特征识别(OCR),情感分析(SA),问题答案,对话系统,对话系统等。NLP与其他学科的领域相互联系,与计算机科学的相互互动性,语言学,语言学,语言学,Sconsocics,Sconsocconsosic,Sconcement。一些最早的AI应用程序在NLP中(例如,机器翻译);尤其是过去十年(2010-2020)的质量令人难以置信的提高,与过去对过去像科幻小说一样的公众意识,使用和期望的提高相匹配。 NLP研究人员以开发可以应用于所有人类语言的语言独立模型和工具而感到自豪,例如可以使用相同的基本机制和模型为各种语言构建机器翻译系统。但是,现实是,某些语言确实比其他语言(例如印地语和斯瓦希里语)更加关注(例如英语和中文)。阿拉伯语,阿拉伯世界的主要语言,数百万非阿拉伯穆斯林的宗教语言在此连续体的中间。尽管阿拉伯NLP面临许多挑战,但它已经看到了许多成功和发展。接下来,我们将讨论阿拉伯语的主要挑战作为必要的背景,并介绍了阿拉伯语NLP的简短历史。然后,我们调查了许多研究领域,并对阿拉伯NLP的未来进行了批判性讨论。

The term natural language refers to any system of symbolic communication (spoken, signed or written) without intentional human planning and design. This distinguishes natural languages such as Arabic and Japanese from artificially constructed languages such as Esperanto or Python. Natural language processing (NLP) is the sub-field of artificial intelligence (AI) focused on modeling natural languages to build applications such as speech recognition and synthesis, machine translation, optical character recognition (OCR), sentiment analysis (SA), question answering, dialogue systems, etc. NLP is a highly interdisciplinary field with connections to computer science, linguistics, cognitive science, psychology, mathematics and others. Some of the earliest AI applications were in NLP (e.g., machine translation); and the last decade (2010-2020) in particular has witnessed an incredible increase in quality, matched with a rise in public awareness, use, and expectations of what may have seemed like science fiction in the past. NLP researchers pride themselves on developing language independent models and tools that can be applied to all human languages, e.g. machine translation systems can be built for a variety of languages using the same basic mechanisms and models. However, the reality is that some languages do get more attention (e.g., English and Chinese) than others (e.g., Hindi and Swahili). Arabic, the primary language of the Arab world and the religious language of millions of non-Arab Muslims is somewhere in the middle of this continuum. Though Arabic NLP has many challenges, it has seen many successes and developments. Next we discuss Arabic's main challenges as a necessary background, and we present a brief history of Arabic NLP. We then survey a number of its research areas, and close with a critical discussion of the future of Arabic NLP.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源