Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation

Paper Title

Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation

Authors

Giuseppe Russo, Nora Hollenstein, Claudiu Musat, Ce Zhang

Abstract

We introduce CGA, a conditional VAE architecture, to control, generate, and augment text. CGA is able to generate natural English sentences controlling multiple semantic and syntactic attributes by combining adversarial learning with a context-aware loss and a cyclical word dropout routine. We demonstrate the value of the individual model components in an ablation study. The scalability of our approach is ensured through a single discriminator, independently of the number of attributes. We show high quality, diversity and attribute control in the generated sentences through a series of automatic and human assessments. As the main application of our work, we test the potential of this new NLG model in a data augmentation scenario. In a downstream NLP task, the sentences generated by our CGA model show significant improvements over a strong baseline, and a classification performance often comparable to adding the same amount of additional real data.
