Aligning Artificial Intelligence with Humans through Public Policy

Paper Title

Aligning Artificial Intelligence with Humans through Public Policy

Authors

John Nay, James Daily

Abstract

Given that Artificial Intelligence (AI) increasingly permeates our lives, it is critical that we systematically align AI objectives with the goals and values of humans. The human-AI alignment problem stems from the impracticality of explicitly specifying the rewards that AI models should receive for all the actions they could take in all relevant states of the world. One possible solution, then, is to leverage the capabilities of AI models to learn those rewards implicitly from a rich source of data describing human values in a wide range of contexts. The democratic policy-making process produces just such data by developing specific rules, flexible standards, interpretable guidelines, and generalizable precedents that synthesize citizens' preferences over potential actions taken in many states of the world. Therefore, computationally encoding public policies to make them legible to AI systems should be an important part of a socio-technical approach to the broader human-AI alignment puzzle. This Essay outlines research on AI systems that learn structures in policy data that can be leveraged for downstream tasks. As a demonstration of the ability of AI to comprehend policy, we provide a case study of an AI system that predicts the relevance of proposed legislation to any given publicly traded company and its likely effect on that company. We believe this represents the "comprehension" phase of AI and policy, but leveraging policy as a key source of human values to align AI requires "understanding" policy. Solving the alignment problem is crucial to ensuring that AI is beneficial both individually (to the person or group deploying the AI) and socially. As AI systems are given increasing responsibility in high-stakes contexts, integrating democratically-determined policy into those systems could align their behavior with human goals in a way that is responsive to a constantly evolving society.

