In December 2012, the Oxford Future of Humanity Institute sponsored the first conference on the Impacts and Risks of Artificial General Intelligence. I was invited to present a keynote talk on “Autonomous Technology and the Greater Human Good”. The talk was recorded and the video is here. Unfortunately the introduction was cut off, but the bulk of the talk was captured. Here are the talk slides as a pdf file. The abstract was:
Autonomous Technology and the Greater Human Good
Next-generation technologies will make at least some of their decisions autonomously. Self-driving vehicles, rapid financial transactions, military drones, and many other applications will drive the creation of autonomous systems. If implemented well, they have the potential to create enormous wealth and productivity. But if given goals that are too simplistic, autonomous systems can be dangerous. We use the seemingly harmless example of a chess robot to show that autonomous systems with simplistic goals will exhibit drives toward self-protection, resource acquisition, and self-improvement even if these drives are not explicitly built into them. We examine the rational economic underpinnings of these drives and describe the effects of bounded computational power. Given that semi-autonomous systems are likely to be deployed soon and that they can be dangerous when given poor goals, it is urgent to consider three questions: 1) How can we build useful semi-autonomous systems with high confidence that they will not cause harm? 2) How can we detect and protect against poorly designed or malicious autonomous systems? 3) How can we ensure that human values and the greater human good are served by more advanced autonomous systems over the longer term?
1) The unintended consequences of goals can be subtle. The best way to achieve high confidence in a system is to create mathematical proofs of safety and security properties. This entails creating formal models of the hardware and software, but such proofs are only as good as the models. To increase confidence, we need to keep early systems in very restricted and controlled environments. These restricted systems can be used to design freer successors using a kind of “Safe-AI Scaffolding” strategy.
2) Poorly designed and malicious agents are challenging because there is a wide variety of bad goals. We identify six classes: poorly designed, simplistic, greedy, destructive, murderous, and sadistic. The more destructive classes are particularly challenging to negotiate with because they have no positive desires other than surviving in order to cause destruction. We can try to prevent the creation of these agents, to detect and stop them early, or to stop them after they have gained some power. To understand an agent’s decisions in today’s environment, we need to look at the game theory of conflict in ultimate physical systems. The asymmetry between the cost of solving and the cost of checking computational problems allows systems of different power to coexist, and physical analogs of cryptographic techniques are important to maintaining the balance of power. We show how Neyman’s theory of cooperating finite automata and a kind of “Mutually Assured Distraction” can be used to create cooperative social structures.
3) We must also ensure that the social consequences of these systems support the values that are most precious to humanity beyond simple survival. New results in positive psychology are helping to clarify our higher values. Technology based on economic ideas like Coase’s theorem can be used to create a social infrastructure that maximally supports the values we most care about. While there are great challenges, with proper design, the positive potential is immense.
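As a small illustration of the solve-versus-check asymmetry mentioned in point 2, here is a minimal Python sketch. The choice of the subset-sum problem, the function names `check` and `solve`, and the sample numbers are all my own assumptions for illustration; they are not taken from the talk. The idea is only that a weak system can cheaply verify an answer that a stronger system had to search for.

```python
from itertools import combinations


def check(numbers, target, subset):
    """Verify a claimed solution. Cost grows polynomially with the input size."""
    pool = list(numbers)          # copy so each number can be used only once
    for x in subset:
        if x not in pool:
            return False          # claimed element is not available
        pool.remove(x)
    return sum(subset) == target


def solve(numbers, target):
    """Find a solution by brute force. May examine up to 2**n - 1 subsets."""
    for size in range(1, len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == target:
                return list(combo)
    return None                   # no subset adds up to the target


# Hypothetical example data.
numbers = [3, 34, 4, 12, 5, 2]
answer = solve(numbers, 9)        # expensive search
print(answer, check(numbers, 9, answer))  # cheap verification
```

The brute-force `solve` is deliberately naive; faster algorithms exist, but for problems of this kind no known method closes the gap with the cost of checking.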