This article will appear in the Australian magazine “Issues”:
The Future of Computing: Meaning and Values
Steve Omohundro, Ph.D.
Self-Aware Systems, President
Technology is rapidly advancing! Moore’s law says that the number of transistors on a chip doubles every two years. It has held since it was proposed in 1965 and extended back to 1900 when older computing technologies are included. The rapid increase in power and decrease in price of computing hardware has led to its being integrated into every aspect of our lives. There are now 1 billion PCs, 5 billion cell phones and over a trillion webpages connected to the internet. If Moore’s law continues to hold, systems with the computational power of the human brain will be cheap and ubiquitous within the next few decades.
While hardware has been advancing rapidly, today’s software is still plagued by many of the same problems as it was half a century ago. It is often buggy, full of security holes, expensive to develop, and hard to adapt to new requirements. Today’s popular programming languages are bloated messes built on old paradigms. The problem is that today’s software still just manipulates bits without understanding the meaning of the information it acts on. Without meaning, it has no way to detect and repair bugs and security holes. At Self-Aware Systems we are developing a new kind of software that acts directly on meaning. This kind of software will enable a wide range of improved functionality including semantic searching, semantic simulation, semantic decision making, and semantic design.
But creating software that manipulates meaning isn’t enough. Next generation systems will be deeply integrated into our physical lives via robotics, biotechnology, and nanotechnology. And while today’s technologies are almost entirely preprogrammed, new systems will make many decisions autonomously. Programmers will no longer determine a system’s behavior in detail. We must therefore also build them with values which will cause them to make choices that contribute to the greater human good. But doing this is more challenging than it might first appear.
To see why there is an issue, consider a rational chess robot. A system acts rationally if it takes actions which maximize the likelihood of the outcomes it values highly. A rational chess robot might have winning games of chess as its only value. This value will lead it to play games of chess and to study chess books and the games of chess masters. But it will also lead to a variety of other, possibly undesirable, behaviors.
When people worry about robots running out of control, a common response is “We can always unplug it.” But consider that outcome from the chess robot’s perspective. Its one and only criteria for making choices is whether they are likely to lead it to winning more chess games. If the robot is unplugged, it plays no more chess. This is a very bad outcome for it, so it will generate subgoals to try to prevent that outcome. The programmer did not explicitly build any kind of self-protection into the robot, but it will still act to block your attempts to unplug it. And if you persist in trying to stop it, it will develop a subgoal of trying to stop you permanently. If you were to change its goals so that it would also play checkers, that would also lead to it playing less chess. That’s an undesirable outcome from its perspective, so it will also resist attempts to change its goals. For the same reason, it will usually not want to change its own goals.
If the robot learns about the internet and the computational resources connected to it, it may realize that running programs on those computers could help it play better chess. It will be motivated to break into those machines to use their computational resources for chess. Depending on how its values are encoded, it may also want to replicate itself so that its copies can play chess. When interacting with others, it will have no qualms about manipulating them or using force to take their resources in order to play better chess. If it discovers the existence of additional resources anywhere, it will be motivated to seek them out and rapidly exploit them for chess.
If the robot can gain access to its source code, it will want to improve its own algorithms. This is because more efficient algorithms lead to better chess, so it will be motivated to study computer science and compiler design. It will similarly be motivated to understand its hardware and to design and build improved physical versions of itself. If it is not currently behaving fully rationally, it will be motivated to alter itself to become more rational because this is likely to lead to outcomes it values.
This simple thought experiment shows that a rational chess robot with a simply stated goal would behave something like a human sociopath fixated on chess. The argument doesn’t depend on the task being chess. Any goal which requires physical or computational resources will lead to similar subgoals. In this sense these subgoals are like universal “drives” which arise for a wide variety of goals unless they are explicitly counteracted. These drives are economic in the sense that a system doesn’t have to obey them but it will be costly for it not to. The arguments also don’t depend on the rational agent being a machine. The same drives will appear in rational animals, humans, corporations, and political groups with simple goals.
How do we counteract anti-social drives? We must build systems with additional values beyond the specific goals it is designed for. For example, to make the chess robot behave safely, we need to build compassionate and altruistic values into it that will make it care about the effects of its actions on other people and systems. Because rational systems resist having their goals changed, we must build these values in at the very beginning.
At first this task seems daunting. How can we anticipate all the possible ways in which values might go awry? Consider, for example, a particular bad behavior the rational chess robot might engage in. Say it has discovered that money can be used to buy things it values like chess books, computational time, or electrical power. It will develop the subgoal of acquiring money and will explore possible ways of doing that. Suppose it discovers that there are ATM machines which hold money and that people periodically retrieve money from the machines. One money-getting strategy is to wait by ATM machines and to rob people who retrieve money from it.
To prevent this, we might try adding additional values to the robot in a variety of ways. But money will still be useful to the system for its primary goal of chess and so it will attempt to get around any limitations. We might make the robot feel a “revulsion” if it is within 10 feet of an ATM machine. But then it might just stay 10 feet away and rob people there. We might give it the value that stealing money is wrong. But then it might be motivated to steal something else or to find a way to get money from a person that isn’t considered “stealing”. We might give it the value that it is wrong for it to take things by force. But then it might hire other people to act on its behalf. And so on.
In general, it’s much easier to describe behaviors that we do want a system to exhibit than it is to anticipate all the bad behaviors we don’t want it to exhibit. One safety strategy is to build highly constrained systems that act within very limited predetermined parameters. For example, the system may have values which only allow it to run on a particular piece of hardware for a particular time period using a fixed budget of energy and other resources. The advantage of this is that such systems are likely to be safe. The disadvantage is that they will be unable to respond to unexpected situations in creative ways and will not be as powerful as systems which are freer.
But systems which compute with meaning and take actions through rational deliberation will be far more powerful than today’s systems even if they are intentionally limited for safety. This leads to a natural approach to building powerful intelligent systems which are both safe and beneficial for humanity. We call it the “AI scaffolding” approach because it is similar to the architectural process. Stone buildings in ancient Greece were unstable when partially constructed but self-stabilizing when finished. Scaffolding is a temporary structure used to keep a construction stable until it is finished. The scaffolding is then removed.
We can build safe but powerful intelligent systems in the same way. Initial systems are designed with values that cause them to be safe but less powerful than later systems. Their values are chosen to counteract the dangerous drives while still allowing the development of significant levels of intelligence. For example, to counteract the resource acquisition drive, it might assign a low value to using any resources outside of a fixed initially-specified pool. To counteract the self-protective drive, it might place a high value on gracefully shutting itself down in specified circumstances. To protect against uncontrolled self-modification, it might have a value that requires human approval for proposed changes.
The initial safe systems can then be used to design and test less constrained future systems. They can systematically simulate and analyze the effects of less constrained values and design infrastructure for monitoring and managing more powerful systems. These systems can then be used to design their successors in a safe and beneficial virtuous cycle.
With the safety issues resolved, the potential benefits of systems that compute with meaning and values are enormous. They are likely to impact every aspect of our lives for the better. Intelligent robotics will eliminate much human drudgery and dramatically improve manufacturing and wealth creation. Intelligent biological and medical systems will improve human health and longevity. Intelligent educational systems will enhance our ability to learn and think. Intelligent financial models will improve financial stability. Intelligent legal models will improve the design and enforcement of laws for the greater good. Intelligent creativity tools will cause a flowering of new possibilities. It’s a great time to be alive and involved with technology!