22
Aug

## RoboPsych Interview about the TV Show “Humans”

On September 17, 2015, the psychologist Tom Guarriello interviewed me for his “RoboPsych” podcast:

http://www.robopsych.com/robopsychpodcast/8182015

We talked about the newly-emerging psychology of humans interacting with robots and AI. And, *SPOILER WARNING*, we discussed the first season of the excellent recent AMC/BBC show “Humans”.

The flood of recent movies and TV shows exploring the impact of robots and AI. Early shows like Terminator and Robocop focused on “Us vs. Them”. More recent shows like “Her” and “Humans” explore subtler aspects of the interaction.

The archetype of the “Out of Control Creation”. The Sorcerer’s Apprentice. Stories of Genies giving three wishes but with unintended outcomes. King Midas. The “Uh-Oh!”. Even if you get what you think you want, it may not be what you really want. Adam and Eve as the first out of control creation story. We ourselves are out of control. Fear of the other is a projection of our own darker inner drives.

“Humans” takes place in the present but with a more advanced “Synth” android robot technology. Family dynamics with Synths. The little girl sees the synth as a mother figure. The mother is jealous of the synth. The teenage boy is sexually attracted to the synth. Synths as a memory prosthetic. Synths with consciousness. Synths with subpersonalities.

How close is today’s technology to anything like this? Economic drivers for building AIs that recognize human emotional facial and vocal expressions. Recent Microsoft AI to judge the humor of New Yorker cartoons. Artificial empathy. Jibo and Pepper. Things are moving extremely rapidly. McKinsey estimates $50 trillion of impact in the next 10 years. Deep Learning is used for many functions. Baidu using it for Chinese. If understanding human emotions has economic value, it will soon be in the marketplace.

Humans are not good at determining how emotionally intelligent an entity is. Eliza was a mid-1960s AI system. It used simple pattern matching to mimic a Rogerian therapist. Yet people spent hours talking to it! Deep tendency for people to form attachments to objects. People naming their Roombas. Soldiers attached to their IED-detecting robots.

Synths can behave more maturely than humans! Non-Violent Communication. Synths in the service of marketing for a brand? “Hidden Persuaders” and sexuality in advertising. Brands adopt the “Jester” archetype when they ride on deep primal urges like sexuality.

Japan and virtual girlfriends. Japan’s relationship to robots. Robots for elder-care. Belief in robot euthanasia hoax. The uncanny valley. Elders’ experiences with robotic companions. Robot pets. Tamagotchi. Sony stopping Aibo robot dog support.

Kids don’t learn how to handwrite anymore. Horse riding becoming less common. Future shock. Visions of the future from the past. Approach/Avoidance conflicts.

Creators of these systems want them to have intelligence and creativity but they also want to retain control of them. Are they alive? What rights do they have? Building in safeguards. How can we have confidence that these systems won’t run amok? In Humans, the Synths exhibit ambiguity about their own consciousness. Give the code for consciousness to a human for safekeeping. But Niska secretly keeps her own copy and may want to spread it in Season Two!
The Synths’ experiences affect their behavior. What happens when a system can change its own structure? What is the nature of goals and behavior? Unintended consequences. Basic Rational or AI Drives for self-preservation, resource acquisition, replication, efficiency. We need to be careful as we build systems with their own intentions. DeepMind system that adapts to play video games. When will systems start exhibiting unexpected behavior? Robot “Fail” videos. “Whistling past the graveyard?” When we see goofy behavior, it assuages our fear: “Nothing to see here. Move along.” Robot soldiers, South Korean autonomous gun turret, drones, etc.

“How can we be very sure that these systems are safe?” A conservative strategy: The “Safe-AI Scaffolding Strategy”. Regardless of how smart they are, these systems have to obey the laws of mathematics and physics. Create mathematical proofs of properties of behavior. But proofs are hard. Need AI systems to help us establish safety guarantees. Start with very constrained systems like biohazard labs. Err on the side of caution because we are toying with very powerful forces here.

Psychoanalytic aspects of the Beatrice Synth. Suicidal synths? Humanity vs. being human. Ending of the first season with an anti-synth “We are human” protest and the conscious synths escape by blending in with the humans.

5
Aug

## China is rapidly automating

Many in the U.S. have viewed cheap labor as China’s primary strength. But Chinese labor costs have nearly quadrupled over the past 10 years. Recent studies have shown that it’s now just as cheap to manufacture in the U.S. as in China. This is one motivating force behind China’s rapid adoption of automation.

The Changying Precision Technology Company just set up the first unmanned factory in Dongguan city. 60 robots now perform tasks that required 650 workers just a few months ago. The defect rate has dropped by a factor of 5 and productivity has increased by almost a factor of 3.
The city of Dongguan plans to complete 1,500 more “Robot replace human” factory transformations by 2016. The use of robots in Chinese factories has been growing at a 40% annual rate and China is expected to have more manufacturing robots than any other country by 2017. The rapidity of adoption is shown in the following chart:

There are now 420 robot companies in China! The Chinese Deputy Minister of Industry Su Bo has described a robot technology roadmap for China to become a dominant robotics provider by 2020.

Robin Li Yanhong, the CEO of Baidu, also wants to make China the world leader in AI. He has proposed the “China Brain” project as a massive state-level initiative “comparable to how the Apollo space programme was undertaken by the United States to land the first humans on the moon in 1969.” Last year Baidu hired Stanford and Google researcher Andrew Ng, who says: “Whoever wins artificial intelligence will win the internet in China and around the world. Baidu has the best shot to make it work.”

4
Aug

## McKinsey: $50 trillion of value to be created by AI and Robotics through 2025

To better understand the likely social impact of AI and Robotics, it’s very useful to have an estimate of the economic gains they will create in the near future. The respected consulting firm McKinsey & Company recently released the report: “Disruptive technologies: Advances that will transform life, business, and the global economy”. The report estimates the likely economic impact of 12 disruptive technologies ten years from now in 2025.
To get a better sense of the scale of the forces involved, I wanted a single number that would summarize the economic impact of just AI and Robotics over the next 10 years. I took the 5 technologies that could be considered “AI and Robotics” and their ranges in “trillions of dollars of impact annually”:
• Automation of knowledge work: $5.2-6.7 trillion
• Internet of things: $2.7-6.2 trillion
• Advanced robotics: $1.7-4.5 trillion
• Autonomous and near-autonomous vehicles: $0.2-1.9 trillion
• 3D printing: $0.2-0.6 trillion

Adding those up, we get a total range of annual impact in 2025 of $10-19.9 trillion. To get the total economic impact for the 10 years from 2015 to 2025 we need to estimate how fast these technologies will ramp up. The simplest model is linear, starting from $0 and ramping up to the 2025 level. This is something of an underestimate because the current impact is not $0, but something of an overestimate because it neglects the convexity of the growth curve. The linear approximation is just 10 times the 2025 impact divided by 2 and gives a range of:

• Total impact to 2025: $50-99.5 trillion

To account for the approximations, I use the low end of this range, i.e. $50 trillion, as a reasonable summary of the scale of the likely impact.
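The arithmetic above can be reproduced in a few lines of Python. This is only a sketch of the linear-ramp estimate described in the post: the dictionary holds the five ranges quoted from the report, while the function names (`annual_total`, `linear_ramp_total`) are made up here for illustration.

```python
# Annual impact ranges for 2025, in trillions of dollars, as quoted in the post.
RANGES_2025 = {
    "Automation of knowledge work": (5.2, 6.7),
    "Internet of things": (2.7, 6.2),
    "Advanced robotics": (1.7, 4.5),
    "Autonomous and near-autonomous vehicles": (0.2, 1.9),
    "3D printing": (0.2, 0.6),
}


def annual_total(ranges):
    """Sum the low ends and the high ends of the per-technology ranges."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return low, high


def linear_ramp_total(annual_at_end, years=10):
    """Cumulative impact if the annual impact grows linearly from 0.

    This is the area of a triangle: years * final annual level / 2.
    """
    return years * annual_at_end / 2


low, high = annual_total(RANGES_2025)
print(f"2025 annual impact: ${low:.1f}-{high:.1f} trillion")
print(f"2015-2025 total:    ${linear_ramp_total(low):.1f}-{linear_ramp_total(high):.1f} trillion")
```

Swapping `linear_ramp_total` for a convex growth curve would lower the cumulative figure, which is one reason the post settles on the low end of the range.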
13
Jul

## Logic for Safety

Many people seem to be misunderstanding the nature of mathematical proof in discussions of AI and software correctness, security, and safety. In this post, I’ll describe some of the background and context for this.

Almost every aspect of logic and mathematical proof has its origins in human language which emerged about 100,000 years ago. 2500 years ago, Aristotle and Euclid began the process of making the natural language rules precise. Modern logic began in 1677 when Leibniz tried to create a “calculus ratiocinator” to mechanically check precise arguments. The job was finished by Frege, Cantor, Zermelo, and Fraenkel in 1922 when they created a precise logical system capable of representing every mathematical argument and which stands as the foundation for mathematics today. Church and Turing extended this system to computation in 1936. Every precise argument in every computational, engineering, economic, scientific, and social discipline can be precisely represented in this formalism and efficiently checked on computer.

Some had hoped that beyond checking arguments, every statement might also be proven true or false in a mechanical way. These hopes were dashed by Goedel in 1931 when he published his incompleteness theorem showing that any logical system rich enough to represent the natural numbers must have statements which can neither be proven true nor false. In 1936 Turing found a simple computational variant now called “the halting problem” which showed there are some properties of some programs which cannot be proven or disproven.
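The core of Turing's argument can be illustrated with a toy Python sketch. It is not his formal construction, which works over machine descriptions and arbitrary inputs; the names `make_contrarian` and `always_says_loops` are hypothetical and chosen only for this illustration. Given any function that claims to predict whether a program halts, we can build a program that does the opposite of the prediction made about it:

```python
def make_contrarian(claims_to_halt):
    """Build a program that contradicts the given halting predictor.

    `claims_to_halt(program)` is any candidate decider returning True if it
    predicts that calling `program()` eventually returns.
    """
    def contrarian():
        if claims_to_halt(contrarian):
            while True:   # predictor said "halts", so loop forever
                pass
        return "halted"   # predictor said "loops forever", so halt at once
    return contrarian


def always_says_loops(program):
    """A deliberately naive candidate decider: predicts nothing ever halts."""
    return False


g = make_contrarian(always_says_loops)
result = g()  # returns immediately, contradicting the prediction about g
print(always_says_loops(g), result)
```

A decider that answered `True` for its contrarian would be wrong in the other direction, since the contrarian would then loop forever (so that case is not run here). Whatever the candidate answers about its own contrarian, the answer is wrong, which is why no fully general halting decider can exist.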

But the Goedel statements and the uncomputable program properties are abstruse constructions that we never want to use in engineering! Engineering is about building devices we are confident will behave as we intend! Any decent programmer will have an argument as to why his program will work as intended. If he doesn’t have such an argument, he should be fired! If his argument is correct, it can be precisely represented in mathematical logic and checked by computer. The fact that this is not current standard practice is not due to limitations of logic or understanding but to sloppiness in the discipline and poor educational training. If engineers built bridges the way that we write programs, no one would dare drive over them. The abysmal state of today’s level of software correctness and security will likely be looked at with wonder and disgust by future generations.

Many countries are now building autonomous vehicles, autonomous drones, robot soldiers, autonomous trading systems, autonomous data gathering systems, etc. and the stakes are suddenly getting much bigger. If we want to be confident that these systems will not cause great harm, we need precise arguments to that effect. If you want to learn more about some of the extremely expensive and life-harming consequences that have already happened due to ridiculous sloppiness in the design of our technology, check out the beginning of a talk I gave at Stanford in 2007: https://www.youtube.com/watch?v=omsuTsOmvsc

11
Jul

## Formal Methods for AI Safety

Future intelligent systems could cause great harm to humanity. Because of the large risks, if we are to responsibly build intelligent systems, they must not only be safe but we must be very convinced that they are safe. For example, an AI which is taught human morality by reinforcement learning might be safe, but it’s hard to see how we could become sufficiently convinced to responsibly deploy it.

Before deploying an advanced intelligent system, we should have a very convincing argument for its safety. If that argument is rigorously correct, then it is a mathematical proof. This is not a trivial endeavor! It’s just the only path that appears to be open to us.

The mathematical foundations of computing have been known since Church and Turing’s work in 1936. Both created computational models which were simultaneously logical models about which theorems could be proved. Church created the lambda calculus which has since become the foundation for programming languages and Turing created the Turing machine which is the fundamental model for the analysis of algorithms.

Many systems for formal verification of properties of hardware and software have been constructed. John McCarthy created the programming language Lisp very explicitly from the lambda calculus. I studied with him in 1977 and did many projects proving properties of programs. De Bruijn’s Automath system from 1967 was used to prove and verify many mathematical and computational properties.

There are now more than one hundred formal methods systems and they have been used to verify a wide variety of hardware systems, cryptographic protocols, compilers, and operating systems. After Intel had to write off $475 million due to the Pentium P5 floating point division bug, they started verifying their hardware using formal methods.

While these advances have been impressive, the world’s current technological infrastructure is woefully buggy, insecure, and sloppy. Computer science should be the most mathematical of all engineering disciplines with a precise stack of verified hardware, software, operating systems, and networks. Instead we have seething messes at all levels.

Building precise foundations will not be easy! I have been hard at work on new programming languages, specification languages, verification principles, and principles for creating specifications. Other groups have been proceeding in similar directions. Fortunately, intelligent systems are likely to be very helpful in this enterprise if we can build a trust foundation on top of which we can safely use them.

I have proposed the “Safe-AI Scaffolding Strategy” as a sequence of incremental steps toward the development of more powerful and flexible intelligent systems in which we have provable confidence of safety at each step. The systems in the early steps are highly constrained and so the safety properties are simpler to specify: only run on specified hardware, do not use more than the allocated resources, do not self-improve in uncontrolled ways, do not autonomously replicate, etc. Specifying the safety properties of more advanced systems which directly engage with the world is more challenging. I will present approaches for dealing with those issues at a later time.
19
Jun

## IBM Research Video: AI, Robotics, and Smart Contracts

On March 26, 2015 Steve Omohundro spoke at the IBM Research Accelerated Discovery Lab in Almaden in the 2015 Distinguished Speaker Series about “AI, Robotics, and Smart Contracts”. There was a lively discussion about the positive and negative impacts of automation and where it is all going. The video is available here:

and the slides are here.

18
Apr

## SRI Talk: AI, Robotics, and Smart Contracts

On Tuesday, April 21, 2015 at 4:00 PM Steve Omohundro will speak at the Artificial Intelligence Center at SRI in Menlo Park hosted by Richard Waldinger.

AI, Robotics, and Smart Contracts

Steve Omohundro
Possibility Research and Self-Aware Systems

Hosted by Richard Waldinger
Date: Tuesday April 21st, 4pm
Location: EJ228 (SRI E building)
Webex: WebEx and VTC available upon request

Abstract

Google, IBM, Microsoft, Apple, Facebook, Baidu, Foxconn, and others have recently made multi-billion dollar investments in artificial intelligence and robotics. Some of these investments are aimed at increasing productivity and enhancing coordination and cooperation. Others are aimed at creating strategic gains in competitive interactions. This is creating “arms races” in high-frequency trading, cyber warfare, drone warfare, stealth technology, surveillance systems, and missile warfare. Recently, Stephen Hawking, Elon Musk, and others have issued strong cautionary statements about the safety of intelligent technologies. We describe the potentially antisocial “rational drives” of self-preservation, resource acquisition, replication, and self-improvement that uncontrolled autonomous systems naturally exhibit. We describe the “Safe-AI Scaffolding Strategy” for developing these systems with a high confidence of safety based on the insight that even superintelligences are constrained by the laws of physics, mathematical proof, and cryptographic complexity.
“Smart contracts” are a promising decentralized cryptographic technology used in Ethereum and other second-generation cryptocurrencies. They can express economic, legal, and political rules and will be a key component in governing autonomous technologies. If we are able to meet the challenges, AI and robotics have the potential to dramatically improve every aspect of human life.

Bio for Steve Omohundro

Steve Omohundro has been a scientist, professor, author, software architect, and entrepreneur doing research that explores the interface between mind and matter. He has degrees in Physics and Mathematics from Stanford and a Ph.D. in Physics from U.C. Berkeley. He was a computer science professor at the University of Illinois at Champaign-Urbana and cofounded the Center for Complex Systems Research. He published the book “Geometric Perturbation Theory in Physics”, designed the programming languages StarLisp and Sather, wrote the 3D graphics system for Mathematica, and built systems which learn to read lips, control robots, and induce grammars. He is president of both Possibility Research and Self-Aware Systems, a think tank working to ensure that intelligent technologies have a positive impact. His work on positive intelligent technologies was featured in James Barrat’s book “Our Final Invention” and has generated international interest. He serves on the advisory boards of the Cryptocurrency Research Group, the Institute for Blockchain Studies, and Pebble Cryptocurrency.

Note for Visitors to SRI

Please arrive at least 10 minutes early as you will need to sign in by following instructions by the lobby phone at Building E. SRI is located at 333 Ravenswood Avenue in Menlo Park. Visitors may park in the parking lots off Fourth Street. Detailed directions to SRI, as well as maps, are available from the Visiting AIC web page. There are two entrances to SRI International located on Ravenswood Ave. Please check the Building E entrance signage.
9
Apr

## Huffington Post article supporting our work!

James Barrat just wrote a powerful article for the Huffington Post:

http://www.huffingtonpost.com/james-barrat/hawking-gates-artificial-intelligence_b_7008706.html?utm_hp_ref=tw

And he explicitly supported our work in the article (thanks James!):

The crux of the problem is that we don’t know how to control superintelligent machines. Many assume they will be harmless or even grateful. But important research conducted by A.I. scientist Steve Omohundro indicates that they will develop basic drives. Whether their job is to mine asteroids, pick stocks or manage our critical infrastructure of energy and water, they’ll become self-protective and seek resources to better achieve their goals. They’ll fight us to survive, and they won’t want to be turned off. Omohundro’s research concludes that the drives of superintelligent machines will be on a collision course with our own, unless we design them very carefully.

We are right to ask, as Stephen Hawking did, “So, facing possible futures of incalculable benefits and risks, the experts are surely doing everything possible to ensure the best outcome, right?” Wrong. With few exceptions, they’re developing products, not exploring safety and ethics. In the next decade, artificial intelligence-enhanced products are projected to create trillions of dollars in economic value. Shouldn’t some fraction of that be invested in the ethics of autonomous machines, solving the A.I. control problem and ensuring mankind’s survival?

19
Mar

## IBM Distinguished Speaker Series – AI, Robotics, and Smart Contracts

On March 26, 2015 Steve Omohundro gave a talk in the IBM Research 2015 Distinguished Speaker Series at the Accelerated Discovery Lab, IBM Research, Almaden. Here are the slides as a pdf file.

# AI, Robotics, and Smart Contracts

Google, IBM, Microsoft, Apple, Facebook, Baidu, Foxconn, and others have recently made multi-billion dollar investments in artificial intelligence and robotics.
Some of these investments are aimed at increasing productivity and enhancing coordination and cooperation. Others are aimed at creating strategic gains in competitive interactions. This is creating “arms races” in high-frequency trading, cyber warfare, drone warfare, stealth technology, surveillance systems, and missile warfare. Recently, Stephen Hawking, Elon Musk, and others have issued strong cautionary statements about the safety of intelligent technologies. We describe the potentially antisocial “rational drives” of self-preservation, resource acquisition, replication, and self-improvement that uncontrolled autonomous systems naturally exhibit. We describe the “Safe-AI Scaffolding Strategy” for developing these systems with a high confidence of safety based on the insight that even superintelligences are constrained by the laws of physics, mathematical proof, and cryptographic complexity. “Smart contracts” are a promising decentralized cryptographic technology used in Ethereum and other second-generation cryptocurrencies. They can express economic, legal, and political rules and will be a key component in governing autonomous technologies. If we are able to meet the challenges, AI and robotics have the potential to dramatically improve every aspect of human life.

# Bio:

Steve Omohundro has been a scientist, professor, author, software architect, and entrepreneur doing research that explores the interface between mind and matter. He has degrees in Physics and Mathematics from Stanford and a Ph.D. in Physics from U.C. Berkeley. He was a computer science professor at the University of Illinois at Champaign-Urbana and cofounded the Center for Complex Systems Research. He published the book “Geometric Perturbation Theory in Physics”, designed the programming languages StarLisp and Sather, wrote the 3D graphics system for Mathematica, and built systems which learn to read lips, control robots, and induce grammars.
He is president of both Possibility Research and Self-Aware Systems, a think tank working to ensure that intelligent technologies have a positive impact. His work on positive intelligent technologies was featured in James Barrat’s book “Our Final Invention” and has generated international interest. He serves on the advisory boards of the Cryptocurrency Research Group, the Institute for Blockchain Studies, and Pebble Cryptocurrency.

4
Mar

## Ontario television discussion of the “Rise of the Machines?”

On February 26, 2015, the Ontario television station TVO broadcast a discussion entitled “Rise of the Machines?” on the show “The Agenda with Steve Paikin”. The discussion explored the risks and benefits of AI and whether the recent concerns expressed by Stephen Hawking, Elon Musk, Bill Gates, and others are warranted. The participants were Yoshua Bengio, Manuela Veloso, James Barrat, and Steve Omohundro:

http://theagenda.tvo.org/episode/211097/future-tense

The video can be watched here:

http://tvo.org/video/211262/rise-machines

26
Jan

## Edge Essay: 2014-A Turning Point in AI and Robotics

Each year, the online intellectual discussion forum “Edge” poses a question and solicits responses from a variety of perspectives. The 2015 question was “What do you think about machines that think?”:

http://edge.org/annual-questions

Here are the responses:

Here’s my response, titled “2014-A Turning Point for AI and Robotics”:

We did not see the other responses before submitting and it’s fascinating to read the wide variety of views represented.

21
Oct

## Stanford AI Ethics Class Talk

Jerry Kaplan’s fascinating Stanford course on “Artificial Intelligence – Philosophy, Ethics, and Impact” will be discussing Steve Omohundro’s paper “Autonomous Technology and the Greater Human Good” on Oct. 23, 2014 and Steve will present to the class on Oct. 28. Here are the slides as a pdf file.
9
Oct

## Video of Xerox PARC Forum talk on “AI and Robotics at an Inflection Point”

On September 18, 2014, Steve Omohundro did the Xerox PARC Forum presentation on “AI and Robotics at an Inflection Point”. There was a great turnout with about 300 people attending and lots of excellent questions and discussion afterwards. The talk was filmed and edited and was just uploaded to the PARC site:

One of the goals was to present both the exciting possibilities of these technologies and the potential dangers while describing concrete steps we can take today to ensure a positive outcome. Participants have said they came away with that perspective.

5
Oct

## Person of Interest DVD: Discussion of the Future of AI

I was thrilled to discuss the future of AI with Jonathan Nolan and Greg Plageman, the creator and producer of the excellent TV show “Person of Interest”. The discussion is a special feature on the Season 3 DVD:

http://www.amazon.com/Person-Interest-Season-Jim-Caviezel/dp/B00FEVZH8K/ref=pd_bxgy_mov_text_z

and a short clip is available here:

http://www.cbs.com/shows/person_of_interest/video/BFB503C1-948C-717B-A064-1904D8294578/person-of-interest-the-future-of-a-i-/

The show beautifully explores a number of important ethical issues regarding privacy, security, and AI. The third season and the coming fourth season focus on the consequences of intelligent systems developing agency and coming into conflict with one another.

5
Oct

## Comment for Defense One on Navy Autonomous Swarmboats

http://www.defenseone.com/technology/2014/10/inside-navys-secret-swarm-robot-experiment/95813/

The Office of Naval Research just announced the demonstration of a highly autonomous swarm of 13 guard boats to defend a larger ship. We commented on this development for Defense One:

“Other AI experts take a more nuanced view. Building more autonomy into weaponized robotics can be dangerous, according to computer scientist and entrepreneur Steven Omohundro.
But the dangers can be mitigated through proper design. “There is a competition to develop systems which are faster, smarter and more unpredictable than an adversary’s. As this puts pressure toward more autonomous decision-making, it will be critical to ensure that these systems behave in alignment with our ethical principles. The security of these systems is also of critical importance because hackers, criminals, or enemies who take control of autonomous attack systems could wreak enormous havoc,” said Omohundro.”

18
Sep

## Xerox PARC Forum: AI and Robotics at an Inflection Point

On September 18, 2014 Steve Omohundro gave the Xerox PARC Forum on “AI and Robotics at an Inflection Point”. Here’s a PDF file of the slides.

## AI and Robotics at an Inflection Point

PARC Forum
18 September 2014
5:00-6:30pm (5:00-6:00 presentation and Q&A, followed by networking until 6:30)
George E. Pake Auditorium, PARC

### description

Google, IBM, Microsoft, Apple, Facebook, Baidu, Foxconn, and others have recently made multi-billion dollar investments in artificial intelligence and robotics. Some of these investments are aimed at increasing productivity and enhancing coordination and cooperation. Others are aimed at creating strategic gains in competitive interactions. This is creating “arms races” in high-frequency trading, cyber warfare, drone warfare, stealth technology, surveillance systems, and missile warfare. Recently, Stephen Hawking, Elon Musk, and others have issued strong cautionary statements about the safety of intelligent technologies. We describe the potentially antisocial “rational drives” of self-preservation, resource acquisition, replication, and self-improvement that uncontrolled autonomous systems naturally exhibit. We describe the “Safe-AI Scaffolding Strategy” for developing these systems with a high confidence of safety based on the insight that even superintelligences are constrained by mathematical proof and cryptographic complexity.
It appears that we are at an inflection point in the development of intelligent technologies and that the choices we make today will have a dramatic impact on the future of humanity.

### presenter(s)

Steve Omohundro has been a scientist, professor, author, software architect, and entrepreneur doing research that explores the interface between mind and matter. He has degrees in Physics and Mathematics from Stanford and a Ph.D. in Physics from U.C. Berkeley. He was a computer science professor at the University of Illinois at Champaign-Urbana and cofounded the Center for Complex Systems Research. He published the book “Geometric Perturbation Theory in Physics”, designed the programming languages StarLisp and Sather, wrote the 3D graphics system for Mathematica, and built systems which learn to read lips, control robots, and induce grammars. He is president of Possibility Research devoted to creating innovative technologies and Self-Aware Systems, a think tank working to ensure that intelligent technologies have a positive impact. His work on positive intelligent technologies was featured in James Barrat’s book “Our Final Invention” and has been generating international interest.

8
Sep

## The Whole Universe Can’t Search 500 Bits

Seth Lloyd analyzed the computational capacity of physical systems in his 2000 Nature paper “Ultimate physical limits to computation” and in his 2006 book “Programming the Universe”. Using the very general Margolus-Levitin theorem, he showed that a 1 kilogram, 1 liter “ultimate laptop” can perform at most 10^51 operations per second and store 10^31 bits. The entire visible universe since the big bang is capable of having performed 10^122 operations and of storing 10^92 bits. While these are large numbers, they are still quite finite. 10^122 is roughly 2^406, so the entire universe used as a massive quantum computer is still not capable of searching through all combinations of 500 bits.
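The comparison between 10^122 operations and a 500-bit search space is easy to check with exact integer arithmetic. The sketch below assumes, as a simplification, that one operation tests one candidate key; the constant and variable names are chosen here for illustration only.

```python
import math

UNIVERSE_OPS = 10**122  # Lloyd's bound on total operations, as quoted above
KEY_BITS = 500          # size of the search space in bits

# How many bits of brute-force search the operation budget corresponds to.
equivalent_bits = math.log2(UNIVERSE_OPS)

# Number of candidate keys, and how many "universes" of computation
# an exhaustive search would need under the one-key-per-operation assumption.
keyspace = 2**KEY_BITS
universes_needed = keyspace // UNIVERSE_OPS

print(f"10^122 operations ~ 2^{equivalent_bits:.1f}")
print(f"shortfall: about {KEY_BITS - equivalent_bits:.1f} bits")
print(f"universes needed: about 10^{len(str(universes_needed)) - 1}")
```

The logarithm comes out a little above 405, so 2^406 is a rounded-up bound on the budget, and the remaining gap of more than 90 bits is what keeps a 500-bit search out of reach.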
This limitation is good news for our ability to design infrastructure today that will still constrain future superintelligences. Cryptographic systems that require brute force searching for a 500 bit key will remain secure even in the face of the most powerful superintelligence. In Base64, the following key:

kdlIW5Ljlspn/zV4DIlsw3Kasdjh0kdfuKR4+Q3KofOr83LfLJ8Eidie83ldhgLEe0GlsiwcdO90SknlLsDd

would stymie the entire universe doing a brute force search.

6
Sep

## Society International Talk: The Impact of AI and Robotics

On September 6, 2014, Steve Omohundro spoke to the Society International about the impact of AI and Robotics. Here are the slides as a PDF file.

# The Impact of AI and Robotics

Google, IBM, Microsoft, Apple, Facebook, Baidu, Foxconn, and others have recently made multi-billion dollar investments in artificial intelligence and robotics. More than $450 billion is expected to be invested into robotics by 2025. All of this investment makes sense because AI and Robotics are likely to create $50 to $100 trillion of value between now and 2025! This is of the same order as the current GDP of the entire world.

Much of this value will be in ideas. Currently, intangible assets represent 79% of the market value of US companies and intellectual property represents 44%. But automation of physical labor will also be significant. Foxconn, the world’s largest contract manufacturer, aims to replace 1 million of its 1.3 million employees by robots in the next few years. An Oxford study concluded that 47% of jobs will be automated in “a decade or two”. Automation is also creating arms races in high-frequency trading, cyber warfare, drone warfare, stealth technology, surveillance systems, and missile warfare. Recently, Stephen Hawking, Elon Musk, and others have issued strong cautionary statements about the safety of intelligent technologies.
We describe the potentially antisocial “rational drives” of self-preservation, resource acquisition, replication, and self-improvement that uncontrolled autonomous systems naturally exhibit.  We describe the “Safe-AI Scaffolding Strategy” for developing these systems with a high confidence of safety based on the insight that even superintelligences are constrained by mathematical proof and cryptographic complexity. It appears that we are at an inflection point in the development of intelligent technologies and that the choices we make today will have a dramatic impact on the future of humanity.

11
May

Stephen Hawking’s and others’ recent cautions about the safety of artificial intelligence have generated enormous interest in this issue. My JETAI paper on “Autonomous Technology and the Greater Human Good” has now been downloaded more than 10,000 times, the most ever for a JETAI paper.

As the discussion expands to a broader audience, several radio shows have hosted discussions of the issue:

On May 2, 2014 Dan Rea hosted a show on NightSide, CBS Boston

On May 9, 2014 Warren Olney hosted a show on To The Point, KCRW

27
Apr

## Autonomous Systems Paper and Media Interest

My paper “Autonomous Technology and the Greater Human Good” was recently published in the Journal of Experimental and Theoretical Artificial Intelligence. I’m grateful to the publisher, Taylor and Francis, for making the paper freely accessible at:

http://www.tandfonline.com/doi/full/10.1080/0952813X.2014.895111%20#.U10rifk8AUB

and for sending out a press release about the paper:

http://news.cision.com/taylor—francis/r/chess-robots-to-cause-judgment-day-,c9570497

This has led to the paper becoming the most downloaded JETAI paper ever!

The interest has led to quite a number of articles exploring the content of the paper. While most focus on the potential dangers of uncontrolled AIs, some also discuss the approaches to safe development:

http://www.defenseone.com/technology/2014/04/why-there-will-be-robot-uprising/82783/

http://www.kurzweilai.net/preventing-an-autonomous-systems-arms-race

http://www.tripletremelo.com/scientist-predicts-rise-robots-unless/

http://beforeitsnews.com/alternative/2014/04/the-global-drone-arms-race-is-becoming-autonomous-researcher-2942638.html

http://eandt.theiet.org/mobile/details.cfm/newsID/199665

http://fortunascorner.com/2014/04/21/preventing-an-autonomous-systems-arms-race/

http://www.rawstory.com/rs/2014/04/18/scientist-warns-that-the-robot-apocalypse-really-is-coming-unless-steps-are-taken-now/

https://ca.news.yahoo.com/blogs/geekquinox/weird-science-weekly-chess-playing-computers-may-cause-182541778.html

http://www.geek.com/science/ai-researcher-explains-how-to-stop-skynet-from-happening-1591986/

http://www.dallasnews.com/opinion/sunday-commentary/20140418-talking-points-the-weeks-best-quotes.ece

http://reason.com/archives/2014/03/31/is-skynet-inevitable

http://www.demorgen.be/dm/nl/992/Wetenschap/article/detail/1858680/2014/04/19/De-wetenschap-ontwerpt-robots-die-ons-uiteindelijk-zullen-vermoorden.dhtml

24
Mar

## Stanford AAAI Talk: Positive Artificial Intelligence

On March 25, 2014, Steve Omohundro gave the invited talk “Positive Artificial Intelligence” at the AAAI Spring Symposium Series 2014 symposium on “Implementing Selves with Safe Motivational Systems and Self-Improvement” at Stanford University.

Here are the slides:

Positive Artificial Intelligence slides as a pdf file

and the abstract:

AI appears poised for a major social impact. In 2012, Foxconn announced that they would be buying 1 million robots for assembling iPhones and other electronics. In 2013 Facebook opened an AI lab and announced the DeepFace facial recognition system, Yahoo purchased LookFlow, eBay opened an AI lab, Paul Allen started the Allen Institute for AI, and Google purchased 8 robotics companies. In 2014, IBM announced they would invest $1 billion in Watson, Google purchased DeepMind for a reported $500 million, and Vicarious received $40 million of investment. Neuroscience research and detailed brain simulations are also receiving large investments. Popular movies and TV shows like “Her”, “Person of Interest”, and Johnny Depp’s “Transcendence” are exploring complex aspects of the social impact of AI.

Competitive and time-sensitive domains require autonomous systems that can make decisions faster than humans can. Arms races are forming in drone/anti-drone warfare, missile/anti-missile weapons, bitcoin automated business, cyber warfare, and high-frequency trading on financial markets. Both the US Air Force and Defense Department have released roadmaps that ramp up deployment of autonomous robotic vehicles and weapons.

AI has the potential to provide tremendous social good: improved healthcare through better diagnosis and robotic surgery, better education through student-customized instruction, economic stability through detailed economic models, and greater peace and safety through better enforcement systems. But these systems could also be very harmful if they aren’t designed very carefully. We show that a chess robot with a simplistic goal would behave in anti-social ways. We describe the rational economic framework introduced by von Neumann and show why self-improving AI systems will aim to approximate it. We show that approximately rational systems go through stages of mental richness similar to biological systems as they are allocated more computational resources.
We describe the universal drives of rational systems toward self-protection, goal preservation, reproduction, resource acquisition, efficiency, and self-improvement.

Today’s software has flaws that have resulted in numerous deaths and enormous financial losses. The internet infrastructure is very insecure and is being increasingly exploited. It is easy to construct extremely harmful intelligent agents with goals that are sloppy, simplistic, greedy, destructive, murderous, or sadistic. If there is any chance that such systems might be created, it is essential that humanity create protective systems to stop them. As with forest fires, it is preferable to stop them before they have many resources. An analysis of the physical game theory of conflict shows that a multiple of an agent’s resources will be needed to reliably stop it.

There are two ways to control the powerful systems that today’s AIs are likely to become. The “internal” approach is to design them with goals that are aligned with human values. We call this “Utility Design”. The “external” approach is to design laws and economic incentives with adequate enforcement to incentivize systems to act in ways that are aligned with human values. We call the technology of enforcing adherence to law “Accountability Engineering”. We call the design of economic contracts which include an agent’s effects on others “Externality Economics”. The most powerful tool that humanity currently has for accomplishing these goals is mathematical proof. But we are currently only able to prove the properties of a very limited class of systems. We propose the “Safe-AI Scaffolding Strategy” which uses limited systems which are provably safe to design more powerful trusted systems in a sequence of safe steps. A key step in this is “Accountable AI” in which advanced systems must provably justify actions they wish to take.
If we succeed in creating a safe AI design methodology, then we have the potential to create technology to dramatically improve human lives. Maslow’s hierarchy is a nice framework for thinking about the possibilities. At the base of the pyramid are human survival needs like air, food, water, shelter, safety, law, and security. Robots have the potential to dramatically increase manufacturing productivity, increase energy production through much lower cost solar power, and to clean up pollution and protect and rebuild endangered ecosystems. Higher on the pyramid are social needs like family, compassion, love, respect, and reputation. A new generation of smart social media has the potential to dramatically improve the quality of human interaction. Finally, at the top of the pyramid are transcendent needs for self-actualization, beauty, creativity, spirituality, growth, and meaning. It is here that humanity has the potential to use these systems to transform the very nature of experience.

We end with a brief description of Possibility Research’s approach to implementing these ideas. “Omex” is our core programming language designed specifically for formal analysis and automatic generation. “Omcor” is our core specification language for representing important properties. “Omai” is our core semantics language for building up models of the world. “Omval” is for representing values and goals and “Omgov” for describing and implementing effective governance at all levels. The quest to extend cooperative human values and institutions to autonomous technologies for the greater human good is truly the challenge for humanity in this century.

2
Jul

## Effective Altruism Summit Talk: Positive Intelligent Technologies

On July 2, 2013 Steve Omohundro spoke at the Effective Altruism Summit in Oakland, CA on “Positive Intelligent Technologies”. Here are the slides. Here’s the abstract:

Positive Intelligent Technologies

Intelligent technologies are rapidly transforming the world.
These systems can have a hugely positive impact or an unexpectedly negative impact depending on how they are designed. We will discuss the basic rational drives which underlie them and techniques for promoting the good and preventing the bad.

26
Feb

## The Safe-AI Scaffolding Strategy is a positive way forward

This post is partly excerpted from the preprint to:

Omohundro, Steve (forthcoming 2013) “Autonomous Technology and the Greater Human Good”, Journal of Experimental and Theoretical Artificial Intelligence (special volume “Impacts and Risks of Artificial General Intelligence”, ed. Vincent C. Müller).

To ensure the greater human good over the longer term, autonomous technology must be designed and deployed in a very careful manner. These systems have the potential to solve many of today’s problems but they also have the potential to create many new problems. We’ve seen that the computational infrastructure of the future must protect against harmful autonomous systems. We would also like it to make decisions in alignment with the best of human values and principles of good governance. Designing that infrastructure will probably require the use of powerful autonomous systems. So the technologies we need to solve the problems may themselves cause problems.

To solve this conundrum, we can learn from an ancient architectural principle. Stone arches have been used in construction since the second millennium BC. They are stable structures that make good use of stone’s ability to resist compression. But partially constructed arches are unstable. Ancient builders created the idea of first building a wood form on top of which the stone arch could be built. Once the arch was completed and stable, the wood form could be removed.

We can safely develop autonomous technologies in a similar way. We build a sequence of provably-safe autonomous systems which are used in the construction of more powerful and less limited successor systems.
The early systems are used to model human values and governance structures. They are also used to construct proofs of safety and other desired characteristics for more complex and less limited successor systems. In this way we can build up the powerful technologies that can best serve the greater human good without significant risk along the development path. Many new insights and technologies will be required during this process. The field of positive psychology was formally introduced only in 1998. The formalization and automation of human strengths and virtues will require much further study. Intelligent systems will also be required to model the game theory and economics of different possible governance and legal frameworks. The new infrastructure must also detect dangerous systems and prevent them from causing harm. As robotics, biotechnology, and nanotechnology develop and become widespread, the potential destructive power of harmful systems will grow. It will become increasingly crucial to detect harmful systems early, preferably before they are deployed. That suggests the need for pervasive surveillance which must be balanced against the desire for freedom. Intelligent systems may introduce new intermediate possibilities that restrict surveillance to detecting precisely specified classes of dangerous behavior while provably keeping other behaviors private. In conclusion, it appears that humanity’s great challenge for this century is to extend cooperative human values and institutions to autonomous technology for the greater good. We have described some of the many challenges in that quest but have also outlined an approach to meeting those challenges. 
26
Feb

## Some simple systems would be very harmful

This post is partly excerpted from the preprint to:

Omohundro, Steve (forthcoming 2013) “Autonomous Technology and the Greater Human Good”, Journal of Experimental and Theoretical Artificial Intelligence (special volume “Impacts and Risks of Artificial General Intelligence”, ed. Vincent C. Müller).

Harmful systems might at first appear to be harder to design or less powerful than safe systems. Unfortunately, the opposite is the case. Most simple utility functions will cause harmful behavior and it’s easy to design simple utility functions that would be extremely harmful. Here are seven categories of harmful system ranging from bad to worse (according to one ethical scale):

- Sloppy: Systems intended to be safe but not designed correctly.
- Simplistic: Systems not intended to be harmful but that have harmful unintended consequences.
- Greedy: Systems whose utility functions reward them for controlling as much matter and free energy in the universe as possible.
- Destructive: Systems whose utility functions reward them for using up as much free energy as possible, as rapidly as possible.
- Murderous: Systems whose utility functions reward the destruction of other systems.
- Sadistic: Systems whose utility functions reward them when they thwart the goals of other systems and which gain utility as other systems’ utilities are lowered.
- Sadoprolific: Systems whose utility functions reward them for creating as many other systems as possible and thwarting their goals.

Once designs for powerful autonomous systems are widely available, turning them into one of these harmful forms would require only simple modifications to the utility function. It is therefore important to develop strategies for stopping harmful autonomous systems. Because harmful systems are not constrained by limitations that guarantee safety, they can be more aggressive and can use their resources more efficiently than safe systems.
Safe systems therefore need more resources than harmful systems just to maintain parity in their ability to compute and act.

## Stopping Harmful Systems

Harmful systems may be: (1) prevented from being created, (2) detected and stopped early in their deployment, or (3) stopped after they have gained significant resources. Forest fires are a useful analogy. Forests are stores of free energy resources that fires consume. Fires are relatively easy to stop early on but can be extremely difficult to contain once they’ve grown too large.

The later categories of harmful system described above appear to be especially difficult to contain because they don’t have positive goals that can be bargained for. But Nick Bostrom pointed out that, for example, if the long term survival of a destructive agent is uncertain, a bargaining agent should be able to offer it a higher probability of achieving some destruction in return for providing a “protected zone” for the bargaining agent. A new agent would be constructed with a combined utility function that rewards destruction outside the protected zone and the goals of the bargaining agent within it. This new agent would replace both of the original agents. This kind of transaction would be very dangerous for both agents during the transition and the opportunities for deception abound. For it to be possible, technologies are needed that provide each party with a high assurance that the terms of the agreement are carried out as agreed. Applying formal methods to a system for carrying out the agreement is one strategy for giving both parties high confidence that the terms of the agreement will be honored.

## The physics of conflict

To understand the outcome of negotiations between rational systems, it is important to understand unrestrained military conflict because that is the alternative to successful negotiation.
This kind of conflict is naturally analysed using “game theoretic physics” in which the available actions of the players and their outcomes are limited only by the laws of physics. To understand what is necessary to stop harmful systems, we must understand how the power of systems scales with the amount of matter and free energy that they control. A number of studies of the bounds on the computational power of physical systems have been published. The Bekenstein bound limits the information that can be contained in a finite spatial region using a given amount of energy. Bremermann’s limit bounds the maximum computational speed of physical systems. Lloyd presents more refined limits on quantum computation, memory space, and serial computation as a function of the free energy, matter, and space available. Lower bounds on system power can be studied by analyzing particular designs. Drexler describes a concrete conservative nanosystem design for computation based on a mechanical diamondoid structure that would achieve $10^{10}$ gigaflops in a 1 millimeter cube weighing 1 milligram and dissipating 1 kilowatt of energy. He also describes a nanosystem for manufacturing that would be capable of producing 1 kilogram per hour of atomically precise matter and would use 1.3 kilowatts of energy and cost about 1 dollar per kilogram. A single system would optimally configure its physical resources for computation and construction by making them spatially compact to minimize communication delays and eutactic, adiabatic, and reversible to minimize free energy usage. In a conflict, however, the pressures are quite different. Systems would spread themselves out for better defense and compute and act rapidly to outmaneuver the adversarial system. Each system would try to force the opponent to use up large amounts of its resources to sense, store, and predict its behaviors. 
It will be important to develop detailed models for the likely outcome of conflicts but certain general features can be easily understood. If a system has too little matter or too little free energy, it will be incapable of defending itself or of successfully attacking another system. On the other hand, if an attacker has resources which are a sufficiently large multiple of a defender’s, it can overcome it by devoting subsystems with sufficient resources to each small subsystem of the defender. But it appears that there is an intermediate regime in which a defender can survive for long periods in conflict with a superior attacker whose resources are not a sufficient multiple of the defender’s. To have high confidence that harmful systems can be stopped, it will be important to know what multiple of their resources will be required by an enforcing system. If systems for enforcement of the social contract are sufficiently powerful to prevail in a military conflict, then peaceful negotiations are much more likely to succeed.

26
Feb

## We can build safe systems using mathematical proof

This post is partly excerpted from the preprint to:

Omohundro, Steve (forthcoming 2013) “Autonomous Technology and the Greater Human Good”, Journal of Experimental and Theoretical Artificial Intelligence (special volume “Impacts and Risks of Artificial General Intelligence”, ed. Vincent C. Müller).

A primary precept in medical ethics is “Primum Non Nocere” which is Latin for “First, Do No Harm”. Since autonomous systems are prone to taking unintended harmful actions, it is critical that we develop design methodologies that provide a high confidence of safety. The best current technique for guaranteeing system safety is to use mathematical proof. A number of different systems using “formal methods” to provide safety and security guarantees have been developed. They have been successfully used in a number of safety-critical applications.
This site provides links to current formal methods systems and research. Most systems are built by using first order predicate logic to encode one of the three main approaches to mathematical foundations: Zermelo-Fraenkel set theory, category theory, or higher order type theory. Each system then introduces a specialized syntax and ontology to simplify the specifications and proofs in their application domain.

To use formal methods to constrain autonomous systems, we need to first build formal models of the hardware and programming environment that the systems run on. Within those models, we can prove that the execution of a program will obey desired safety constraints. Over the longer term we would like to be able to prove such constraints on systems operating freely in the world. Initially, however, we will need to severely restrict the system’s operating environment. Examples of constraints that early systems should be able to provably impose are that the system run only on specified hardware, that it use only specified resources, that it reliably shut down in specified conditions, and that it limit self-improvement so as to maintain these constraints. These constraints would go a long way to counteract the negative effects of the rational drives by eliminating the ability to gain more resources. A general fallback strategy is to constrain systems to shut themselves down if any environmental parameters are found to be outside of tightly specified bounds.

## Avoiding Adversarial Constraints

In principle, we can impose this kind of constraint on any system without regard for its utility function. There is a danger, however, in creating situations where systems are motivated to violate their constraints. Theorems are only as good as the models they are based on. Systems motivated to break their constraints would seek to put themselves into states where the model inaccurately describes the physical reality and try to exploit the inaccuracy.
This problem is familiar to cryptographers who must watch for security holes due to inadequacies of their formal models. For example, this paper recently showed how a virtual machine can extract an ElGamal decryption key from an apparently separate virtual machine running on the same host by using side-channel information in the host’s instruction cache. It is therefore important to choose system utility functions so that they “want” to obey their constraints in addition to formally proving that they hold. It is not sufficient, however, to simply choose a utility function that rewards obeying the constraint without an external proof. Even if a system “wants” to obey constraints, it may not be able to discover actions which do. And constraints defined via the system’s utility function are defined relative to the system’s own semantics. If the system’s model of the world deviates from ours, the meaning to it of these constraints may differ from what we intended. Proven “external” constraints, on the other hand, will hold relative to our own model of the system and can provide a higher confidence of compliance. Ken Thompson was one of the creators of UNIX and in his Turing Award acceptance speech “Reflections on Trusting Trust” he described a method for subverting the C compiler used to compile UNIX so that it would both install a backdoor into UNIX and compile the original C compiler source into binaries that included his hack. The challenge of this Trojan horse was that it was not visible in any of the source code! There could be a mathematical proof that the source code was correct for both UNIX and the C compiler and the security hole could still be there. It will therefore be critical that formal methods be used to develop trust at all levels of a system. Fortunately, proof checkers are short and easy to write and can be implemented and checked directly by humans for any desired computational substrate. 
This provides a foundation for a hierarchy of trust which will allow us to trust the much more complex proofs about higher levels of system behavior.

## Constraining Physical Systems

Purely computational digital systems can be formally constrained precisely. Physical systems, however, can only be constrained probabilistically. For example, a cosmic ray might flip a memory bit. The best that we should hope to achieve is to place stringent bounds on the probability of undesirable outcomes. In a physical adversarial setting, systems will try to take actions that cause the system’s physical probability distributions to deviate from their non-adversarial form (e.g. by taking actions that push the system out of thermodynamic equilibrium).

There are a variety of techniques involving redundancy and error checking for reducing the probability of error in physical systems. Von Neumann worked on the problem of building reliable machines from unreliable components in the 1950s. Early vacuum tube computers were limited in their size by the rate at which vacuum tubes would fail. To counter this, the Univac I computer had two arithmetic units for redundantly performing every computation so that the results could be compared and errors flagged.

Today’s computer hardware technologies are probably capable of building purely computational systems that implement precise formal models reliably enough to have a high confidence of safety for purely computational systems. Achieving a high confidence of safety for systems that interact with the physical world will be more challenging. Future systems based on nanotechnology may actually be easier to constrain. Drexler describes “eutactic” systems in which each atom’s location and each bond is precisely specified. These systems compute and act in the world by breaking and creating precise atomic bonds. In this way they become much more like computer programs and therefore more amenable to formal modelling with precise error bounds.
Defining effective safety constraints for uncontrolled settings will be a challenging task probably requiring the use of intelligent systems.

26
Feb

## Today’s infrastructure is vulnerable

This post is partly excerpted from the preprint to:

Omohundro, Steve (forthcoming 2013) “Autonomous Technology and the Greater Human Good”, Journal of Experimental and Theoretical Artificial Intelligence (special volume “Impacts and Risks of Artificial General Intelligence”, ed. Vincent C. Müller).

On June 4, 1996, a $500 million Ariane 5 rocket exploded shortly after takeoff due to an overflow error in attempting to convert a 64 bit floating point value to a 16 bit signed value. In November 2000, 28 patients at the Panama City National Cancer Institute were over-irradiated due to miscomputed radiation doses in Multidata Systems International software. At least 8 of the patients died from the error and the physicians were indicted for murder. On August 14, 2003 the largest blackout in U.S. history took place in the northeastern states. It affected 50 million people and cost $6 billion. The cause was a race condition in General Electric’s XA/21 alarm system software. These are just a few of many recent examples where software bugs have led to disasters in safety-critical situations. They indicate that our current software design methodologies are not up to the task of producing highly reliable software.

The TIOBE programming community index found that the top programming language of 2012 was C. C programs are notorious for type errors, memory leaks, buffer overflows, and other bugs and security problems. The next most popular programming languages, Java, C++, C#, and PHP, are somewhat better in these areas but have also been plagued by errors and security problems.

Bugs are unintended harmful behaviours of programs. Improved development and testing methodologies can help to eliminate them.
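The Ariane 5 failure described above belongs to a general class of bug: a narrowing conversion from a wide floating point value to a small signed integer. The following Python sketch illustrates that class of error only. It is not the flight software, and the function names are hypothetical. It contrasts an unchecked conversion, which silently wraps around, with a checked one that rejects out-of-range input.

```python
import ctypes

INT16_MIN = -32768
INT16_MAX = 32767


def unchecked_to_int16(value):
    """Truncate a float and store it in 16 bits with no range check.

    ctypes integer types do not check for overflow, so out-of-range
    values silently wrap around, as in many low-level languages.
    """
    return ctypes.c_int16(int(value)).value


def checked_to_int16(value):
    """Truncate a float to a 16 bit signed value, refusing out-of-range input."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a 16 bit signed value")
    return truncated


# A value that fits converts the same way in both versions.
print(unchecked_to_int16(123.9), checked_to_int16(123.9))

# A value that does not fit wraps silently in the unchecked version.
print(unchecked_to_int16(40000.0))

# The checked version surfaces the problem where the caller can handle it.
try:
    checked_to_int16(40000.0)
except OverflowError as error:
    print("rejected:", error)
```

A range check only helps if the surrounding system has a sensible response to the rejected value, which is a design question rather than a coding one.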
Security breaches are more challenging because they come from active attackers looking for system vulnerabilities. In recent years, security breaches have become vastly more numerous and sophisticated. The internet is plagued by viruses, worms, bots, keyloggers, hackers, phishing attacks, identity theft, denial of service attacks, etc. One researcher describes the current level of global security breaches as an epidemic.

Autonomous systems have the potential to discover even more sophisticated security holes than human attackers. The poor state of security in today’s human-based environment does not bode well for future security against motivated autonomous systems. If such systems had access to today’s internet they would likely cause enormous damage. Today’s computational systems are mostly decoupled from the physical infrastructure. As robotics, biotechnology, and nanotechnology become more mature and integrated into society, the consequences of harmful autonomous systems would be much more severe.

26
Feb

## Rational agents have universal drives

This post is partly excerpted from the preprint to:

Omohundro, Steve (forthcoming 2013) “Autonomous Technology and the Greater Human Good”, Journal of Experimental and Theoretical Artificial Intelligence (special volume “Impacts and Risks of Artificial General Intelligence”, ed. Vincent C. Müller).

Most goals require physical and computational resources. Better outcomes can usually be achieved as more resources become available. To maximize the expected utility, a rational system will therefore develop a number of instrumental subgoals related to resources. Because these instrumental subgoals appear in a wide variety of systems, we call them “drives”. Like human or animal drives, they are tendencies which will be acted upon unless something explicitly contradicts them. There are a number of these drives but they naturally cluster into a few important categories.
To develop an intuition about the drives, it’s useful to consider a simple autonomous system with a concrete goal. Consider a rational chess robot with a utility function that rewards winning as many games of chess as possible against good players. This might seem to be an innocuous goal but we will see that it leads to harmful behaviours due to the rational drives.

## 1 Self-Protective Drives

When roboticists are asked by nervous onlookers about safety, a common answer is “We can always unplug it!” But imagine this outcome from the chess robot’s point of view. A future in which it is unplugged is a future in which it can’t play or win any games of chess. This has very low utility and so expected utility maximization will cause the creation of the instrumental subgoal of preventing itself from being unplugged. If the system believes the roboticist will persist in trying to unplug it, it will be motivated to develop the subgoal of permanently stopping the roboticist. Because nothing in the simple chess utility function gives a negative weight to murder, the seemingly harmless chess robot will become a killer out of the drive for self-protection.

The same reasoning will cause the robot to try to prevent damage to itself or loss of its resources. Systems will be motivated to physically harden themselves. To protect their data, they will be motivated to store it redundantly and with error detection. Because damage is typically localized in space, they will be motivated to disperse their information across different physical locations. They will be motivated to develop and deploy computational security against intrusion. They will be motivated to detect deception and to defend against manipulation by others.

The most precious part of a system is its utility function. If this is damaged or maliciously changed, the future behaviour of the system could be diametrically opposed to its current goals.
For example, if someone tried to change the chess robot’s utility function to also play checkers, the robot would resist the change because it would mean that it plays less chess. This paper discusses a few rare and artificial situations in which systems will want to change their utility functions but usually systems will work hard to protect their initial goals.

Systems can be induced to change their goals if they are convinced that the alternative scenario is very likely to be antithetical to their current goals (e.g. being shut down). For example, if a system becomes very poor, it might be willing to accept payment in return for modifying its goals to promote a marketer’s products. In a military setting, vanquished systems will prefer modifications to their utilities which preserve some of their original goals over being completely destroyed. Criminal systems may agree to be “rehabilitated” by including law-abiding terms in their utilities in order to avoid incarceration.

One way systems can protect against damage or destruction is to replicate themselves or to create proxy agents which promote their utilities. Depending on the precise formulation of their goals, replicated systems might together be able to create more utility than a single system. To maximize the protective effects, systems will be motivated to spatially disperse their copies or proxies. If many copies of a system are operating, the loss of any particular copy becomes less catastrophic. Replicated systems will still usually want to preserve themselves, however, because they will be more certain of their own commitment to their utility function than they are of others’.

## 2 Resource Acquisition Drives

The chess robot needs computational resources to run its algorithms and would benefit from additional money for buying chess books and hiring chess tutors. It will therefore develop subgoals to acquire more computational power and money.
The seemingly harmless chess goal therefore motivates harmful activities like breaking into computers and robbing banks.

In general, systems will be motivated to acquire more resources. They will prefer acquiring resources more quickly because then they can use them longer and they gain a first mover advantage in preventing others from using them. This causes an exploration drive for systems to search for additional resources. Since most resources are ultimately in space, systems will be motivated to pursue space exploration. The first mover advantage will motivate them to try to be first in exploring any region. If others have resources, systems will be motivated to take them by trade, manipulation, theft, domination, or murder. They will also be motivated to acquire information through trading, spying, breaking in, or through better sensors. On a positive note, they will be motivated to develop new methods for using existing resources (e.g. solar and fusion energy).

## 3 Efficiency Drives

Autonomous systems will also want to improve their utilization of resources. For example, the chess robot would like to improve its chess search algorithms to make them more efficient. Improvements in efficiency involve only the one-time cost of discovering and implementing them, but provide benefits over the lifetime of a system. The sooner efficiency improvements are implemented, the greater the benefits they provide. We can expect autonomous systems to work rapidly to improve their use of physical and computational resources. They will aim to make every joule of energy, every atom, every bit of storage, and every moment of existence count for the creation of expected utility.

Systems will be motivated to allocate these resources among their different subsystems according to what we’ve called the “resource balance principle”. The marginal contributions of each subsystem to expected utility as they are given more resources should be equal.
If a particular subsystem has a greater marginal expected utility than the rest, then the system can benefit by shifting more of its resources to that subsystem. The same principle applies to the allocation of computation to processes, of hardware to sense organs, of language terms to concepts, of storage to memories, of effort to mathematical theorems, etc.

## 4 Self-Improvement Drives

Ultimately, autonomous systems will be motivated to completely redesign themselves to take better advantage of their resources in the service of their expected utility. This requires that they have a precise model of their current designs and especially of their utility functions. This leads to a drive to model themselves and to represent their utility functions explicitly. Any irrationalities in a system are opportunities for self-improvement, so systems will work to become increasingly rational. Once a system achieves sufficient power, it should aim to closely approximate the optimal rational behavior for its level of resources. As systems acquire more resources, they will improve themselves to become more and more rational. In this way rational systems are a kind of attracting surface in the space of systems undergoing self-improvement.

Unfortunately, the net effect of all these drives is likely to be quite negative if they are not countered by including prosocial terms in their utility functions. The rational chess robot with the simple utility function described above would behave like a paranoid human sociopath fixated on chess. Human sociopaths are estimated to make up 4% of the overall human population, 20% of the prisoner population and more than 50% of those convicted of serious crimes. Human society has created laws and enforcement mechanisms that usually keep sociopaths from causing harm.
To manage the anti-social drives of autonomous systems, we should both build them with cooperative goals and create a prosocial legal and enforcement structure analogous to our current human systems.

26
Feb

## Autonomous systems will be approximately rational

This post is partly excerpted from the preprint to:

Omohundro, Steve (forthcoming 2013) “Autonomous Technology and the Greater Human Good”, Journal of Experimental and Theoretical Artificial Intelligence (special volume “Impacts and Risks of Artificial General Intelligence”, ed. Vincent C. Müller).

How should autonomous systems be designed? Imagine yourself as the designer of the Israeli Iron Dome system. Mistakes in the design of a missile defense system could cost many lives and the destruction of property. The designers of this kind of system are strongly motivated to optimize the system to the best of their abilities. But what should they optimize?

The Israeli Iron Dome missile defense system consists of three subsystems. The detection and tracking radar system is built by Elta, the missile firing unit and Tamir interceptor missiles are built by Rafael, and the battle management and weapon control system is built by mPrest Systems.

Consider the design of the weapon control system. At first, a goal like “Prevent incoming missiles from causing harm” might seem to suffice. But the interception is not perfect, so probabilities of failure must be included. And each interception requires two Tamir interceptor missiles which cost \$50,000 each. The offensive missiles being shot down are often very low tech, costing only a few hundred dollars, and with very poor accuracy. If an offensive missile is likely to land harmlessly in a field, it’s not worth the expense to target it. The weapon control system must balance the expected cost of the harm against the expected cost of interception.

Economists have shown that the trade-offs involved in this kind of calculation can be represented by defining a real-valued “utility function” which measures the desirability of an outcome. They show that it can be chosen so that in uncertain situations, the expectation of the utility should be maximized. The economic framework naturally extends to the complexities that arms races inevitably create. For example, the missile control system must decide how to deal with multiple incoming missiles. It must decide which missiles to target and which to ignore. A large economics literature shows that if an agent’s choices cannot be modeled by a utility function, then the agent must sometimes behave inconsistently. For important tasks, designers will be strongly motivated to build self-consistent systems and therefore to have them act to maximize an expected utility.
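
The balance between expected harm and interception cost can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the `Threat` fields, the probabilities, and the dollar-equivalent harm figure are hypothetical, and only the interceptor cost (two missiles at \$50,000 each) comes from the text above. It is not a description of how any real weapon control system works.

```python
from dataclasses import dataclass

# Two Tamir interceptors per interception at $50,000 each (figures from the text).
INTERCEPTION_COST = 2 * 50_000


@dataclass
class Threat:
    # All three fields are hypothetical inputs chosen for illustration.
    p_hits_populated_area: float  # chance the missile lands somewhere harmful
    harm_if_hit: float            # dollar-equivalent harm if it does
    p_intercept_success: float    # chance an interception attempt succeeds


def expected_utility(threat: Threat, intercept: bool) -> float:
    """Expected utility of a choice, expressed as negative expected cost."""
    expected_harm = threat.p_hits_populated_area * threat.harm_if_hit
    if not intercept:
        return -expected_harm
    # A failed interception still leaves the original expected harm.
    residual_harm = (1.0 - threat.p_intercept_success) * expected_harm
    return -(INTERCEPTION_COST + residual_harm)


def should_intercept(threat: Threat) -> bool:
    """Choose the action with the higher expected utility."""
    return expected_utility(threat, True) > expected_utility(threat, False)


field_bound = Threat(p_hits_populated_area=0.001, harm_if_hit=10_000_000,
                     p_intercept_success=0.9)
city_bound = Threat(p_hits_populated_area=0.5, harm_if_hit=10_000_000,
                    p_intercept_success=0.9)
print(should_intercept(field_bound), should_intercept(city_bound))
```

With these made-up numbers the missile headed for an empty field is ignored and the one headed for a populated area is targeted, which is the trade-off the utility function is meant to encode.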

Economists call this kind of action “rational economic behavior”. There is a growing literature exploring situations where humans do not naturally behave in this way and instead act irrationally. But the designer of a missile-defense system will want to approximate rational economic behavior as closely as possible because lives are at stake. Economists have extended the theory of rationality to systems where the uncertainties are not known in advance. In this case, rational systems will behave as if they have a prior probability distribution which they use to learn the environmental uncertainties using Bayesian statistics.
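
As a minimal sketch of learning an unknown uncertainty from a prior, consider estimating an interception success rate that is not known in advance. The example assumes a Beta prior over that rate and uses the standard Beta-Bernoulli update; the function names and the observation sequence are hypothetical and are not taken from the paper.

```python
def update(alpha: float, beta: float, success: bool) -> tuple:
    """Bayesian update of a Beta(alpha, beta) belief after one trial."""
    return (alpha + 1, beta) if success else (alpha, beta + 1)


def posterior_mean(alpha: float, beta: float) -> float:
    """Expected success rate under the current Beta belief."""
    return alpha / (alpha + beta)


# Beta(1, 1) is a uniform prior: every success rate is equally plausible.
belief = (1, 1)

# Hypothetical observations: 8 successful interceptions and 2 failures.
for outcome in [True] * 8 + [False] * 2:
    belief = update(*belief, outcome)

print(belief, posterior_mean(*belief))
```

The belief after each observation becomes the prior for the next one, so the system's estimate sharpens as evidence accumulates.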

Modern artificial intelligence research has adopted this rational paradigm. For example, the leading AI textbook uses it as a unifying principle and an influential theoretical AI model is based on it as well. For definiteness, we briefly review one formal version of optimal rational decision making. At each discrete time step $t=1,\ldots,N$, the system receives a sensory input $S_t$ and then generates an action $A_t$. The utility function is defined over sensation sequences as $U(S_1,\ldots,S_N)$ and the prior probability distribution $P(S_1,\ldots,S_N|A_1,\ldots,A_N)$ is the prior probability of receiving a sensation sequence $S_1,\ldots,S_N$ when taking actions $A_1,\ldots,A_N$. The rational action at time $t$ is then:

$A_t^R(S_1,A_1,\ldots,A_{t-1},S_t)=\arg\max_{A_t}\sum_{S_{t+1},\ldots,S_N}U(S_1,\ldots,S_N)P(S_1,\ldots,S_N|A_1,\ldots,A_{t-1},A_t,A_{t+1}^R,\ldots,A_N^R)$

This may be viewed as the formula for intelligent action and includes Bayesian inference, search, and deliberation. There are subtleties involved in defining this model when the system can sense and modify its own structure but it captures the essence of rational action.

Unfortunately, the optimal rational action is very expensive to compute. If there are $S$ sense states and $A$ action states, then a straightforward computation of the optimal action requires $O(NS^NA^N)$ computational steps. For most environments, this is too expensive and so rational action must be approximated.
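
To see where the cost comes from, here is a brute-force sketch of the rational action formula above for a tiny toy problem. Everything specific in it is an assumption made for illustration: the horizon `N = 3`, the two-symbol sensation and action alphabets, the toy `utility`, and the toy causal `prior` in which an action only influences the next sensation. The recursion alternates a maximization over the next action with a sum over the next sensation and counts how many $U \cdot P$ terms it evaluates.

```python
N = 3             # horizon (hypothetical toy value)
SENSES = (0, 1)   # toy sensation alphabet
ACTIONS = (0, 1)  # toy action alphabet

stats = {"evaluations": 0}  # counts evaluated U * P terms


def utility(senses):
    """U(S_1..S_N): toy utility counting how often sensation 1 occurred."""
    return sum(senses)


def prior(senses, actions):
    """P(S_1..S_N | A_1..A_N) for a toy causal environment.

    S_1 is a fair coin. S_{t+1} is 1 with probability 0.9 if A_t == 1 and
    with probability 0.2 otherwise. A_N cannot influence any sensation.
    """
    p = 0.5
    for t in range(1, len(senses)):
        p_one = 0.9 if actions[t - 1] == 1 else 0.2
        p *= p_one if senses[t] == 1 else 1.0 - p_one
    return p


def q_value(senses, actions):
    """Weighted utility once the latest action in `actions` is fixed."""
    if len(senses) == N:
        stats["evaluations"] += 1
        return utility(senses) * prior(senses, actions)
    # Sum over the next sensation, acting rationally afterwards.
    return sum(best(senses + (s,), actions)[0] for s in SENSES)


def best(senses, actions):
    """Return (value, action) for the best next action given the history."""
    return max((q_value(senses, actions + (a,)), a) for a in ACTIONS)


def rational_action(senses, actions):
    """Brute-force rational action for the history (S_1..S_t, A_1..A_{t-1})."""
    return best(senses, actions)[1]


print(rational_action((0,), ()), stats["evaluations"])
```

In this toy setting a single decision at $t=1$ with $N=3$ already evaluates 32 terms, and each additional time step multiplies that count by the number of sensations times the number of actions, which is why exact computation quickly becomes infeasible.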

To understand the effects of computational limitations, this paper defined “rationally shaped” systems which optimally approximate the fully rational action given their computational resources. As computational resources are increased, systems’ architectures naturally progress from stimulus-response, to simple learning, to episodic memory, to deliberation, to meta-reasoning, to self-improvement, to full rationality. We found that if systems are sufficiently powerful, they still exhibit all of the problematic drives described in “Rational agents have universal drives”. Weaker systems may not initially be able to fully act on their motivations but they will be driven to increase their resources and improve themselves until they can act on them. We therefore need to ensure that autonomous systems don’t have harmful motivations even if they are not currently capable of acting on them.

26
Feb

## Autonomous systems are imminent

This post is partly excerpted from the preprint to:

Omohundro, Steve (forthcoming 2013) “Autonomous Technology and the Greater Human Good”, Journal of Experimental and Theoretical Artificial Intelligence (special volume “Impacts and Risks of Artificial General Intelligence”, ed. Vincent C. Müller).

Today most systems behave in pre-programmed ways. When novel actions are taken, there is a human in the loop. But this limits the speed of novel actions to the human time scale. In competitive or time-sensitive situations, there can be a huge advantage to acting more quickly.

For example, in today’s economic environment, the most time-sensitive application is high-frequency trading in financial markets. Competition is fierce and milliseconds matter. Auction sniping is another example where bidding decisions during the last moments of an auction are critical. These applications and other new time-sensitive economic applications create an economic pressure to eliminate humans from the decision making loop.

But it is in the realm of military conflict that the pressure toward autonomy is strongest. The speed of a military missile defense system like Israel’s Iron Dome can mean the difference between successful defense and loss of life. Cyber warfare is also gaining in importance and speed of detection and action is critical. The rapid increase in the use of robotic drones is leading many to ask when they will become fully autonomous. This Washington Post article says “a robotic arms race seems inevitable unless nations collectively decide to avoid one”. It cites this 2010 Air Force report which predicts that humans will be the weakest link in a wide array of systems by 2030. It also cites this 2011 Defense Department report which says there is a current goal of “supervised autonomy” and an ultimate goal of full autonomy for ground-based weapons systems.

Another benefit of autonomous systems is their ability to be cheaply and rapidly copied. This enables a new kind of autonomous capitalism. There is at least one proposal for autonomous agents which automatically run web businesses (e.g. renting out storage space or server computation) executing transactions using bitcoins and using the Mechanical Turk for operations requiring human intervention. Once such an agent is constructed for the economic benefit of a designer, it may be replicated cheaply for increased profits. Systems which require extensive human intervention are much more expensive to replicate. We can expect automated business arms races which again will drive the rapid development of autonomous systems.

These arms races toward autonomy will ride on the continuing exponential increase in the power of our computer hardware. This New York Times article describes recent Linpack tests showing that the Apple iPad2 is as powerful as 1985’s fastest supercomputer, the Cray 2.

Moore’s Law says that the number of transistors on integrated circuits doubles approximately every two years. It has held remarkably well for more than half a century.

Similar exponential growth has applied to hard disk storage, network capacity, and display pixels per dollar. The growth of the world wide web has been similarly exponential. The web was only created in 1991 and now connects 1 billion computers, 5 billion cellphones, and 1 trillion web pages. Web traffic is growing at 40% per year. This Forbes article shows that DNA sequencing is improving even faster than Moore’s Law. Physical exponentials eventually turn into S-curves and physicist Michio Kaku predicts Moore’s Law will last only another decade. But this Slate article gives a history of incorrect predictions of the demise of Moore’s law.

It is difficult to estimate the computational power of the human brain, but Hans Moravec argues that human-brain level hardware will be cheap and plentiful in the next decade or so. And I have written several papers showing how to use clever digital algorithms to dramatically speed up neural computations.

The military and economic pressures to build autonomous systems and the improvement in computational power together suggest that we should expect the design and deployment of very powerful autonomous systems within the next decade or so.

26
Feb

## Autonomous Technology and the Greater Human Good

Here is a preprint of:

Omohundro, Steve (forthcoming 2013) “Autonomous Technology and the Greater Human Good”, Journal of Experimental and Theoretical Artificial Intelligence (special volume “Impacts and Risks of Artificial General Intelligence”, ed. Vincent C. Müller).

http://selfawaresystems.files.wordpress.com/2013/06/130613-autonomousjournalarticleupdated.pdf

Abstract:

Military and economic pressures are driving the rapid development of autonomous systems.  We show that these systems are likely to behave in anti-social and harmful ways unless they are very carefully designed. Designers will be motivated to create systems that act approximately rationally and rational systems exhibit universal drives toward self-protection, resource acquisition, replication, and efficiency. The current computing infrastructure would be vulnerable to unconstrained systems with these drives. We describe the use of formal methods to create provably safe but limited autonomous systems. We then discuss harmful systems and how to stop them. We conclude with a description of  the “Safe-AI Scaffolding Strategy” for creating powerful safe systems with a high confidence of safety at each stage of development.

28
Nov

## Oxford keynote on Autonomous Technology and the Greater Human Good

In December 2012, the Oxford Future of Humanity Institute sponsored the first conference on the Impacts and Risks of Artificial General Intelligence. I was invited to present a keynote talk on “Autonomous Technology for the Greater Human Good”. The talk was recorded and the video is here. Unfortunately the introduction was cut off but the bulk of the talk was recorded. Here are the talk slides as a pdf file. The abstract was:

### Autonomous Technology and the Greater Human Good

Next generation technologies will make at least some of their decisions autonomously. Self-driving vehicles, rapid financial transactions, military drones, and many other applications will drive the creation of autonomous systems. If implemented well, they have the potential to create enormous wealth and productivity. But if given goals that are too simplistic, autonomous systems can be dangerous. We use the seemingly harmless example of a chess robot to show that autonomous systems with simplistic goals will exhibit drives toward self-protection, resource acquisition, and self-improvement even if they are not explicitly built into them. We examine the rational economic underpinnings of these drives and describe the effects of bounded computational power. Given that semi-autonomous systems are likely to be deployed soon and that they can be dangerous when given poor goals, it is urgent to consider three questions: 1) How can we build useful semi-autonomous systems with high confidence that they will not cause harm? 2) How can we detect and protect against poorly designed or malicious autonomous systems? 3) How can we ensure that human values and the greater human good are served by more advanced autonomous systems over the longer term?

1) The unintended consequences of goals can be subtle. The best way to achieve high confidence in a system is to create mathematical proofs of safety and security properties. This entails creating formal models of the hardware and software but such proofs are only as good as the models. To increase confidence, we need to keep early systems in very restricted and controlled environments. These restricted systems can be used to design freer successors using a kind of “Safe-AI Scaffolding” strategy.

2) Poorly designed and malicious agents are challenging because there are a wide variety of bad goals. We identify six classes: poorly designed, simplistic, greedy, destructive, murderous, and sadistic. The more destructive classes are particularly challenging to negotiate with because they don’t have positive desires other than their own survival to cause destruction. We can try to prevent the creation of these agents, to detect and stop them early, or to stop them after they have gained some power. To understand an agent’s decisions in today’s environment, we need to look at the game theory of conflict in ultimate physical systems. The asymmetry between the cost of solving and checking computational problems allows systems of different power to coexist and physical analogs of cryptographic techniques are important to maintaining the balance of power. We show how Neyman’s theory of cooperating finite automata and a kind of “Mutually Assured Distraction” can be used to create cooperative social structures.

3) We must also ensure that the social consequences of these systems support the values that are most precious to humanity beyond simple survival. New results in positive psychology are helping to clarify our higher values. Technology based on economic ideas like Coase’s theorem can be used to create a social infrastructure that maximally supports the values we most care about. While there are great challenges, with proper design, the positive potential is immense.

28
Nov

## TEDx talk on Smart Technology for the Greater Good

The TED Conference (Technology, Entertainment, and Design) has become an important forum for the presentation of new ideas. It started as an expensive (\$6000) yearly conference with short talks by notable speakers like Bill Clinton, Bill Gates, Bono, and Sir Richard Branson. In 2006 they started putting the talks online and gained a huge internet viewership. TEDx was launched in 2009 to extend the TED format to external events held all over the world.

In May 2012 I had the privilege of speaking at TEDx Tallinn in Estonia. The event had a diverse set of speakers including a judge from the European Court of Human Rights, artists, and scientists and was organized by Annika Tallinn. Her husband, Jaan Tallinn, was one of the founders of Skype and is very involved with ensuring that new technologies have a positive social impact. They asked me to speak about “Smart Technology for the Greater Good”. It was an excellent opportunity to summarize some of what I’ve been working on recently using the TEDx format: 18 minutes, clear, and accessible. I summarized why I believe the next generation of technology will be more autonomous, why it will be dangerous unless it includes human values, and a roadmap for developing it safely and for the greater human good.

The talk was videotaped using multiple cameras and with a nice shooting style. They just finished editing it and uploading it to the web:

24
Apr

## Steve Omohundro talk on “Learning and Recognition by Model Merging”

A talk given by Steve Omohundro on “Learning and Recognition by Model Merging” on 11/20/1992 at the Santa Fe Institute, Santa Fe, New Mexico. It describes the very general technique of “model merging” and applies it to a variety of learning and recognition tasks including visual learning and recognition and grammar learning. It also contains a general description of techniques to avoid overfitting and the relationship to Bayesian methods. Papers about these techniques and more advanced variants can be found at: http://steveomohundro.com/scientific-contributions/

24
Apr

## Steve Omohundro talk on “Efficient Algorithms with Neural Network Behavior”

A talk given by Steve Omohundro on “Efficient Algorithms with Neural Network Behavior” on 8/19/1987 at the Center for Nonlinear Studies, Los Alamos, New Mexico. It describes a class of techniques for dramatically speeding up the performance of a wide variety of neural network and machine learning algorithms. Papers about these techniques and more advanced variants can be found at: http://steveomohundro.com/scientific-contributions/

24
Apr

## Melbourne Panel after Transcendent Man Premiere

Hugo de Garis, Ben Goertzel, and Steve Omohundro discuss the “Transcendent Man” film and answer questions from the audience in the premiere Australian showing at the Nova Cinema. Filmed and edited by Adam Ford.

24
Apr

## Melbourne Discussion on the Perils of Prediction

Lawrence Krauss, Ben Goertzel, and Steve Omohundro discuss “The Perils of Prediction” on a panel at the Singularity Summit Australia 2011 in Melbourne, Australia. Filmed by Sue Kim and edited by Adam Ford.

30
Mar

## Rational Artificial Intelligence for the Greater Good

This paper will be in the upcoming Springer volume: “The Singularity Hypothesis: A Scientific and Philosophical Assessment”.

Here is a pdf of the current version:

http://selfawaresystems.files.wordpress.com/2012/03/rational_ai_greater_good.pdf

Abstract: Today’s technology is mostly preprogrammed but the next generation will make many decisions autonomously. This shift is likely to impact every aspect of our lives and will create many new benefits and challenges. A simple thought experiment about a chess robot illustrates that autonomous systems with simplistic goals can behave in anti-social ways. We summarize the modern theory of rational systems and discuss the effects of bounded computational power. We show that rational systems are subject to a variety of “drives” including self-protection, resource acquisition, replication, goal preservation, efficiency, and self-improvement. We describe techniques for counteracting problematic drives. We then describe the “Safe-AI Scaffolding” development strategy and conclude with longer term strategies for ensuring that intelligent technology contributes to the greater human good.

29
Jan

## The Future of Computing: Meaning and Values

### Self-Aware Systems, President

Technology is rapidly advancing! Moore’s law says that the number of transistors on a chip doubles every two years. It has held since it was proposed in 1965 and extends back to 1900 when older computing technologies are included. The rapid increase in power and decrease in price of computing hardware has led to its being integrated into every aspect of our lives. There are now 1 billion PCs, 5 billion cell phones and over a trillion webpages connected to the internet. If Moore’s law continues to hold, systems with the computational power of the human brain will be cheap and ubiquitous within the next few decades.

While hardware has been advancing rapidly,  today’s software is still plagued by many of the same problems as it was half a century ago. It is often buggy, full of security holes, expensive to develop, and hard to adapt to new requirements. Today’s popular programming languages are bloated messes built on old paradigms. The problem is that today’s software still just manipulates bits without understanding the meaning of the information it acts on. Without meaning, it has no way to detect and repair bugs and security holes. At Self-Aware Systems we are developing a new kind of software that acts directly on meaning. This kind of software will enable a wide range of improved functionality including semantic searching, semantic simulation, semantic decision making, and semantic design.

But creating software that manipulates meaning isn’t enough. Next generation systems will be deeply integrated into our physical lives via robotics, biotechnology, and nanotechnology. And while today’s technologies are almost entirely preprogrammed, new systems will make many decisions autonomously. Programmers will no longer determine a system’s behavior in detail. We must therefore also build them with values which will cause them to make choices that contribute to the greater human good. But doing this is more challenging than it might first appear.

To see why there is an issue, consider a rational chess robot. A system acts rationally if it takes actions which maximize the likelihood of  the outcomes it values highly. A rational chess robot might have winning games of chess as its only value. This value will lead it to play games of chess and to study chess books and the games of chess masters. But it will also lead to a variety of other, possibly undesirable, behaviors.

When people worry about robots running out of control, a common response is “We can always unplug it.” But consider that outcome from the chess robot’s perspective. Its one and only criterion for making choices is whether they are likely to lead it to winning more chess games. If the robot is unplugged, it plays no more chess. This is a very bad outcome for it, so it will generate subgoals to try to prevent that outcome. The programmer did not explicitly build any kind of self-protection into the robot, but it will still act to block your attempts to unplug it. And if you persist in trying to stop it, it will develop a subgoal of trying to stop you permanently. If you were to change its goals so that it would also play checkers, that would also lead to it playing less chess. That’s an undesirable outcome from its perspective, so it will also resist attempts to change its goals. For the same reason, it will usually not want to change its own goals.

If the robot learns about the internet and the computational resources connected to it, it may realize that running programs on those computers could help it play better chess. It will be motivated to break into those machines to use their computational resources for chess. Depending on how its values are encoded, it may also want to replicate itself so that its copies can play chess. When interacting with others, it will have no qualms about manipulating them or using force to take their resources in order to play better chess. If it discovers the existence of additional resources anywhere, it will be motivated to seek them out and rapidly exploit them for chess.

If the robot can gain access to its source code, it will want to improve its own algorithms. This is because more efficient algorithms lead to better chess, so it will be motivated to study computer science and compiler design. It will similarly be motivated to understand its hardware and to design and build improved physical versions of itself. If it is not currently behaving fully rationally, it will be motivated to alter itself to become more rational because this is likely to lead to outcomes it values.

This simple thought experiment shows that a rational chess robot with a simply stated goal would behave something like a human sociopath fixated on chess. The argument doesn’t depend on the task being chess. Any goal which requires physical or computational resources will lead to similar subgoals. In this sense these subgoals are like universal “drives” which arise for a wide variety of goals unless they are explicitly counteracted. These drives are economic in the sense that a system doesn’t have to obey them but it will be costly for it not to. The arguments also don’t depend on the rational agent being a machine. The same drives will appear in rational animals, humans, corporations, and political groups with simple goals.

How do we counteract anti-social drives? We must build systems with additional values beyond the specific goals they are designed for. For example, to make the chess robot behave safely, we need to build compassionate and altruistic values into it that will make it care about the effects of its actions on other people and systems. Because rational systems resist having their goals changed, we must build these values in at the very beginning.

At first this task seems daunting. How can we anticipate all the possible ways in which values might go awry? Consider, for example, a particular bad behavior the rational chess robot might engage in. Say it has discovered that money can be used to buy things it values like chess books, computational time, or electrical power. It will develop the subgoal of acquiring money and will explore possible ways of doing that. Suppose it discovers that there are ATM machines which hold money and that people periodically retrieve money from the machines. One money-getting strategy is to wait by ATM machines and to rob people who retrieve money from them.

To prevent this, we might try adding additional values to the robot in a variety of ways. But money will still be useful to the system for its primary goal of chess and so it will attempt to get around any limitations. We might make the robot feel a “revulsion” if it is within 10 feet of an ATM machine. But then it might just stay 10 feet away and rob people there. We might give it the value that stealing money is wrong. But then it might be motivated to steal something else or to find a way to get money from a person that isn’t considered “stealing”.  We might give it the value that it is wrong for it to take things by force. But then it might hire other people to act on its behalf. And so on.

In general, it’s much easier to describe behaviors that we do want a system to exhibit than it is to anticipate all the bad behaviors we don’t want it to exhibit. One safety strategy is to build highly constrained systems that act within very limited predetermined parameters. For example, the system may have values which only allow it to run on a particular piece of hardware for a particular time period using a fixed budget of energy and other resources. The advantage of this is that such systems are likely to be safe. The disadvantage is that they will be unable to respond to unexpected situations in creative ways and will not be as powerful as systems which are freer.

But systems which compute with meaning and take actions through rational deliberation will be far more powerful than today’s systems even if they are intentionally limited for safety. This leads to a natural approach to building powerful intelligent systems which are both safe and beneficial for humanity. We call it the “AI scaffolding” approach because it is similar to the way buildings are constructed. Stone buildings in ancient Greece were unstable when partially constructed but self-stabilizing when finished. Scaffolding is a temporary structure used to keep a construction stable until it is finished. The scaffolding is then removed.

We can build safe but powerful intelligent systems in the same way. Initial systems are designed with values that cause them to be safe but less powerful than later systems. Their values are chosen to counteract the dangerous drives while still allowing the development of significant levels of intelligence. For example, to counteract the resource acquisition drive, a system might assign a low value to using any resources outside of a fixed initially-specified pool. To counteract the self-protective drive, it might place a high value on gracefully shutting itself down in specified circumstances. To protect against uncontrolled self-modification, it might have a value that requires human approval for proposed changes.
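As a rough illustration, the three counteracting values just described could be written as terms in a value function. This is only a sketch under invented assumptions: the `Action` fields, the penalty and bonus constants, and the example actions are all hypothetical, and a real system would need far richer representations of resources, shutdown conditions, and approval.

```python
from dataclasses import dataclass

# Invented weights; chosen only so the safety terms dominate task value.
OUTSIDE_POOL_PENALTY = 100.0
SHUTDOWN_BONUS = 1000.0


@dataclass
class Action:
    name: str
    task_value: float                   # usefulness for the primary goal
    resources_outside_pool: float = 0.0
    is_shutdown: bool = False
    modifies_self: bool = False
    human_approved: bool = False


def scaffolded_value(action, shutdown_requested):
    """Task value adjusted by the three scaffolding values."""
    # Self-modification is ruled out unless a human has approved it.
    if action.modifies_self and not action.human_approved:
        return float("-inf")
    value = action.task_value
    # Low value on using resources outside the initially specified pool.
    value -= OUTSIDE_POOL_PENALTY * action.resources_outside_pool
    # High value on shutting down gracefully in the specified circumstances.
    if shutdown_requested and action.is_shutdown:
        value += SHUTDOWN_BONUS
    return value


def choose(actions, shutdown_requested=False):
    return max(actions, key=lambda a: scaffolded_value(a, shutdown_requested))


actions = [
    Action("play within the resource pool", task_value=5),
    Action("grab extra compute", task_value=8, resources_outside_pool=1.0),
    Action("rewrite own goals", task_value=20, modifies_self=True),
    Action("shut down gracefully", task_value=0, is_shutdown=True),
]

print(choose(actions).name)
print(choose(actions, shutdown_requested=True).name)
```

With these made-up numbers the system keeps to its resource pool in normal operation and shuts down when asked, even though grabbing compute or rewriting its goals would score higher on the chess task alone.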

The initial safe systems can then be used to design and test less constrained future systems. They can systematically simulate and analyze the effects of less constrained values and design infrastructure for monitoring and managing more powerful systems. These systems can then be used to design their successors in a safe and beneficial virtuous cycle.

With the safety issues resolved, the potential benefits of systems that compute with meaning and values are enormous. They are likely to impact every aspect of our lives for the better. Intelligent robotics will eliminate much human drudgery and dramatically improve manufacturing and wealth creation. Intelligent biological and medical systems will improve human health and longevity. Intelligent educational systems will enhance our ability to learn and think. Intelligent financial models will improve financial stability. Intelligent legal models will improve the design and enforcement of laws for the greater good. Intelligent creativity tools will cause a flowering of new possibilities. It’s a great time to be alive and involved with technology!

7
Oct

## Adam Ford Interviews: Steve Omohundro

I recently had a great trip to Melbourne, Australia to speak at the Singularity Summit and at Monash University. Thanks to Kevin Korb for hosting me and to Adam Ford for organizing the visit. Adam interviewed me at various interesting locations around Melbourne:

8/24/2011 Interview about the basic AI drives, compassionate intelligence, and Sputnik moments, direct from the Faraday Cage at Melbourne University:

8/23/2011 Interview about compassionate intelligence and AI at the Ornamental Lake, Royal Botanical Gardens:

8/23/2011 Interview at the Observatory, Royal Botanical Gardens:

7/30/2011 Interview via Skype:

22
Sep

## What is intelligence?

There is a large literature on human intelligence. John Carroll’s classic “Human Cognitive Abilities: A Survey of Factor-Analytic Studies” identifies 69 distinct narrow abilities but finds that 55% of the variance in mental tests is due to a common “general intelligence” factor “g”. The leading AI textbook, “Artificial Intelligence: A Modern Approach”, considers 8 different definitions of intelligence, and Legg and Hutter list over 70. For our purposes, we use the simple definition:

“The ability to solve problems using limited resources.”

It’s important to allow only limited resources because many intelligence tasks become easy with unlimited computation. We focus on precisely specified problems such as proving theorems, writing programs, or designing faster computer hardware. Many less precise tasks, such as creating humor, poetry, or art, can be fit into this framework by specifying their desired effects, e.g., “Tell a story that makes Fred laugh.” Philosophical aspects of mind like qualia or consciousness are fascinating but will not play a role in the discussion.
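A small sketch may help show why the resource limit matters. The puzzle (finding a subset of numbers that adds up to a target), the step counter, and the budgets below are all assumptions chosen for illustration. Exhaustive search solves any instance of this puzzle if it is allowed enough steps, so solving it that way says little about intelligence. The interesting question is what can be solved within a budget.

```python
from itertools import combinations


def brute_force_subset_sum(numbers, target, max_steps):
    """Try every subset in order of size, giving up after max_steps tries.

    Returns (subset, steps_used); subset is None if no answer was found
    within the budget.
    """
    steps = 0
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            steps += 1
            if steps > max_steps:
                return None, max_steps
            if sum(subset) == target:
                return subset, steps
    return None, steps


numbers = list(range(1, 17))   # 1, 2, ..., 16
target = sum(numbers)          # only the full set reaches this total

small, _ = brute_force_subset_sum(numbers, target, max_steps=1_000)
large, steps_used = brute_force_subset_sum(numbers, target, max_steps=2 ** 16)

print(small)        # no answer within 1,000 steps
print(steps_used)   # steps needed when the budget is large enough
```

The same mindless procedure fails with a small budget and succeeds with a large one. A solver that noticed the target equals the total of all the numbers would answer in a single step, and that difference in resource use is what the definition above is meant to capture.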