The risk and safety of AI

Recently, the "risk of AI" has become a hot topic that has received high media attention, partly triggered by the Future of Life Institute's open letter on Research Priorities for Robust and Beneficial Artificial Intelligence. Many different opinions have been proposed, and there is no consensus in sight.

Besides the complexity of the topic itself, this messy situation is largely due to the following factors:
  • The open letter has been widely interpreted as “Prominent Scientists Sign Letter of Warning about AI Risks”, while the intention of the letter was actually to justify and to promote AI research [1].
  • In this discussion, the term “AI” has been used with very different meanings by different groups.
Originally, AI started as an attempt to build “thinking machines” comparable to the human mind. However, the problem turned out to be very difficult, and the projects directly targeting it all failed (with the “Fifth Generation Computer Systems” project as the best-known example). Consequently, decades ago mainstream AI turned to the study of domain-specific problem-solving methods, which in principle are not different from those used in conventional computer systems. Here the label “AI” is used merely to indicate that the problems are ones that can be solved by the human mind.

Under this interpretation of “AI”, the risks of the technique are not that different from those of conventional computer systems: whether the design and implementation match the specification, whether the system can avoid abuse and misuse, etc. [1]. With the recent progress in the field, especially in machine learning, many AI researchers believe that more and more problems can be solved by extending and combining the current theories and methods, so as to approach “human-level intelligence”. Since such a system’s behaviors are still determined by its design, they will remain under control as usual. This is the case even for learning systems, since their learning processes are specified by predetermined algorithms and guided by training data, given reward signals, and/or implanted primary motivations. So the responsibility of making “beneficial AI” falls on the shoulders of the system designers, and the task is considered difficult, but not impossible [1].

However, this is not the “AI” in the minds of people outside the field. To some of them, AI, in its ultimate form, will be omniscient and omnipotent, and therefore harmful to human beings, or at least uncontrollable and unpredictable. To prevent such a disaster, there are already demands that AI researchers prove the safety of their systems before actually building them.

Given that this ultimate form of AI, often called “Strong AI”, “Human-level AI”, or “Artificial General Intelligence (AGI)”, has been studied by only a small number of researchers, and the systems developed so far are still immature, at the current moment no one can really prove what such a system, if it can be built, will or will not do. However, there is at least a third possibility that is fundamentally different from both the “business-as-usual” and the “Frankenstein” scenarios of future AI. Such a version of AI has been presented in [2] (as well as in this book), and some other researchers have expressed similar ideas.

According to this opinion, “intelligence” is taken to be the ability of a system to adapt to its environment while working with insufficient knowledge and resources. A computer system with this ability works in a way that is fundamentally different from traditional systems built to carry out “computations”, i.e., to solve a problem in a predetermined way. An adaptive system’s behaviors are determined both by its nature (i.e., initial design) and by its nurture (i.e., postnatal experience). Though it is still possible to give the system certain innate beliefs and motivations, they will not fully determine the system’s behaviors. When facing a problem, the system’s response is mostly decided by beliefs and motivations that have evolved from the innate ones and been shaped by its “personal” experience. No matter how powerful it becomes, the system will always be restricted by the available knowledge and resources, so it can guarantee neither the absolute correctness nor the optimality of its solutions, nor can it accurately foresee all consequences of its actions.

However, this does not mean that such a system behaves randomly and is therefore useless to us. Being adaptive means behaving according to experience, and such a system can be useful in situations where its behaviors cannot be predetermined by its designer. This model is also closer to the reality of the human mind, so it has important theoretical value that the existing AI models do not provide.

Since mainstream AI techniques are basically about repeatable (algorithmic) problem-solving processes, they do not naturally cover the design of adaptive systems. Even the existing “machine learning” techniques mostly address special types of learning, where a system’s behaviors change during the training process but eventually converge to a stable input-output function, while an adaptive process does not necessarily converge to such a function.
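The contrast drawn above can be illustrated with a toy sketch (hypothetical, not from the text): an online learner whose learning rate decays to zero converges to a fixed estimate and thereafter ignores new experience, while one with a constant learning rate keeps tracking its environment as the environment drifts, never settling into a final input-output function.

```python
import random

def train(samples, lr_schedule):
    """Online estimate of a value from a stream of observations."""
    w = 0.0
    for t, x in enumerate(samples, start=1):
        w += lr_schedule(t) * (x - w)  # move the estimate toward the sample
    return w

random.seed(0)
# A non-stationary stream: the "environment" drifts from mean 0 to mean 5.
stream = ([random.gauss(0, 1) for _ in range(500)]
          + [random.gauss(5, 1) for _ in range(500)])

# Converging learner: the decaying rate 1/t computes the overall sample
# mean, so its estimate stabilizes and late experience barely matters.
w_conv = train(stream, lambda t: 1 / t)

# Adaptive learner: a constant rate weights recent experience most
# heavily, so its estimate follows the drifted environment.
w_adapt = train(stream, lambda t: 0.1)

print(w_conv)   # near 2.5, the average over the whole history
print(w_adapt)  # near 5, the current environment
```

The point of the sketch is only the qualitative difference: under a drifting environment the converging learner's answer is frozen by its training history, while the adaptive learner's answer continues to change with experience.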

One direct implication is that the behaviors of an adaptive system cannot be determined solely by its design, just as it is impossible to tell for sure whether an ordinary human baby will grow into a nice person or a mean one, since that will be mostly determined by the baby’s future experience. At the same time, it would be incorrect to take this uncertainty as complete unpredictability.
As far as the issue of “AI Safety” is concerned, the opinions to be argued here are:
  • A truly intelligent system will be an adaptive system, whose behaviors are determined both by its design and by its experience, but not by either aspect alone. Therefore, it is impossible to “design a beneficial AI”, since no matter what innate beliefs and motivations are implanted into the system, they cannot fully constrain the system’s behaviors, so long as the system is truly intelligent. This situation cannot be changed by adding a training phase into the design process.
  • Like those of human beings, the beliefs and motivations of an adaptive system will be mostly learned rather than innate. So the correct question to ask is “how to raise a beneficial AI?” The education of adaptive AIs will resemble that of human beings in that it will be lifelong, incremental, open-ended, and will not follow any fixed procedure. Even when an AI has processing power greatly exceeding that of the human brain, we can still influence, though not completely control, its behavior via its experience. It is too early to speculate about the details before the design becomes mature; we cannot know for sure how to educate an adaptive system before we even know how to build it.
Since scientific research is always an exploration into unknown territory, we cannot foresee all consequences of AI research. We should indeed proceed carefully, but given its great potential to bring significant benefits (both in theory and in application), the study of AI should not be slowed down or stopped by panic born of misconception or ignorance.

[1] Stuart Russell, et al., “Research priorities for robust and beneficial artificial intelligence”, attached document of the FLI Open Letter.

[2] Pei Wang, Rigid Flexibility: The Logic of Intelligence, Springer, 2006.
