
From test@demedici.ssec.wisc.edu Mon May 12 05:19:26 2003

Date: Sun, 11 May 2003 18:14:37 -0500 (CDT)

From: Bill Hibbard <test@demedici.ssec.wisc.edu>

Reply-To: agi@v2.listbox.com

To: sl4@sl4.org, agi@v2.listbox.com

Subject: [agi] Re: SIAI's flawed friendliness analysis

On Sat, 10 May 2003, Eliezer S. Yudkowsky wrote:

> Bill Hibbard wrote:

> > This critique refers to the following documents:

> >

> > GUIDELINES: http://www.singinst.org/friendly/guidelines.html

> > FEATURES: http://www.singinst.org/friendly/features.html

> > CFAI: http://www.singinst.org/CFAI/index.html

> >

> > 1. The SIAI analysis fails to recognize the importance of

> > the political process in creating safe AI.

> >

> > This is a fundamental error in the SIAI analysis. CFAI 4.2.1

> > says "If an effort to get Congress to enforce any set of

> > regulations were launched, I would expect the final set of

> > regulations adopted to be completely unworkable." It further

> > says that government regulation of AI is unnecessary because

> > "The existing force tending to ensure Friendliness is that

> > the most advanced projects will have the brightest AI

> > researchers, who are most likely to be able to handle the

> > problems of Friendly AI." History vividly teaches the danger

> > of trusting the good intentions of individuals.

>

> ...and, of course, the good intentions and competence of governments.

Absolutely. I never claimed that AI safety is a sure thing.

But without a broad political movement for safe AI and its

success in elective democratic government, unsafe AI is a

sure thing.

> . . .

> Your political recommendations appear to be based on an extremely

> different model of AI. Specifically:

>

> 1) "AIs" are just very powerful tools that amplify the short-term goals

> of their users, like any other technology.

I never said "short-term" - you are putting words into my mouth.

A key property of intelligence is understanding the long-term

effects of behavior on satisfying values (goals).

> 2) AIs have power proportional to the computing resources invested in

> them, and everyone has access to pretty much the same theoretical model

> and class of AI.

AI power does depend on computing resources and efficiency of

algorithms. Important algorithms have proved impossible to keep

secret for any length of time. Whether or not this continues in

the future, essential algorithms for intelligence will not be

secret from powerful organizations.

> 3) There is no seed AI, no rapid recursive self-improvement, no hard

> takeoff, no "first" AI. AIs are just new forces in existing society,

> coming into play a bit at a time, as everyone's AI technology improves at

> roughly the same rate.

Where did this come from? I am very clear in my book about the

importance of proper training for young AIs, and the issues

involved in AI evolution.

> 4) Anyone can make an AI that does anything. AI morality is an easy

> problem with fully specifiable arbitrary solutions that are reliable and

> humanly comprehensible.

I never said building safe AI is easy.

> 5) Government workers can look at an AI design and tell what the AI's

> morality does and whether it's safe.

We certainly expect government workers to regulate nuclear energy

designs and operation to ensure their safety. And because of

their doubts about safety, people in the US have decided through

their democratic political process to stop building new nuclear

energy plants.

Nowhere do I claim that regulation of safe AI will be simple.

But if we don't have government workers implementing regulation

of AI under democratic political control, then we will have

unsafe AIs.

> 6) There are variables whose different values correlate to socially

> important differences in outcomes, such that government workers can

> understand the variables and their correlation to the outcomes, and such

> that society expects to have a conflict of interest with individuals or

> organizations as to the values of those variables, with the value to

> society of this conflict of interest exceeding the value to society of the

> outcome differentials that depend on the greater competence of those

> individuals or organizations. Otherwise there's nothing worth voting on.

There will be organizations with motives to build AIs

with values that will correlate with important social

differences in outcome. AIs with values to maximize

profits may end up empowering their owners at everyone

else's expense. AIs with values for military victory

may end up killing lots of people.

> I disagree with all six points, due to a different model of AI.

I think we do have different models of AI. I think an AI

is an information process that has some values that it

tries to satisfy (positive values) and avoid (negative

values). It does this via reinforcement learning and a

simulation model of the world that it uses to solve the

credit assignment problem (i.e., to understand the long

term consequences of its behaviors on its values). Of

course, actually doing this in general circumstances is

very difficult, requiring pattern recognition to greatly

reduce the volume of sensory information, and the

equivalent to human conscious thought to reflect on

situations and find analogies.
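
To make this concrete, here is a minimal sketch in Python of

the kind of agent I have in mind. Everything in it (the names

and structure) is hypothetical illustration, not code from my

book or from the SIAI documents: values assign rewards to

states, reinforcement learning records experience, and a

learned simulation model of the world is rolled forward to

estimate the long-term consequences of actions on those

values, i.e. to solve the credit assignment problem.

import random
from collections import defaultdict

class ValueDrivenAgent:
    def __init__(self, actions, value_fn, horizon=5, discount=0.9):
        self.actions = actions      # available behaviors
        self.value_fn = value_fn    # the agent's values: state -> reward
        self.horizon = horizon      # how far the simulation looks ahead
        self.discount = discount    # weight on long-term consequences
        # learned world model: observed (state, action) -> next-state counts
        self.model = defaultdict(lambda: defaultdict(int))

    def learn(self, state, action, next_state):
        # reinforcement learning step: record an observed transition
        self.model[(state, action)][next_state] += 1

    def simulate(self, state, action):
        # credit assignment: estimate the long-term value of an action
        # by rolling the learned world model forward
        total, weight = 0.0, 1.0
        for _ in range(self.horizon):
            outcomes = self.model[(state, action)]
            if not outcomes:
                break               # no experience yet; stop the rollout
            next_states = list(outcomes)
            counts = [outcomes[s] for s in next_states]
            state = random.choices(next_states, weights=counts)[0]
            total += weight * self.value_fn(state)
            weight *= self.discount
            action = random.choice(self.actions)   # crude rollout policy
        return total

    def act(self, state):
        # choose the behavior whose simulated consequences best
        # satisfy the values
        return max(self.actions, key=lambda a: self.simulate(state, a))

A real AI would need pattern recognition and reflective

thought in place of these toy tables, but the division of

labor is the point: the values are one component, and

learning and reasoning are machinery for satisfying them.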

The SIAI guidelines involve digging into the AI's

reflective thought process and controlling the AI's

thoughts, in order to ensure safety. My book says the

only concern for AI learning and reasoning is to ensure that

they are accurate, and that the teachers of young AIs

be well-adjusted people (subject to public monitoring

and the same kind of screening used for people who

control major weapons). Beyond that, the proper domain

for ensuring AI safety is the AI's values rather than

the AI's reflective thought processes.

In my second and third points I described the lack of

rigorous standards for certain terms in the SIAI

Guidelines and for initial AI values. Those rigorous

standards can only come from the AI's values. I think

that in your AI model you feel the need to control how

they are derived via the AI's reflective thought

process. This is the wrong domain for addressing AI

safety.

Clear and unambiguous initial values are elaborated

in the learning process, forming connections via the

AI's simulation model with many other values. Human

babies love their mothers based on simple values about

touch, warmth, milk, smiles and sounds (happy Mother's

Day). But as the baby's mind learns, those simple

values get connected to a rich set of values about the

mother, via a simulation model of the mother and

surroundings. This elaboration of simple values will

happen in any truly intelligent AI.

I think initial AI values should be for simple

measures of human happiness. As the AI develops these

will be elaborated into a model of long-term human

happiness, and connected to many derived values about

what makes humans happy generally and particularly.

The subtle point is that this links AI values with

human values, and enables AI values to evolve as human

values evolve. We do see a gradual evolution of human

values, and the singularity will accelerate it.
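
As a hypothetical illustration (again only a sketch, reusing

the toy agent above), the initial value can be a simple,

directly measured happiness signal; the elaboration into

long-term happiness then comes from the learned simulation

model rather than from any change to the values themselves:

def simple_happiness(state):
    # state is a hypothetical (situation, happiness_signal) pair;
    # the initial value is just the immediate happiness signal
    _, happiness_signal = state
    return happiness_signal

agent = ValueDrivenAgent(actions=["help", "wait"],
                         value_fn=simple_happiness)

# Early on the agent can only respond to immediate happiness.
# After many agent.learn(...) calls, agent.act(state) chooses
# behaviors by their simulated long-term effect on the same
# happiness values: the elaboration happens in the world model.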

Morality has its roots in values, especially social

values for shared interests. Complex moral systems

are elaborations of such values via learning and

reasoning. The right place to control an AI's moral

system is in its values. All we can do for an AI's

learning and reasoning is make sure they are accurate

and efficient.

> . . .

> > 3. CFAI defines "friendliness" in a way that can only

> > be determined by an AI after it has developed super-

> > intelligence, and fails to define rigorous standards

> > for the values that guide its learning until it reaches

> > super-intelligence

> >

> > The actual definition of "friendliness" in CFAI 3.4.4

> > requires the AI to know most humans sufficiently well

> > to decompose their minds into "panhuman", "gaussian" and

> > "personality" layers, and to "converge to normative

> > altruism" based on collective content of the "panhuman"

> > and "gaussian" layers. This will require the development

> > of super-intelligence over a large amount of learning.

> > The definition of friendliness values to reinforce that

> > learning is left to "programmers". As in the previous

> > point, this will allow wealthy organizations to define

> > initial learning values for their AIs as they like.

>

> I don't believe a young Friendly AI should be meddling in the real world

> at all. If for some reason this becomes necessary, it might as well do

> what the programmer says, maybe with its own humane veto. I'd trust a

> programmer more than I'd trust an infant Friendly AI, because regardless

> of its long-term purpose, during infancy the FAI is likely to have neither

> a better approximation to humaneness, nor a better understanding of the

> real world.

I agree that young AIs should have limited access to senses

and actions. But in order to "converge to normative altruism"

based on collective content of the "panhuman" and "gaussian"

layers as described in CFAI 3.4.4, the AI is going to need

access to large numbers of humans.

> . . .

> > 4. The CFAI analysis is based on a Bayesian reasoning

> > model of intelligence, which is not a sufficient model

> > for producing intelligence.

> >

> > While Bayesian reasoning has an important role in

> > intelligence, it is not sufficient. Sensory experience

> > and reinforcement learning are fundamental to

> > intelligence. Just as symbols must be grounded in

> > sensory experience, reasoning must be grounded in

> > learning and emerges from it because of the need to

> > solve the credit assignment problem, as discussed at:

> >

> > http://www.mail-archive.com/agi@v2.listbox.com/msg00390.html

>

> Non-Bayesian? I don't think you're going to find much backing on this

> one. If you've really discovered a non-Bayesian form of reasoning, write

> it up and collect your everlasting fame. Personally I consider such a

> thing almost exactly analogous to a perpetual motion machine. Except that

> a perpetual motion machine is merely physically impossible, while

> "non-Bayesian reasoning" appears to be mathematically impossible. Though

> of course I could be wrong.

I never said "Non-Bayesian", although I find Pei's and Ben's

examples of Non-Bayesian logic in their systems interesting.

What I really meant by my fourth point is that because your

model of intelligence is incomplete, there are things in your

model of friendliness that really belong in your model of

intelligence. For example, recommendation 5 from GUIDELINES 3

"requires that the AI model the causal process that led to

the AI's creation and that the AI use its existing cognitive

complexity (or programmer assistance) to make judgements

about the validity or invalidity of factors in that causal

process." Any sufficiently intelligent brain will have a

simulation model of the world that includes the events that

led to its creation, and will make value judgements about

those events. The failure to do so would be a failure of

intelligence rather than a failure of safety.

I think this confusion between model of intelligence and

model of safety leads to the difficulty of finding rigorous

standards for terms described in my second point, and the

difficulty of finding initial values described in my third

point.

> Reinforcement learning emerges from Bayesian reasoning, not the other way

> around. Sensory experience likewise.

>

> For more about Bayesian reasoning, see:

> http://yudkowsky.net/bayes/bayes.html

> http://bayes.wustl.edu/etj/science.pdf.html

>

> Reinforcement, specifically, emerges in a Bayesian decision system:

> http://singinst.org/CFAI/design/clean.html#reinforcement

This describes a Bayesian mechanism for reinforcement learning,

but does not show that reinforcement learning emerges from

Bayesian reasoning. In fact, learning precedes reasoning in

brain evolution. Reasoning (i.e., a simulation model of the

world) evolved to solve the credit assignment problem of

learning.
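
To make the distinction concrete, here is a standard textbook

example (my own sketch in Python, not taken from CFAI) of a

Bayesian mechanism used inside a reinforcement learner: the

learner keeps a Beta posterior over each action's probability

of producing reward, samples from those posteriors to choose

an action, and updates the chosen posterior from the reward

it actually receives. The Bayesian updating is the mechanism;

the loop of acting, observing reward and updating is what

makes it reinforcement learning.

import random

class BayesianBandit:
    def __init__(self, n_actions):
        # Beta(1, 1) priors: (successes + 1, failures + 1) per action
        self.alpha = [1] * n_actions
        self.beta = [1] * n_actions

    def act(self):
        # sample a reward probability from each posterior, pick the best
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return samples.index(max(samples))

    def reinforce(self, action, reward):
        # Bayesian update of the chosen action's posterior
        if reward:
            self.alpha[action] += 1
        else:
            self.beta[action] += 1

# the reinforcement learning loop: act, observe reward, update
bandit = BayesianBandit(n_actions=3)
hidden_reward_probs = [0.2, 0.5, 0.8]   # unknown to the learner
for _ in range(1000):
    a = bandit.act()
    r = random.random() < hidden_reward_probs[a]
    bandit.reinforce(a, r)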

----------------------------------------------------------

Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI 53706

test@demedici.ssec.wisc.edu 608-263-4427 fax: 608-263-6738

http://www.ssec.wisc.edu/~billh/vis.html
