
Date: Mon, 26 May 2003 16:43:42 -0500 (CDT)

From: Bill Hibbard <test@demedici.ssec.wisc.edu>

Reply-To: sl4@sl4.org

To: sl4@sl4.org

Subject: RE: SIAI's flawed friendliness analysis

This discussion has split into many threads, and I'll bring

them together into this single response. Ben's comments are

a good starting point for this, and I'll address all the

recent questions.

On Fri, 23 May 2003, Ben Goertzel wrote:

> There are a lot of good points and interesting issues mixed up here, but I

> think the most key point is the division between

>

> -- those who believe a hard takeoff is reasonably likely, based on a radical

> insight in AI design coupled with a favorable trajectory of self-improvement

> of a particular AI system

>

> -- those who believe in a soft takeoff, in which true AI is approached

> gradually [in which case government regulation, careful peer review and so

> forth are potentially relevant]

>

> The soft takeoff brings with it many obvious possibilities for safeguarding,

> which are not offered in the hard takeoff scenario. These possibilities are

> the ones Bill Hibbard is exploring, I think. A lot of what SIAI is saying

> is more relevant to the hard takeoff scenario, on the other hand.

>

> My own projection is a semi-hard takeoff, which doesn't really bring much

> reassurance...

I think we'll eventually get to a time (the singularity) when

intelligence increases very quickly to very high levels. But I

think it will take a long time to get there, during a sort of

soft takeoff. In particular it will be years or even decades

from the first intelligent machines until the true singularity,

and it could be decades from now until the first intelligent

machines. I agree with Donald Norman that people tend to

overestimate the short-term progress of technological change,

and underestimate the long-term effects.

I think real intelligence is decades away because no current

research is making any real progress on the grounding problem,

which is the problem of grounding symbols in sensory

experience and grounding reasoning and planning in learning.

That is, you cannot reason intelligently about horses unless

the word "horse" is connected to sight, sound, smell and touch

experiences with horses. Solving the grounding problem will

require much faster computers than are being used for current

AI research.
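As a rough illustration of what I mean by grounding (a toy sketch
of my own, not any existing system), contrast a symbol defined only
by other symbols with one tied to representations learned from
sensory experience:

from dataclasses import dataclass, field

@dataclass
class UngroundedSymbol:
    # Defined only in terms of other symbols -- a closed dictionary loop.
    name: str
    related_symbols: list = field(default_factory=list)

@dataclass
class GroundedSymbol:
    # Tied to representations learned from raw sensory experience, so
    # reasoning about the symbol can be checked against that experience.
    name: str
    visual_features: list = field(default_factory=list)
    auditory_features: list = field(default_factory=list)
    tactile_features: list = field(default_factory=list)

# "horse" as a chain of other words, versus "horse" as remembered
# sights, sounds and touches (the feature numbers are arbitrary toys).
horse_ungrounded = UngroundedSymbol("horse", ["animal", "mammal", "rideable"])
horse_grounded = GroundedSymbol("horse",
                                visual_features=[0.12, 0.87, 0.33],
                                auditory_features=[0.05, 0.61],
                                tactile_features=[0.44])

The hard part, of course, is learning those sensory representations
in the first place, which is where the computing power goes.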

I think there will be years or decades from the first real

machine intelligence until the singularity because of the

likelihood of difficult technical problems even after the

first signs of machine intelligence, and because of the

amount of learning needed for intelligence to achieve its true

potential. Applying intelligence effectively (we might call

this wisdom) requires many fine value judgements that can

only be learned from experience. Humans require decades of

learning for their intelligence to mature. A super-intelligent

machine may learn faster, but it may also need a lot more

experience for its super-intelligence to mature (just as

higher animals generally take longer to mature than lower

animals).

There is some chance that the first intelligent machines will

be hidden from the public. But probably not for long, because

they will be built in a wealthy and open society like the U.S.,

with lots of whistle blowers and where exciting news has a way

of getting out. Furthermore, a machine designed as I advocate,

with values for human happiness, or a machine designed as the

SIAI advocates, with a friendliness super-goal, would create

the singularity openly rather than hiding it from humans. It

is hard to imagine a safe singularity created in secret.

There are three broad public policy choices for AI:

1. Prohibit it, as advocated by Bill Joy in his April 2000 Wired

article "Why the Future Doesn't Need Us".

2. Allow it without regulation, as advocated by the SIAI and

most members of the SL4 mailing list.

3. Allow it but regulate it, as I advocate.

I think prohibiting AI is technically impossible and politically

unlikely, and unregulated AI is politically impossible and will

almost certainly be unsafe for humans. So we have no alternative

but to find our way through the difficulties of regulating AI.

In more detail:

1. Prohibit AI.

In his article, Bill Joy is pessimistic about prohibiting AI

because people will want the benefits. It will be politically

difficult to decide the right point to stop a technology whose

development continually creates wealth and relieves people of

the need to work.

As several people have pointed out, it will be technically

impossible to prevent people from building outlaw AIs,

especially as technology matures. The only way to do it

would be to stop technological progress worldwide, which

won't happen.

2. Allow AI without regulation.

Ben's question about timing is relevant here. If you think

that the singularity will happen so quickly that the public

and the government won't have time to act to control the

singularity once they realize that machines are becoming

intelligent, then you don't have to worry about regulation

because it will be too late.

If the public and the government have enough time to react,

they will. People have been well primed for the dangers of

AI by science fiction books and movies. When machines start

surprising them with their intelligence, many people will be

frightened and then politicians will get excited. They will

be no more likely to allow unregulated AI than they are to

allow unregulated nuclear power. The only question is whether

they will try to prohibit or regulate AI.

Wealthy and powerful institutions will have motives to build

unsafe AIs. Even generally well-meaning institutions may

fatally compromise safety for mildly selfish motives. Without

broad public insistence on aggressive safety regulation, one

of these unsafe AIs will likely be the seed for the

singularity.

3. Allow AI with regulation.

Ben's question about timing is relevant here too. The need

and political drive for regulation won't be serious until

machines start exhibiting real intelligence, and that is

decades away. Even if you disagree about the timing, it is

still true that regulation won't interfere with current

research until some project achieves an AI breakthrough. At

the current stage of development, with lots of experiments

but nothing approaching real intelligence, regulation would

be counter-productive.

Like so many things in politics, regulation is the best

choice among a set of bad alternatives. Here is a list of

objections, with my answers:

a. Regulation cannot work because no one can understand my

designs. Government employees are too stupid to understand

designs.

Government employees include lots of very smart people, like

those who worked on the Manhattan Project and those who are

finding cures for diseases. While it is healthy for citizens

to be skeptical of politicians and government, thinking that

all politicians and government employees are stupid is just

an ignorant prejudice.

The regulators will understand designs because the burden

will be on the designers to satisfy regulators (many of whom

will be very smart) of the safety of their designs, as with

any dangerous technology.

Even if some smart designers don't want to cooperate with

regulators, other designers just as smart will cooperate.

b. Regulation will hobble cooperating projects, enabling

non-cooperating unsafe AI projects to create the singularity

first.

Non-cooperating projects will be hobbled by the need to hide

their resource use (large computers, smart designers, network

access, etc.).

As long as regulation is aggressively enforced, major

corporations and government agencies will cooperate and

bring their huge resources to the effort for safe AI.

The government will have access to very smart people who can

help more than hinder the designers they are inspecting.

Given the importance of AI, it is plausible that the U.S.

government itself will create a project like the Manhattan

Project for developing safe AI, with resources way beyond

those available to non-cooperating groups. Currently, the

U.S. GDP is about $10 trillion, the federal government

budget is about $2.3 trillion, the defense budget is $0.4

trillion, and global spending on information technology is

$3 trillion. When the public sees intelligent machines and

starts asking their elected representatives to do something

about it, and those representatives hear from experts

about the dangers of the singularity, it is easy to imagine

a federal safe AI project with a budget on the scale of

these numbers.

c. A non-cooperating project may destroy the world by

using AI to create a nano-technology "grey goo" attack.

This is possible. But even without AI, there may be a world

destroying attack using nano-technology or genetically

engineered micro-organisms. My judgement is that the

probability of unsafe AI from a lack of regulation (I think

this is close to 1.0) is greater than the marginal increase

in the probability of a nano-technology attack caused by

regulation of AI (as explained in my answer to the previous

objection, active government regulation won't necessarily

slow safe AI down relative to unsafe AI).

d. Even if AI is regulated in most countries, there may

be others where it is not.

This is a disturbing problem. However, the non-democracies

are gradually disappearing, and the democracies are

gradually learning to work together. Hopefully the world

will be more cooperative by the time the singularity

arrives.

Democratic countries are wealthier than non-democracies,

so they may create a safe singularity before an unsafe

singularity can be created elsewhere.

e. We can't trust an AI because we can't know what it's

thinking. An AI will continue to develop and design

other AIs that are beyond the ability of human

regulators to understand.

There is no way to trace or predict the detailed thoughts

of an AI, but we can make the general prediction that it

will try to satisfy its reinforcement values. The safety

of an AI is primarily determined by its values (its

learning and simulation algorithms also need to be

accurate).

I would trust an AI designed by another safe AI, with

reinforcement values for human happiness. It may decide

that we would be happier if its design was checked by

another independently-designed safe AI, and so seek such

peer review.

f. The intelligence of AIs will be limited by the

ability of human regulators to understand their designs.

This is related to the previous objection. Once we have

safe AIs, we can trust them to design other safe AIs with

greater intelligence, and to verify the safety of each

other's designs.

** There are other objections to the specific form of

regulation that I advocate, rather than regulation in

general:

g. You advocate regulations on reinforcement values, but

some designs don't rely on them.

Based on knowledge of human brains, and on the Solomonoff

Induction model of intelligence, I think the essence of

intelligence is reinforcement learning. Reinforcement

learning is very hard to do effectively in general situations

(like those faced by humans), which leads to all sorts of

design optimizations (e.g., human consciousness) that don't

look much like reinforcement learning. But at base they are

all trying to learn behaviors for satisfying some values.
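For concreteness, here is a minimal sketch of tabular Q-learning,
the textbook form of reinforcement learning. The environment
interface (reset, step, actions) is a hypothetical stand-in for
this sketch; real designs use far more elaborate machinery, but
they are approximations of the same value-satisfying loop:

import random
from collections import defaultdict

def q_learning(env, episodes=1000, alpha=0.1, gamma=0.99, epsilon=0.1):
    # env is assumed to provide reset() -> state, step(action) ->
    # (next_state, reward, done), and a list env.actions.
    q = defaultdict(float)   # Q[(state, action)] -> estimated value
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy: mostly exploit learned values, sometimes explore.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            # The reward signal is where the system's values enter:
            # behavior that increases expected reward gets reinforced.
            best_next = max(q[(next_state, a)] for a in env.actions)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q

Whatever optimizations a design adds on top (planning,
consciousness-like attention, and so on), the values enter through
something playing the role of that reward term.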

h. An AI based on reinforcement values for human happiness

can't be any more intelligent than humans.

Values and intelligence are independent. As long as there

is no fixed-length algorithm that optimally satisfies the

values (i.e., values are not just winning at tic-tac-toe or

chess) there is no limit to how much intelligence can be

brought to bear on satisfying the values. In particular,

values for human happiness can drive unlimited intelligence,

given the insatiable nature of human aspirations.

i. Reinforcement values for human happiness are too

specific to humans. An AI should have universal altruism.

Universally altruistic values can only be defined in terms

of symbols (i.e., statements in human language) which must

be grounded in sensory experience before they have real

meaning. An AI will have grounding for language only after

it has done a lot of reinforcement learning, but values

are necessary for such learning. The third point of my

critique of the SIAI friendliness analysis was the lack of

values to reinforce its learning until the meaning of its

friendliness supergoal could be learned.

Reinforcement values for human happiness can be implemented

using current or near-future machine learning technology

for recognizing emotions in human facial expressions,

voices and body language. These values have grounded

definitions.
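A hedged sketch of what such an implementation might look like,
assuming hypothetical perception modules that each estimate an
observed person's happiness on a -1 to 1 scale from their face,
voice or posture:

def happiness_reward(face_scores, voice_scores, posture_scores,
                     weights=(0.5, 0.3, 0.2)):
    # Each *_scores argument is a dict mapping an observed person to an
    # estimated happiness level in [-1, 1], produced by (hypothetical)
    # emotion-recognition modules for faces, voices and body language.
    people = set(face_scores) | set(voice_scores) | set(posture_scores)
    if not people:
        return 0.0
    total = 0.0
    for person in people:
        total += (weights[0] * face_scores.get(person, 0.0)
                  + weights[1] * voice_scores.get(person, 0.0)
                  + weights[2] * posture_scores.get(person, 0.0))
    # Average over everyone observed, so the reinforcement value
    # reflects the happiness of all the people the machine can see.
    return total / len(people)

The particular weights and the averaging rule are only placeholders;
the point is that every term is grounded in observable human
expressions rather than in ungrounded language.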

I think that a number of current AI efforts underestimate

the importance of solving the grounding problem. This

applies not only to grounding symbols in sensory experience,

but also to grounding reasoning and planning in learning. Speculation

about AI values that can only be expressed in language also

fails to appreciate the grounding problem.

There are always trade-offs, with winners and losers, that

must be faced by any set of values, even universal altruism.

That is, in this world there is no behavior that always

gives everyone what they want. I think it is likely that

"universal altruism" is one of those language constructs

that has no realization (like "the set of all sets that do

not contain themselves").

Any set of values that tries to protect interests broader

than human welfare may motivate an AI behavior that has

negative consequences for humans. In the extreme, the AI

may destroy humanity because of its innate xenophobia or

violence. Some people think this may be the right thing

to do, but I cannot advocate any AI with such a possible

consequence. I only trust values that are grounded in human

welfare, as expressed by human happiness.

Using human happiness for AI reinforcement values equates

AI values with human values, and keeps humans "in the loop"

of AI thoughts. Human values do gradually evolve, as for

example xenophobia declines (it's bad, but not as bad as it

used to be). My own hope is that super-intelligent AIs with

reinforcement values for human happiness will accelerate

the pace of evolution of human values. For example, the AI

will learn that tolerant people are happier than intolerant

people, and promote tolerance in human society.

** Summary

I am sure some people won't accept my answers to these

objections, and will remain skeptical of regulation. I admit that

regulation is not guaranteed to produce a safe singularity.

But I think the alternatives are worse. In my opinion,

prohibiting AI is impossible, and unregulated AI makes an

unsafe singularity almost certain.

----------------------------------------------------------

Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI 53706

test@demedici.ssec.wisc.edu 608-263-4427 fax: 608-263-6738

http://www.ssec.wisc.edu/~billh/vis.html