From test@demedici.ssec.wisc.edu Mon Jun 5 13:13:38 2006
Date: Mon, 5 Jun 2006 13:01:05 -0500 (CDT)
From: Bill Hibbard <test@demedici.ssec.wisc.edu>
Reply-To: sl4@sl4.org
To: sl4@sl4.org
Cc: wta-talk@transhumanism.org, extropy-chat@lists.extropy.org, agi@v2.listbox.com
Subject: Re: Two draft papers: AI and existential risk; heuristics and biases

Eliezer,

> These are drafts of my chapters for Nick Bostrom's forthcoming edited
> volume _Global Catastrophic Risks_. I may not have much time for
> further editing, but if anyone discovers any gross mistakes, then
> there's still time for me to submit changes.
>
> The chapters are:
> . . .
> _Artificial Intelligence and Global Risk_
> http://singinst.org/AIRisk.pdf
> The new standard introductory material on Friendly AI. Any links to
> _Creating Friendly AI_ should be redirected here.

In Section 6.2 you quote my ideas written in 2001 for hard-wiring
recognition of expressions of human happiness as values for
super-intelligent machines. I have three problems with your critique:

1. Immediately after my quote you discuss problems with neural network
experiments by the US Army. But I never said hard-wired learning of
recognition of expressions of human happiness should be done using
neural networks like those used by the army. You are conflating my
idea with another, and then explaining how the other failed.

2. In your section 6.2 you write:

  If an AI "hard-wired" to such code possessed the power - and
  [Hibbard, B. 2001. Super-intelligent machines. ACM SIGGRAPH Computer
  Graphics, 35(1).] spoke of superintelligence - would the galaxy end
  up tiled with tiny molecular pictures of smiley-faces?

When it is feasible to build a super-intelligence, it will be feasible
to build hard-wired recognition of "human facial expressions, human
voices and human body language" (to use the words of mine that you
quote) that exceeds the recognition accuracy of current humans such as
you and me, and will certainly not be fooled by "tiny molecular
pictures of smiley-faces." You should not assume such a poor
implementation of my idea that it cannot make discriminations that are
trivial to current humans.

3. I have moved beyond my idea for hard-wired recognition of
expressions of human emotions, and you should critique my recent ideas
where they supersede my earlier ideas. In my 2004 paper:

Reinforcement Learning as a Context for Integrating AI Research,
Bill Hibbard, 2004 AAAI Fall Symposium on Achieving Human-Level
Intelligence through Integrated Systems and Research
http://www.ssec.wisc.edu/~billh/g/FS104HibbardB.pdf

I say:

  Valuing human happiness requires abilities to recognize humans and
  to recognize their happiness and unhappiness. Static versions of
  these abilities could be created by supervised learning. But given
  the changing nature of our world, especially under the influence of
  machine intelligence, it would be safer to make these abilities
  dynamic. This suggests a design of interacting learning processes.
  One set of processes would learn to recognize humans and their
  happiness, reinforced by agreement from the currently recognized set
  of humans. Another set of processes would learn external behaviors,
  reinforced by human happiness according to the recognition criteria
  learned by the first set of processes. This is analogous to humans,
  whose reinforcement values depend on expressions of other humans,
  where the recognition of those humans and their expressions is
  continuously learned and updated.
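
To make the interaction between the two sets of processes concrete,
here is a minimal sketch in Python. It is not from the paper; the
class names, the toy environment interface, and the simple value
updates are all hypothetical stand-ins for whatever learners an
actual system would use.

    import random

    class RecognitionModel:
        """Learns to recognize humans and their happiness; its updates
        are reinforced by agreement from the currently recognized set
        of humans."""

        def __init__(self):
            self.recognized_humans = set()

        def recognize(self, observation):
            # Placeholder: score how happy the observed human appears (0..1).
            return observation.get("happiness_signal", 0.5)

        def update(self, observation, agreement):
            # Placeholder update: when the currently recognized humans agree
            # with the model's judgments, treat this human as recognized.
            if agreement > 0.5:
                self.recognized_humans.add(observation.get("human_id"))

    class BehaviorPolicy:
        """Learns external behaviors, reinforced by human happiness as
        judged by the recognition model."""

        def __init__(self, actions):
            self.values = {a: 0.0 for a in actions}

        def act(self):
            # Epsilon-greedy choice over estimated action values.
            if random.random() < 0.1:
                return random.choice(list(self.values))
            return max(self.values, key=self.values.get)

        def update(self, action, reward, lr=0.1):
            # Simple running-average value update.
            self.values[action] += lr * (reward - self.values[action])

    def training_step(world, recognizer, policy):
        # One interaction: the policy acts, the recognizer scores the
        # resulting human happiness, and both processes are updated.
        action = policy.act()
        observation, agreement = world(action)      # hypothetical environment
        reward = recognizer.recognize(observation)  # happiness per recognizer
        policy.update(action, reward)               # behavior reinforced by that judgment
        recognizer.update(observation, agreement)   # recognizer reinforced by human agreement

The essential point the sketch tries to show is that the reward used
to train the behavior processes comes from a recognition model that is
itself still learning, rather than from a fixed hard-wired detector.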

And I further clarify and update my ideas in a 2005 on-line paper:

The Ethics and Politics of Super-Intelligent Machines
http://www.ssec.wisc.edu/~billh/g/SI_ethics_politics.doc

Please adjust your discussion of my ideas to:

1. Not conflate my ideas with others.
2. Not assume a poor implementation of my ideas.
3. Not critique my old ideas when they have been replaced by newer
ideas in my publications.

Thank you,
Bill