Artificial Intelligence
 

 

By Peter Gransee     11/25/06 PM

For the past 50 years or so, people have been saying that strong AI is coming soon, but it has not arrived yet. The problem is likely more complicated than we originally thought. However, we understand more about it each day, and we will eventually be able to develop strong AI.

 

Strong AI equal to about 95% human complexity may take 30-50 years. 100% or more will take much longer.

 

Complexity can be measured by various means, including the size of the undeveloped construct when an ideal compression algorithm (DNA?) is applied. Another way of measuring complexity is the number of functions the construct can produce in a standard environment.

 

“Undeveloped” here means the construct has not yet been uncompressed and exposed to the world around us. This excludes the years of experience that an adult has acquired.
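
As a rough illustration of the first measure, compressed size can be approximated with an off-the-shelf compressor. This is only a sketch: zlib is nowhere near an ideal compression algorithm, and the two sample inputs and the helper name compressed_size are made up for the example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes after zlib compression at the highest level.

    Assumption: zlib stands in for the "ideal" compressor, which no
    real program can actually be.
    """
    return len(zlib.compress(data, 9))


# Hypothetical sample inputs, both 10,000 bytes long.
repetitive = b"ABCD" * 2500
rng = random.Random(0)  # fixed seed so the run is repeatable
noisy = bytes(rng.getrandbits(8) for _ in range(10_000))

print("repetitive:", compressed_size(repetitive), "bytes")
print("noisy:     ", compressed_size(noisy), "bytes")
```

The repetitive input shrinks to a small fraction of its length while the noisy one barely shrinks at all. Under this measure the noisy input counts as the more complex of the two, even though it carries nothing useful.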

 

It would be difficult and risky for the machine to become more complex than humans. Building a system that could evolve (in the popular sense of evolution) would likely lead to that system's demise due to noise. However, even with these limits there is a lot of room for useful machines.

 

There are multiple paths to produce AI. Two common methods are neural networks and expert systems.

 

Neural networks are more adaptable than expert systems but require more complexity. Expert systems are more affordable for applications that don’t require a high degree of adaptability.

 

1/10/2008

"A bird is an instrument working according to a mathematical law," wrote Da Vinci. "It lies within the power of man to make this instrument with all its motions."

"If the human mind was simple enough to understand, we would be too simple to understand it. " - Emerson Pugh

Of course, there is something to be said for teamwork and "pretty darn close". In fact, artificial products often lack the complexity of the thing they emulate but nevertheless, in some cases, exceed the capabilities of the original in some narrow but very useful way.

--

The 95/5 rule as applied to cognitive computing

This "rule" estimates that 95% of the applications require 5% of the complexity and vice versa. Essentially, there appears to be a steeply rising curve of complexity as you add more applications. Not all is lost, however. This curve does open the door for expert systems. Using a "black box" approach, you can simulate most of the in/out relationships using a much less complex system of rules. It should be said, though, that even this "simple rule set" is quite mind boggling. That is, coming up with this 5% of artificial complexity will take, say, 1000% of a given individual's natural complexity. But then, when completed, you can hit the "copy" button.

Such an expert system can be described as basically a set of rules that defines the link between a particular input and desired output. Of course, these systems don't do that well by themselves. They are packaged with supporting systems that manage context (pattern recognition, etc), manage the link system, etc.

Take context, for example: a simple input like "no" is a lever that activates a larger set of automated inputs. This is context. The combined input is then sent to the database. But most of the really interesting inputs always seem to lack a direct reference. So a commonly used function is one that correctly scores the best link. Scoring can be done in multiple passes, where a confident score can make a neighboring score more firm (or really mess things up).
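
The context and scoring ideas can be sketched in a few lines of Python. This is a toy under loud assumptions: the LINKS table, the helper names and the word-overlap (Jaccard) score are all invented for illustration and are far simpler than anything a real system would use.

```python
# Hypothetical link database: combined input -> desired output.
LINKS = {
    "do you want coffee yes": "Pouring a cup.",
    "do you want coffee no": "Okay, no coffee.",
    "what time is it": "I do not have a clock.",
}


def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def score(a, b):
    """Jaccard overlap between two inputs: 1.0 identical, 0.0 disjoint."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_link(user_input, context=""):
    """Combine context with the input, then return (output, score) of the best link."""
    combined = f"{context} {user_input}".strip()
    key = max(LINKS, key=lambda k: score(combined, k))
    return LINKS[key], score(combined, key)


# "no" means nothing alone; the previous question supplies the context.
print(best_link("no", context="do you want coffee"))
# No exact entry for this input, so the closest link wins.
print(best_link("what time is it now"))
```

A single pass like this has no way to let one confident score firm up a neighboring one; that would need a second pass over the results.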

But to score a link is, at some level, also to understand the input. So there are layers...

Because of the size of the link database, there is some value in automating the creation of some links and compressing similar links. However, the temptation to go after low-hanging fruit is often met by the ouch of being smacked by the unavoidable complexity of the system.

Want more? Offer me a job. :)

4/3/2008

Another teaser... Finding shortcuts to populating an expert system database is an NP-complete type of problem. You can either produce a "Chinese Room" map of all the links (might take a while) or come up with a bunch of rules that can be used to calculate the link in a series of steps. The irony is that the rules might take longer to develop than the map.

Taking an example from Evolution, if the rules are fairly low in complexity compared to the map, then they probably can't produce the map.

5/28/08

Developing on that thought...

I was looking at tic-tac-toe (noughts and crosses) back in 2004 while learning about the minimax theorem. There are many versions of TTT on the internet, and many of the authors claim their version is unbeatable. However, many of these "unbeatable" game engines can be beaten. In my search, however, I did find the excellent work of Jesper Juul. He calculated all 255,168 possible games for 9-square tic-tac-toe. He then built an engine that recognized the current game and selected the track(s) that didn't end in a loss. The db of possible games is about 50 megabytes in size, and it takes an average desktop a few seconds to churn through 50 megs and calculate a move. For my uses, this engine provided a ready example of the proper method for each type of game (a "fitness test"). I then distilled the first 2 moves into a simple set of rules and used another set of rules for moves beyond the second move. The result was a messy but simple Perl program less than 13k in size (no separate db required) that produces an identical unbeatable result using significantly less space and fewer processor cycles. This Perl program can be downloaded here (2.4k zip). Give it a try and see if you can win.
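
For readers who want to check the numbers, here is a minimal Python sketch of the brute-force side of this comparison. It is not Jesper Juul's engine or the Perl program described above; it is a generic minimax search, and the function names are chosen only for illustration.

```python
from functools import lru_cache

WIN_LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
)


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def place(board, i, player):
    """Return a new 9-character board with `player` placed on square i."""
    return board[:i] + player + board[i + 1:]


def other(player):
    return "O" if player == "X" else "X"


def count_games(board=" " * 9, player="X"):
    """Count every move sequence that ends in a win or a full board."""
    if winner(board) or " " not in board:
        return 1
    return sum(count_games(place(board, i, player), other(player))
               for i, cell in enumerate(board) if cell == " ")


@lru_cache(maxsize=None)
def minimax(board, player):
    """Value for X with `player` to move: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w:
        return 1 if w == "X" else -1
    if " " not in board:
        return 0
    values = [minimax(place(board, i, player), other(player))
              for i, cell in enumerate(board) if cell == " "]
    return max(values) if player == "X" else min(values)


def best_move(board, player):
    """Index of an empty square with the best minimax value for `player`."""
    moves = [i for i, cell in enumerate(board) if cell == " "]
    pick = max if player == "X" else min
    return pick(moves, key=lambda i: minimax(place(board, i, player), other(player)))


print(count_games())          # total number of complete games
print(minimax(" " * 9, "X"))  # value of the empty board with perfect play
```

The search visits every game, so it plays the role of the full map. A hand-distilled rule set trades that exhaustive table for a much smaller program, at the cost of the effort needed to find the rules.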

Now that was a fun exercise and some of the concepts scale and some do not.

With more complex expert systems, like those parsing natural language input, some researchers are approaching the problem either by brute-force mapping of all the possible permutations or by trying to come up with a set of rules that produces a similar result.

6/5/08

Neural networks are not my specialty, but I was looking recently at the effect of limits on connection length (axon/dendrite) on the network architecture. With unlimited connector length, messages traveling through the network enjoy a network with many columns. However, if connector length is limited, then messages are localized and clusters only communicate on the periphery. The result is an overall flat ("2D") network with clusters of 3D networks. There is a lack of overall depth or deep columns. In a human brain, this could produce interesting effects (it is theorized that the shortage of long connections is one of the factors in Autism Spectrum Disorder and Asperger's). This could limit a patient's ability to "see the big picture" while still enjoying a high level of detail within tasks.
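
A toy model makes the effect of connection length easy to see. The sketch here is an assumption-heavy simplification, not a model of real neurons: nodes sit at integer positions on a line, two nodes are linked whenever they are within max_length of each other, and the helper names are hypothetical.

```python
from collections import deque


def build_network(n_nodes, max_length):
    """Link every pair of nodes whose positions differ by at most max_length."""
    return {
        i: [j for j in range(n_nodes) if j != i and abs(i - j) <= max_length]
        for i in range(n_nodes)
    }


def hops(network, start, goal):
    """Breadth-first search: fewest links a message must cross, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in network[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


N = 50
unlimited = build_network(N, max_length=N - 1)  # any node can reach any other
limited = build_network(N, max_length=3)        # only nearby nodes are linked

print("unlimited:", hops(unlimited, 0, N - 1), "hop(s) end to end")
print("limited:  ", hops(limited, 0, N - 1), "hop(s) end to end")
```

With short connectors a message between distant nodes has to be relayed through many local neighborhoods, which is one way to picture detail-rich clusters that communicate only at their edges.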

I suspect the actual system is much more complex than that (I am reminded of examples of systems that originally appeared simple but turned out to have a whole 'nother world of complexity), but thinking about the length of connections does bring up some interesting questions.

7/17/08

When most people think of AI, they think of a computer they can have a normal conversation with. This is a system that has a sufficient level of Natural Language Processing combined with some other modules.

 

Just how complex is the task? Say we limit the vocabulary to 100k words (most people can use a multiple of that), sentences to less than 6 words and conversations to less than 6 sentences, and assume 10% of the permutations follow proper grammar. That still leaves us with a number of possible topics that is 1 with over 20 zeros after it. A good sized number. Assuming the theoretical (Shannon) max lossless compression, each word in that large db will take about 5-8 bits of information and each of the many conversations will take about 23 bytes. Multiply 23 by a 1 with about 20 zeros after it and you begin to get some idea. Roughly 23,000,000,000 terabytes. Again, this is an estimate; I am surely off by a few zeroes. But suffice to say, such a computer would require more memory than any system known today.
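
The storage arithmetic can be checked with a few lines of Python. The sketch reads "less than 6" as 5 words per sentence and 5 sentences per conversation, takes the count of conversations as 10^21, and uses decimal terabytes; all three are assumptions, as are the variable names. The last figure is a naive count of every possible 25-word string, which is not part of the estimate in the text and comes out far larger, so the 20-zero figure looks like a very conservative floor.

```python
VOCABULARY = 100_000              # 100k words, from the text
WORDS_PER_SENTENCE = 5            # assumption: "less than 6 words"
SENTENCES_PER_CONVERSATION = 5    # assumption: "less than 6 sentences"
BYTES_PER_CONVERSATION = 23       # from the text
CONVERSATIONS = 10 ** 21          # assumption: "1 with over 20 zeros"
BYTES_PER_TERABYTE = 10 ** 12     # assumption: decimal terabytes

words_per_conversation = WORDS_PER_SENTENCE * SENTENCES_PER_CONVERSATION

# 23 bytes spread over 25 words should land in the 5-8 bit range.
bits_per_word = BYTES_PER_CONVERSATION * 8 / words_per_conversation

total_bytes = BYTES_PER_CONVERSATION * CONVERSATIONS
total_terabytes = total_bytes // BYTES_PER_TERABYTE

# Not in the original estimate: every possible 25-word string, grammatical or not.
naive_strings = VOCABULARY ** words_per_conversation

print(f"bits per word:     {bits_per_word:.2f}")
print(f"total terabytes:   {total_terabytes:,}")
print(f"naive 25-word strings: 1 followed by {len(str(naive_strings)) - 1} zeros")
```

Even after discarding 90% of those strings as ungrammatical, the naive count stays enormous, which only strengthens the point about memory.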

 

You will notice I too believe that to correctly map an input to an output requires that the full complexity of a set of text be represented somewhere in the system. This means that the size of the system starts at the size of the pattern you seek to recognize times the compression ratio of the method you are using and goes up from there (depending on your overhead). And there is a limit to how much you can compress the size of the text set without loss. For more along this vein of thought, read about the Hutter Prize involving text compression.

 

Now even the human mind, with its very brilliant design, employs a lot of lossy compression. To fit most topics onto even our most powerful computer system would also require lossy compression. Basically, the pattern to be recognized would be represented in the system using compression of >99% with quite a few 9's to the right of the decimal point. This is like a single drop in a large swimming pool.

 

So when someone demos a new AI program and they give the impression in the demo that their system can handle a good chunk of natural language, the size of their database is a good sanity check. "5mb, including the engine" like I heard recently in a certain company's demonstration was not a good sign.

 

The basic point is that a system capable of normal conversation is going to require significant computer resources and/or very lossy compression.

 

Most lossy compression methods (mp3, mpeg4, etc) are successful because they lose part of the data that the human mind also loses. Similar opportunities exist with language. (Studies have shown that people actually prefer a certain amount of lossy compression with music. Colloquial speech may already employ a pleasing level of loss for language.)

 

Even with huge amounts of loss, there is still the trick of determining what to keep to get the most bang for the buck (there is a market for focused systems). And there is the task of building this database. Even a very lossy system can take up petabytes of storage and hundreds or thousands of man-years of effort.

 

9/5/2008

I have noticed that there is a dearth of good email corpora. There is the Enron corpus, the leaked MediaDefender archives and a few smaller collections. Not enough though.

As an experiment, I have started advertising an offer to purchase email from companies.

The email must have all sensitive data redacted and be free of any legal or privacy concern. Ideally, I would have a good number of messages per topic domain and a wide variety of domains. If a useful collection can be formed, I may offer this as a resource to the cognitive computing community.

3/13/2009

I have been working on a 2-person chat corpus for several months now. I haven't been able to find anything like this on the Internet. More information here.

6/12/2009

Having to take some time out to pay some bills. Doing some Access DB programming and of course still consulting for various LED flashlight companies. Slowly accumulating data for the corpora. Email is slow going; I only have about 20k messages (nowhere near enough). But I have over 20mb of 2-person chat (filtered from over 1gb of IRC logs). I am willing to share this if you make it worth my while, of course. 20mb is a good start but nowhere near enough (my goal is 100mb of filtered and tagged 2-person chat). 100mb will still be pretty slim, but it should provide some clarity on what steps are needed next. 100mb will require over 5gb of raw logs representing over 100k hours of conversation.
 
 
7/22/11
 
We will meet aliens in the future and they will have started as machines of our own making.
 
It is all in the pattern recognition. There are many big problems to solve, and these require seeing meta patterns in large datasets. This will breed programs that don't think like we do, see like we do or normally communicate like we do. Computers of the future will be powered by machine code that is completely alien to our understanding.
 
Programmers will be replaced by therapists and artists. The most common connection to machines will be the mundane but to truly understand will require a transcendent connection.
 
 

4/4/12 AM

Words as Shapes

It has been said that the basic machines are the lever and incline (there are other lists).  Archimedes said that if he had a lever long enough he could move the earth.

But the lever is a bit too complicated in my opinion to be a basic machine.

Maybe the most basic machine is a contour. It either is moved by force (lever) or force causes other objects to interact with it (incline). Other contours can cause the force to interact around a pivot (wheel or lever), slide (incline), combinations (screw, hinge, etc).

Or said in another way, the most basic machine is the incline combined with any other object. Together work can be done in moving objects, changing their shape, etc.

The purpose of an incline is to redirect force and to allow a small population of objects to affect a larger population (and in doing so redistribute the force).

 

Electromagnetic: voltage vs current. Electron population

Optics: beam area vs intensity. Photon population

Force: velocity vs torque. Atom population

Information:  information vs complexity. Token population

Ideas: quality versus quantity. Unit of value

 

Basic units of nature times energy level. Energy level is expressed as vibration (calorie), charge (coulomb), state (photon), velocity (m/s), etc.

The contour is the basic component of a Turing machine. Combined with sufficient conversion cycles and memory, it can solve any computable problem. Of course, the universe is not big enough for even such a machine to solve every problem, but you get the idea.

What is the unit of population in language, and what do the inclines look like? I think tokens are the basic unit and the inclines are tokens that are reactive to other tokens. Relationships include temporal ones.