[comp.ai] The AI Breakthrough -- What It Will Be Like !!!

erich@near.cs.caltech.edu (Erich Schneider) (11/01/90)

>>>>> On 31 Oct 90 17:23:10 GMT, rlp@beach.cis.ufl.edu (Trouble) said:


+> Maybe it's a trivial example, but I'll call a computer "intelligent" when
+> I can dial randomly across the radio, drop in on a song, and have it
+> identify the type of music, artist, and song, and do it as quickly as
+> a human being.  Let's see how much processor power it takes to pull
+> that trick.

This is a trivial example, but I get your point. One point I would like to
make is that "processor power" would be the only thing needed to solve
this problem. Given a list of all of the types of music/groups/songs a
human knows, along with their characteristics, it's just an algorithmic
process (i.e. one a Turing machine can carry out, by Church's thesis) to
perform the ID.
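
(To make that concrete, here is a minimal sketch of the lookup in Python --
the catalog, characteristics, and titles below are entirely made up, and
extracting such characteristics from raw audio is of course a separate
problem:)

# Hypothetical feature table: characteristics -> (type, artist, song).
CATALOG = {
    ("male vocals", "distorted guitar", "120 bpm"): ("rock", "Warrant", "Cherry Pie"),
    ("strings", "no vocals", "70 bpm"): ("classical", "Brahms", "Lullaby"),
}

def identify(characteristics):
    """Return (type, artist, song) for a known set of characteristics."""
    return CATALOG.get(tuple(characteristics), ("unknown", "unknown", "unknown"))

print(identify(["male vocals", "distorted guitar", "120 bpm"]))
# -> ('rock', 'Warrant', 'Cherry Pie')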

If, however, one could dial randomly across the radio and have the computer do
the identification, and then _without a human manually changing the program_
have the computer do something radically different (e.g. solve a complicated
differential equation or physics problem) just by showing it the problem,
I would say that program is possibly "intelligent".

Followups to comp.ai. While AIs are part of the cyberpunk literature, this
discussion has moved into a more AI-centered region.

--
erich@tybalt.caltech.edu  or try erich@through.cs.caltech.edu

"Why the hell anybody plug the likes of you into a deck like that? Thing ought
to be in a museum, _you_ ought to be in grade school."
                             -William Gibson, _Count Zero_

weyand@csli.Stanford.EDU (Chris Weyand) (11/01/90)

In <ERICH.90Oct31191951@near.cs.caltech.edu> erich@near.cs.caltech.edu (Erich Schneider) writes:

>>>>>> On 31 Oct 90 17:23:10 GMT, rlp@beach.cis.ufl.edu (Trouble) said:


>+> Maybe it's a trivial example, but I'll call a computer "intelligent" when
>+> I can dial randomly across the radio, drop in on a song, and have it
>+> identify the type of music, artist, and song, and do it as quickly as
>+> a human being.  Let's see how much processor power it takes to pull
>+> that trick.

>This is a trivial example, but I get your point. One point I would like to
>make is that "processor power" would be the only thing needed to solve
>this problem. Given a list of all of the types of music/groups/songs a
>human knows, along with their characteristics, it's just an algorithmic
>process (i.e. one a Turing machine can carry out, by Church's thesis) to
>perform the ID.

So if you give a computer a list of all the characteristics of the letter
'A' (a dubious concept in itself), it's no problem getting it to recognize
various A's?  Tell that to the visual perception people!

I kinda doubt that it's purely processing power.  Maybe recognizing individual
songs could be done by some kind of wave-form pattern matching but this
obviously wouldn't work for identifying the type of music.  How do you list
all the "types" of music?  I don't think you really can.  I mean, we classify
things like music with new concepts all the time, e.g. "Gee, that song has
sort of a Jetsons feel to it."  I'm sure the original poster didn't
mean simply inputting a song and having the computer spit out one of
rap, pop, rock, jazz, classical, etc.  This wouldn't satisfy me at all.
(Although I imagine it's a pretty difficult problem.)  However, if the
program were able to take a song like Faith No More's "Epic" and tell me that
it's an interesting blend of contemporary rock styles combining rap, thrash,
and a funk sound, I'd be pretty impressed.  Perception definitely cuts to
the core of intelligence in my book.

Chris Weyand,
weyand@cs.uoregon.edu 
weyand@csli.stanford.edu

wayner@kama.cs.cornell.edu (Peter Wayner) (11/01/90)

weyand@csli.Stanford.EDU (Chris Weyand) writes:

>In <ERICH.90Oct31191951@near.cs.caltech.edu> erich@near.cs.caltech.edu (Erich Schneider) writes:

>>>>>>> On 31 Oct 90 17:23:10 GMT, rlp@beach.cis.ufl.edu (Trouble) said:


>>+> Maybe it's a trivial example, but I'll call a computer "intelligent" when
>>+> I can dial randomly across the radio, drop in on a song, and have it
>>+> identify the type of music, artist, and song, and do it as quickly as
>>+> a human being.  Let's see how much processor power it takes to pull
>>+> that trick.

>>This is a trivial example, but I get your point. One point I would like to
>>make is that "processor power" would be the only thing needed to solve
>>this problem. Given a list of all of the types of music/groups/songs a
>>human knows, along with their characteristics, it's just an algorithmic
>>process (i.e. one a Turing machine can carry out, by Church's thesis) to
>>perform the ID.

>I kinda doubt that it's purely processing power.  Maybe recognizing individual
>songs could be done by some kind of wave-form pattern matching but this
>obviously wouldn't work for identifying the type of music. 


Hah, this has been done, and years ago at that. I saw a talk by the
guy. He had been working for the American Society of Composers, Authors
and Publishers (ASCAP), trying to find a way to automate
their auditing of radio stations. It turns out that every artist gets
a few pennies when their song is played on the radio. It would be too
much of an auditing nightmare to check and make sure that every
station was keeping up, so ASCAP hoped to have a computer do this.

The algorithm, as I remember it, involved severely filtering the
signal with a high-pass and a low-pass filter, which left a very narrow
hole around one narrow frequency range. Suddenly the multi-frequency
bopping turned into a binary signal, where on signified the presence of
sound at 4100 cycles per second and off signified its absence. (Or some
other well-chosen frequency.) Then identification just turned into a string
search problem, which could be done relatively quickly.
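
(A back-of-the-envelope sketch of that scheme in Python -- the threshold,
energies, and titles below are all made up, and this is my reconstruction
of the talk as described above, not ASCAP's actual code:)

def fingerprint(band_energies, threshold=0.5):
    """Turn per-time-slice energy in the narrow band into a '1'/'0' string."""
    return "".join("1" if e > threshold else "0" for e in band_energies)

# Hypothetical reference fingerprints for the "top 100".
REFERENCES = {
    "Madonna - Vogue":      fingerprint([0.9, 0.1, 0.8, 0.8, 0.2, 0.9, 0.1]),
    "Warrant - Cherry Pie": fingerprint([0.1, 0.9, 0.9, 0.2, 0.1, 0.8, 0.8]),
}

def identify(sample_energies):
    """Find which reference contains the sampled stretch of signal."""
    needle = fingerprint(sample_energies)
    for title, haystack in REFERENCES.items():
        if needle in haystack:   # plain substring search
            return title
    return None

print(identify([0.8, 0.2, 0.1, 0.9]))   # matches "Warrant - Cherry Pie"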

As I understand it, they didn't digitize all of the music in the world,
just the top 100 or something like that. They also arranged it so the
system could be polled remotely by telephone and updated whenever
Madonna came out with her newest hot stuff.
Peter Wayner   Department of Computer Science Cornell Univ. Ithaca, NY 14850
EMail:wayner@cs.cornell.edu    Office: 607-255-9202 or 255-1008
Home: 116 Oak Ave, Ithaca, NY 14850  Phone: 607-277-6678

rlp@beach.cis.ufl.edu (Trouble) (11/02/90)

In article <ERICH.90Oct31191951@near.cs.caltech.edu> erich@near.cs.caltech.edu (Erich Schneider) writes:
>>>>>> On 31 Oct 90 17:23:10 GMT, rlp@beach.cis.ufl.edu (Trouble) said:
>
[ my example of randomly selecting a song for an "AI" to identify ]


>This is a trivial example, but I get your point. One point I would like to
>make is that "processor power" would be the only thing needed to solve
>this problem. Given a list of all of the types of music/groups/songs a
>human knows, along with their characteristics, it's just an algorithmic
>process (i.e. one a Turing machine can carry out, by Church's thesis) to
>perform the ID.

Aaah, but sooo much processor power.  Man, I'd like just a teeny bit
of it on my desk...and the Hoover Dam to power it.  I realize it's
basically just a matter of a "brute force" algorithm to have a computer
do it, but consider how fast even the average high-schooler can tell you
a tune is Warrant's "Cherry Pie," or if it's just some of that "goofy
classical stuff."


>If, however, one could dial randomly across the radio and have the computer do
>the identification, and then _without a human manually changing the program_
>have the computer do something radically different (e.g. solve a complicated
>differential equation or physics problem) just by showing it the problem,
>I would say that program is possibly "intelligent".

I like this extension of the concept.  Or how about this:  the AI works
on the problem, a co-worker flips on the radio, the AI says "Yuck,
I hate Vanilla Ice, change to the easy listening station," and keeps
working on the problem.  Consider that humans do it all the time.
Are we multitasking, context switching, "shadow processing" (two
things running on the same processor literally at the same time
[though at different priorities, perhaps], not just switching really
fast from one thing to another), or something else?

Disclaimer:  Sorry if I've mangled any terminology or paradigms or
anything.  My only exposure to AI, academically anyway, was an
"AI Concepts" class I took during my MBA program.  Talk about feeling
like a fish on a bicycle...I definitely did not dig predicate calculus
at that time.  So, maybe I'm just being a nay-saying gadfly.

Bob
--
rlp@beach.cis.ufl.edu  					    Air:  PP-SEL
AMA # 541283 				   		   Road:  750 Ninja
DoD # 0068						  Water:  NAUI OW-I
<=-									   -=>

arh@sage.cc.purdue.edu (Eric B) (11/02/90)

In <25224@uflorida.cis.ufl.EDU> rlp@beach.cis.ufl.edu (Trouble) writes:

>...  Man, I'd like just a teeny bit
>of it on my desk...and the Hoover Dam to power it.

   Makes one realize the efficiency of neurons!

>I like this extension of the concept.  Or how about this:  the AI works
>on the problem, a co-worker flips on the radio, the AI says "Yuck,
>I hate Vanilla Ice, change to the easy listening station," and keeps
>working on the problem.  Consider that humans do it all the time.
>Are we multitasking, context switching, "shadow processing" (two
>things running on the same processor literally at the same time
>[though at different priorities, perhaps], not just switching really
>fast from one thing to another), or something else?
>Bob


   Consider this... you're listening to music, foot tapping to the beat
(assuming one has rhythm 8-), while you're doing something simple like
talking on the phone, typing a message, or reading news.  Meanwhile, a
whole host of essential body functions are monitored & controlled.  The
fact is that human brains are divided up into very specialized areas.
Many are hard-wired to their corresponding body parts.  When one area is
damaged, its function is usually lost, although it is possible for
other neurons to "rewire" themselves, gaining a partial recovery of the
lost function.
   The idea is that brains are specialized multi-processors with
dedicated hardware.  A Ph.D. in brain psychology could use all the big
words to describe it, but I wouldn't understand him.

   I would recommend taking (or auditing) a course on brain function or
brain behavior.  At Purdue, it's Psych 220.  When I took it, I continually
analyzed how the brain could be represented in silicon.  Amazing stuff.
I actually developed an idea for an artificial eye.  I'm just waiting to
apply optical computing to it.  The one problem is making an {optic, wire}
to neuron connection.  We'll see...

					just spinnin' my cycles away, 

					Eric G. Bolinger  8-)
					arh@sage.cc

kingsley@hpwrce.HP.COM (Kingsley Morse) (11/03/90)

                              OPPORTUNITY

If CLASSIFICATION ALGORITHMS that don't use a lot of CPU on large problems
exist, they could be applied widely in challenging areas such as speech
recognition, forecasting the stock market, forecasting weather, artificial
intelligence, pattern recognition, neural nets, etc. Perhaps you're working
on one. See the end of this posting for a detailed description of this type
of problem.


                               PROPOSAL

I propose we use this entry on the net as a grass-roots "CLEARINGHOUSE" for
people interested in such algorithms. For example, if you are working with an
algorithm which classifies a LOT of data, you can tell others about it here;
the algorithm's name and computational complexity for large problems would
be helpful.

I've started the list, and I'm hoping people will contribute in kind. A 
natural consequence of enough contributions will be a comprehensive list 
that specifies the state of the art in classification.

PS: I've speculated on some of the complexity measures; please feel free
to correct them.


Algorithm                                      CPU time needed
  name                               Pre-processing            Classification

                                     (P = number of training patterns
                                      C = number of classes (bins)
                                      D = number of dimensions in each pattern
                                      * = multiplication
                                      ** = exponentiation
                                      x = a data dependent constant)
=========================================================================
Multiple regression                        P*D**3                     D
                                    
CART                                       (PlnP)*D*(C**x)         (x**D)*(lnP)

ID3                                        (PlnP)*D                (x**D)*(lnP)

nearest neighbor                           P*D                       P*D
                                
FACT                                       (PlnP)*D*(C**x)         (x**D)*(lnP)

FIRM                                       (PlnP)*D*(C**x)         (x**D)*(lnP)

Back propagation                           P**x                        D


Comments: 

FACT runs > 100 times faster than CART.
FIRM is faster than CART or FACT.



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                     Detailed Problem Description

Can you suggest a fast algorithm for the following classification problem?
I have millions of patterns, and the class that each belongs to. I want to 
use these patterns (and their corresponding classes) to predict which class
new patterns belong to. For example, say we already have the following 
historical data:

Pattern           historical patterns, ('Dimensions' or        historical Class
(or example)      'Keys')                                      (or 'bin') number
number
===============================================================================
 1                K(1,1)     K(1,2)   K(1,3) ..... K(1,D)      C(4)
 2                K(2,1)     K(2,2)   K(2,3) ..... K(2,D)      C(7)
 3                K(3,1)     K(3,2)   K(3,3) ..... K(3,D)      C(2)
 .                  .          .        .            .          . 
 .                  .          .        .            .          . 
 .                  .          .        .            .          . 
 .                  .          .        .            .          . 
 P                K(P,1)     K(P,2)   K(P,3) ..... K(P,D)      C(15)

I would like to use this historical data to classify new patterns whose class
is unknown. I've used class numbers of 4, 7, 2, and 15 as examples only. Many
patterns will belong to the same class. I expect to have millions of
historical patterns to start with (say P = O(1E6)), hundreds of dimensions
in each pattern (say D = O(1E2)), and hundreds of classes (say C = O(1E2)).

I'm looking for a FAST algorithm which uses this data to classify new patterns
whose class is unknown. The algorithm will probably have two parts, each with
its own computational complexity (a rough sketch follows the two parts below).

1.) Pre-processing the historical data once to facilitate classifying 
    subsequent new patterns.  
             
2.) Actually classifying new patterns
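
(To make this two-part shape concrete, here is a minimal sketch in Python
using nearest neighbor, one of the rows in the earlier table; the data and
function names are mine. Pre-processing just stores the P patterns, and
classifying a new pattern is a linear scan over them, so both parts cost
on the order of P*D, matching the table's nearest neighbor row:)

import math

def preprocess(patterns, classes):
    """Phase 1: build whatever structure classification will use."""
    return list(zip(patterns, classes))   # nearest neighbor needs no more

def classify(model, new_pattern):
    """Phase 2: assign the class of the closest stored pattern."""
    def dist(p):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, new_pattern)))
    pattern, cls = min(model, key=lambda pc: dist(pc[0]))
    return cls

model = preprocess([(1.0, 2.0), (8.0, 9.0), (1.5, 1.8)], [4, 7, 4])
print(classify(model, (1.2, 2.1)))   # -> 4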

ddj7203@cec1.wustl.edu (David D. Jensen) (11/05/90)

In article <1870006@hpwrce.HP.COM> kingsley@hpwrce.HP.COM (Kingsley Morse) 
writes:
>                              OPPORTUNITY
>
>If CLASSIFICATION ALGORITHMS that don't use a lot of CPU on large problems
>exist, they could be applied widely in challenging areas such as speech
>recognition, forecasting the stock market, forecasting weather, artificial
>intelligence, pattern recognition, neural nets, etc. Perhaps you're working
>on one. See the end of this posting for a detailed description of this type
>of problem.
> ...
>I propose we use this entry on the net as a grass-roots "CLEARINGHOUSE" for
>people interested in such algorithms. For example, if you are working with an
>algorithm which classifies a LOT of data, you can tell others about it here;
>the algorithm's name and computational complexity for large problems would
>be helpful.

More than just computational complexity is important here.  We should
also consider:

- Performance: How well does the algorithm classify new cases?  This is
  often a largely empirical question because we lack good answers to the
  question: "How is the world really structured?".  Consequently, we
  often just take a pool of (what we think are) representative problems
  and test the various algorithms.

- Use of prior knowledge: What if we already know something about the
  problem?  Can this knowledge be used by the algorithm?  Must the knowledge
  be explicitly encoded, or is more flexible human guidance a possibility?

- Representation of results: How are the results expressed?  Are these
  results directly usable by other programs?  Are the results directly
  usable by humans?  How much do they aid our further exploration of
  the problem?

My overall point is: Don't confuse "fast" with "useful".
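
(The usual empirical test of the performance point looks something like this
sketch: hold out part of the historical data and score correct
classifications on the rest. Here 'preprocess' and 'classify' are
placeholders for whichever algorithm from the earlier table is being tested,
and the 30% hold-out fraction is an arbitrary choice of mine:)

import random

def holdout_accuracy(patterns, classes, preprocess, classify, test_frac=0.3):
    """Train on part of the data, score correct classifications on the rest."""
    data = list(zip(patterns, classes))
    random.shuffle(data)
    cut = int(len(data) * (1 - test_frac))
    train, test = data[:cut], data[cut:]
    model = preprocess([p for p, c in train], [c for p, c in train])
    hits = sum(1 for p, c in test if classify(model, p) == c)
    return hits / len(test)

# e.g., with the nearest-neighbor preprocess/classify sketched earlier:
# print(holdout_accuracy(K, C, preprocess, classify))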

kingsley@hpwrce.HP.COM (Kingsley Morse) (11/13/90)

ddj7203@cec1.wustl.edu (David D. Jensen)  writes:

> How well does the algorithm classify new cases?  
> just take a pool of (what we think are) representative problems
> and test the various algorithms.

This is a good point that I've been struggling with in my research. Indeed,
a classification algorithm's ability to generalize what it learned from the
training set to correctly classify new cases IS important. But frankly, I
don't know how to match a classification algorithm with a representative
problem. In other words, if I have a classification problem in mind, and
training data in hand, how do I know which classification algorithm will best
classify new cases? It seems to me that testing classification algorithms
with representative problems would be unreliable; there's always another
classification algorithm that partitions the space so it is more like the
test data. For example, the nearest neighbor algorithm will work well
with test data that is centrally clustered in Euclidean space. Multiple
regression will work well with test data that is partitioned into linear
regions. But how do you know in advance if the test data is clustered in
Euclidean spheres, linear regions, etc.?

Even then, it's a safe bet that a nearest neighbor algorithm could
approximate a multiple regression algorithm, or vice versa, given enough
training patterns. This is similar to the universality of Turing machines,
in which a simple CPU can emulate any other CPU given enough time. In this
case, we can assume that most classification algorithms can emulate most
other classification algorithms, given enough time and training patterns.
So although any classification algorithm will do, given enough time and
training patterns, perhaps we should develop an algorithm which is capable
of partitioning the data space several ways, e.g. Euclidean, linear, etc.,
then focus on the partitioning which classifies more new cases in less time.

In summary, would it make sense to develop a classification algorithm which 
can partition a decision space several ways, then focus on the way that
correctly classifies new cases fastest?
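
(A toy version of that proposal in Python. Both "partitionings" and all
names here are stand-ins of mine -- nearest neighbor for Euclidean clusters
and nearest class mean as a crude linear-ish partition -- with the winner
chosen by held-out accuracy and ties broken by classification speed:)

import math, time

def nn_fit(X, y):
    return list(zip(X, y))

def nn_predict(model, x):
    return min(model, key=lambda pc: math.dist(pc[0], x))[1]

def mean_fit(X, y):
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    # per-class coordinate-wise means
    return {c: [sum(col) / len(v) for col in zip(*v)] for c, v in groups.items()}

def mean_predict(model, x):
    return min(model, key=lambda c: math.dist(model[c], x))

def pick_best(candidates, X_tr, y_tr, X_te, y_te):
    """Score each (name, fit, predict) triple; best accuracy wins, then speed."""
    scored = []
    for name, fit, predict in candidates:
        model = fit(X_tr, y_tr)
        t0 = time.perf_counter()
        acc = sum(predict(model, x) == y for x, y in zip(X_te, y_te)) / len(y_te)
        scored.append((acc, -(time.perf_counter() - t0), name))
    return max(scored)[2]

X, y = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)], [1, 1, 2, 2]
print(pick_best([("nearest neighbor", nn_fit, nn_predict),
                 ("nearest mean", mean_fit, mean_predict)],
                X, y, [(0.1, 0.0), (5.0, 5.1)], [1, 2]))
# prints the winning method's name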