AI Breakthrough or the Mismeasure of Machine? (Science)

By Baldrson
Fri May 27th, 2005 at 02:22:40 PM EST

Software

If a computer program took the SAT verbal analogy test and scored as well as the average college-bound human, it would raise serious questions about the nature and measurement of intelligence.

Guess what?


Introduction

Artificial intelligence with human-level performance on SAT verbal analogy questions has been achieved (warning: PDF) through corpus-based machine learning of relational similarity. The milestone belongs to Peter D. Turney's Interactive Information Group at the Institute for Information Technology of the National Research Council Canada.

The timing of this achievement is highly ironic, since this is the first year the College Board has administered the SAT without the verbal analogy questions.

For the last hundred years many researchers have claimed that analogy tests are among the best predictors of future performance because of their strong correspondence with the g factor, or general intelligence, while others have claimed such testing is a mismeasure of man with severe political ramifications and questionable motivations.

Is this a true breakthrough in AI or is it just the mismeasure of machine?

The Achievement

Dr. Turney's group developed a technique called Latent Relational Analysis and used it to extract relational similarity from about a terabyte of natural language text. After reading a wide variety of documents, LRA achieved 56% on the 374 verbal analogy questions given in the 2002 SAT. The average college bound student score is 57%. These are statistically identical scores.
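
Turney's full LRA pipeline involves pattern induction, synonym expansion, and a singular value decomposition, none of which is reproduced here; but the core intuition -- that word pairs standing in similar relations tend to be joined by similar connecting phrases in text -- can be sketched in a few lines of Python. The tiny corpus and the scoring below are toy stand-ins for illustration, not the paper's method:

    # Toy sketch of relational similarity from corpus statistics.
    # NOT Turney's LRA: no thesaurus, no pattern induction, no SVD.
    # Each word pair is represented by counts of the short word
    # sequences that join its members in the corpus; analogies are
    # scored by cosine similarity between those count vectors.
    import math
    import re
    from collections import Counter

    def pattern_vector(tokens, a, b, max_gap=3):
        counts = Counter()
        for i, tok in enumerate(tokens):
            if tok not in (a, b):
                continue
            other = b if tok == a else a
            for j in range(i + 1, min(i + 2 + max_gap, len(tokens))):
                if tokens[j] == other:
                    counts[(tok == a, tuple(tokens[i + 1:j]))] += 1
        return counts

    def cosine(u, v):
        dot = sum(u[k] * v[k] for k in u if k in v)
        nu = math.sqrt(sum(x * x for x in u.values()))
        nv = math.sqrt(sum(x * x for x in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    corpus = ("the mason carved the stone . the mason shaped stone all day . "
              "the carpenter carved the wood . the carpenter shaped wood . "
              "the doctor treated the patient . the pilot flew the plane .")
    tokens = re.findall(r"[a-z]+", corpus)

    stem = pattern_vector(tokens, "mason", "stone")
    choices = [("doctor", "patient"), ("carpenter", "wood"), ("pilot", "plane")]
    scores = [cosine(stem, pattern_vector(tokens, a, b)) for a, b in choices]
    print(choices[max(range(len(scores)), key=scores.__getitem__)])
    # -> ('carpenter', 'wood')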

Everyone is familiar with attribute similarity -- when we say two objects are similar we usually mean they share many attributes such as employer, color, shape, cost, age, etc. An example of a statement about attribute similarity is "Mary has the same employer as Sally." Relational similarity -- when we say two pairs of objects have similar intra-pair relationships -- is only a little less familiar. An example of a statement about relational similarity is "John's relationship to Mary is Thor's relationship to Mjolnir." (Perhaps John was the unnamed 'employer' in the attributional statement.)

We can see two things from this example:

  1. Relational similarity underlies analogy.
  2. Relational similarity underlies metaphor.
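
To make the distinction concrete, here is a toy representation of the two kinds of similarity; the objects, attributes, and relation labels are invented for the example:

    # Attribute similarity compares the properties of two OBJECTS;
    # relational similarity compares the relations inside two PAIRS.
    attributes = {
        "Mary":  {"employer": "John", "hair": "red"},
        "Sally": {"employer": "John", "hair": "brown"},
    }

    relations = {
        ("John", "Mary"):    {"wields"},          # the joke above: John as Thor
        ("Thor", "Mjolnir"): {"wields", "owns"},
    }

    def attribute_similarity(x, y):
        """Attributes on which objects x and y agree, e.g. Mary ~ Sally."""
        return {k for k, v in attributes[x].items() if attributes[y].get(k) == v}

    def relational_similarity(p, q):
        """Relations shared by pairs p and q, e.g. John:Mary ~ Thor:Mjolnir."""
        return relations[p] & relations[q]

    print(attribute_similarity("Mary", "Sally"))                         # {'employer'}
    print(relational_similarity(("John", "Mary"), ("Thor", "Mjolnir")))  # {'wields'}
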
The study of relational similarity usually cites Dedre Gentner's Structure Mapping Theory, summarized as follows:
The basic idea of Gentner's structure-mapping theory is that an analogy is a mapping of knowledge from one domain (the base) into another (the target) which conveys that a system of relations which holds among the base objects also holds among the target objects. Thus an analogy is a way of noticing relational commonalties independently of the objects in which those relations are embedded.

But a mathematical theory of relational similarity was to have been the crowning achievement of the final volume of Principia Mathematica, published in 1913 -- something Bertrand Russell called "relation arithmetic".

Russell was adamant that without relation arithmetic people are prone to misunderstand the concept of structure and thereby fail in the empirical sciences:

I think relation-arithmetic important, not only as an interesting generalization, but because it supplies a symbolic technique required for dealing with structure. It has seemed to me that those who are not familiar with mathematical logic find great difficulty in understanding what is meant by 'structure', and, owing to this difficulty, are apt to go astray in attempting to understand the empirical world. For this reason, if for no other, I am sorry that the theory of relation-arithmetic has been largely unnoticed. Bertrand Russell "My Philosophical Development"
Unfortunately, Russell and Whitehead's formulation of relation arithmetic had a defect.

I've had a career-long interest in subsuming information systems in a relational paradigm. When contracted to work on Hewlett-Packard's E-Speak project, I was able to hire a science philosopher named Tom Etter, whose work I had heard of through Paul Allen's Interval Research. (I was allowed to do so only after threatening to resign when told I had to hire only H-1Bs from India for this work -- but that's another story.) I set Tom to the task of reformulating relation arithmetic for use in HP's E-Speak project. As a result of this work, which lasted a few months before the E-Speak project ran into trouble, he was able to produce a paper titled "Relation Arithmetic Revived" wherein he describes the new formulation:

Here is relation-arithmetic in a nutshell:

Relations A and B are called similar if A can be turned into B by a 1-1 replacement of the things to which A applies by the things to which B applies. Similarity in this sense is a generalization of the algebraic concept of isomorphism. If, for instance, we think of a group (as defined in group theory) as a three-term relation x = yz, then isomorphic groups are similar as relations. The relation-number of a relation is defined as that which it has in common with similar relations. Relation-arithmetic was to be the study of various operators on relation-numbers.

For reasons that will become clear below, we'll substitute the word shape for Russell's term relation-number. Thus, in our current language, the shape of a relation is what is invariant under similarity. Note that these three words have analogous meanings in geometry.
...
If we substitute congruence for similarity in the [Russell's - JAB] definition of relation-number, then operators like product and join can in fact be defined in an invariant way, and Russell's conception of relation-arithmetic makes sense. Since Russell's definition of these words is not in general usage, this substitution should not produce confusion, so let us hereby make it:

A relation-number is defined as an equivalence class of partial relations under congruence.

In other words, relational congruence provides relations in context that can be composed to yield new relations -- and relational similarity provides relational shapes whose importance is more abstract. Russell and Whitehead failed because they were trying to come up with a way of composing shapes out of context. (The context-dependent relation numbers of Etter's relation arithmetic are a more general form of "attribute similarity" described above.)
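
For small finite relations, Russell's notion of similarity -- and hence of relation-number, or "shape" -- can be checked directly by brute force. A minimal sketch, with invented example relations:

    # Two binary relations are "similar" (share a relation-number) if a
    # 1-1 replacement of the things one applies to by the things the
    # other applies to carries one exactly onto the other.
    from itertools import permutations

    def field(rel):
        return sorted({x for pair in rel for x in pair})

    def similar(A, B):
        fa, fb = field(A), field(B)
        if len(fa) != len(fb) or len(A) != len(B):
            return False
        return any({(m[x], m[y]) for x, y in A} == B
                   for m in (dict(zip(fa, p)) for p in permutations(fb)))

    A = {(1, 2), (2, 3), (3, 1)}              # a three-element cycle
    B = {("a", "b"), ("b", "c"), ("c", "a")}  # same shape, different things
    C = {("a", "b"), ("b", "c")}              # a chain: a different shape

    print(similar(A, B))  # True
    print(similar(A, C))  # False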

Given this understanding of Russell and Whitehead's work, Turney's group has, at the very least, made a major advance toward bringing practical natural language processing into greater consilience with a wide range of science and philosophy, and conversely, brought those ranges of science and philosophy closer to practice.

Controversy In 'g'

For the last century a controversy has raged over the significance of something cognitive psychologists call the "g factor" or "general intelligence". Indeed, Charles Spearman invented factor analysis to test for the existence of a hypothesized general factor underlying all of what we think of as intelligent behavior. Spearman used a variety of tests for intelligence and then looked for correlations between them; he invented factor analysis so he could find common factors underlying these correlations. Spearman was strongly influenced by Charles Darwin's cousin, Francis Galton. Galton was one of the earliest proponents of eugenics, and invented the statistical definition of correlation to study the degree of heritability of various phenotypes, including intelligence. Eugenics is a highly controversial field, so we should be unsurprised that the g factor, originating as it did in such a controversial area of research, has resulted in a long-standing dispute.
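
Spearman's procedure is easy to illustrate numerically: if one general factor underlies a battery of tests, their correlation matrix is approximately the outer product of the tests' loadings on that factor, and the loadings can be recovered from the leading eigenvector. The correlations below are invented to have exactly that one-factor structure:

    # One-factor sketch of Spearman's g. The correlation matrix R was
    # built from hypothetical loadings (0.9, 0.8, 0.7, 0.6), so a single
    # common factor should reproduce its off-diagonal entries.
    import numpy as np

    R = np.array([
        [1.00, 0.72, 0.63, 0.54],
        [0.72, 1.00, 0.56, 0.48],
        [0.63, 0.56, 1.00, 0.42],
        [0.54, 0.48, 0.42, 1.00],
    ])

    eigvals, eigvecs = np.linalg.eigh(R)   # eigenvalues in ascending order
    v = np.abs(eigvecs[:, -1])             # leading eigenvector
    loadings = np.sqrt(eigvals[-1]) * v    # each test's correlation with g

    print(np.round(loadings, 2))           # roughly 0.9, 0.8, 0.7, 0.6
    print(np.round(R - np.outer(loadings, loadings), 2))  # small off-diagonal residuals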

What is not in dispute is that analogy tests correlate most strongly with Spearman's g. What is in dispute is whether verbal analogy tests are culturally neutral enough to be a fair measure of g independent of education. In other words, no one disputes that a high score on verbal analogy tests is evidence of high g -- they merely dispute whether low scores on verbal analogy tests imply low g.

Most objections to the use of analogy tests to measure general aptitude claim they are reducible to little more than "rote memory" tasks. Quoting the Victoria, BC health site on autistic savants:

In all cases of savant syndrome, the skill is specific, limited and most often reliant on memory.

This sounds a lot like the objections raised by the opponents of the use of verbal analogy tests. Finding an autistic savant whose specialized skill was to do exceedingly well on verbal analogies would go a long way toward validating this view of verbal analogies, and hence the view that Turney's accomplishment is not the AI breakthrough it might appear to be.

On the other hand we must remember that a sufficiently compressed "rote memory" might be indistinguishable from intelligence. A genuine AI program, assuming it could exist, can itself be seen merely as a compressed representation of all the behavior patterns we consider "intelligent", and the Kolmogorov complexity of those behaviors might not be as great as we imagined. Taxonomies of intellectual capacity which place analogy and metaphor alongside critical thinking are quite possibly compatible with a sufficiently compressed description of a very large "rote memory".
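
The compression point can be made concrete with an ordinary compressor: Kolmogorov complexity itself is uncomputable, but compressed size gives a usable upper bound, and a "rote memory" with enough internal regularity shrinks to a small fraction of its raw size. A throwaway sketch:

    # Compressed size as a crude, computable stand-in for Kolmogorov
    # complexity: regular "rote" data shrinks dramatically, noise barely.
    import os
    import zlib

    rote = ("bird:nest beaver:dam bee:hive " * 1000).encode()
    noise = os.urandom(len(rote))

    for name, data in (("rote memory", rote), ("random noise", noise)):
        ratio = len(zlib.compress(data, 9)) / len(data)
        print(f"{name}: {len(data)} bytes -> compressed to {ratio:.1%}")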

The most widely-read attack on g theory to date has been Stephen Jay Gould's The Mismeasure of Man. Gould summarizes the objections to g theory:

"[...] the abstraction of intelligence as a single entity, its location within the brain, its quantification as one number for each individual, and the use of these numbers to rank people in a single series of worthiness, invariably to find that oppressed and disadvantaged groups--races, classes, or sexes--are innately inferior and deserve their status" (pp. 24-25).
Recently this ongoing controversy has boiled over into college admissions, with the College Board removing verbal analogies from the SATs as of 2005. Ironically, there is now an argument raging over whether this change biases the SATs against whites and for blacks and Hispanics, or whether it biases the SATs against blacks and Hispanics and for whites. We can certainly expect this debate to continue without resolution, since it seems rooted as much or more in ethnic politics as in science.

And of course none of this has stemmed the century-long pattern of ongoing research indicating that analogy tests are highly predictive of future performance, nor the disputations of the validity of such research.

It is precisely the sustained acrimoniousness of this debate that renders Turney's accomplishment so refreshing -- for regardless of your viewpoint, machines are not a voting bloc. Either this work shows itself to be a turning point in the progress of artificial intelligence, or it will merely lead to mundane benefits such as better search engine results. This is just the start of what will undoubtedly be a long series of measurements of artificial intelligence quality.

The question before us now is whether Latent Relational Analysis' human-level performance on verbal analogies truly represents an artificial intelligence breakthrough or whether it merely represents the mismeasure of machine.

AI Breakthrough or the Mismeasure of Machine? | 171 comments (156 topical, 15 editorial, 0 hidden)
Tic-tac-toe (none / 0) (#170)
by eodeod on Sun Jun 12th, 2005 at 12:42:27 PM EST
(eod@nospamremovethisandstuff.e0d.com) http://www.grabmyip.com

How is this any different from a program playing tic-tac-toe? There is a set of rules and conditions, and it follows them.

The Flip Side (none / 1) (#168)
by bobej on Mon Jun 6th, 2005 at 07:33:12 PM EST
(rejamison@yahoo.com) http://rejamison.blogspot.com

This seems relevant. This is a guy who purposely set out to get every question wrong on the SAT: link.

What sweet irony. AI makes great strides in taking human evaluation tests, human makes great strides in failing them utterly and completely.

standardized tests (none / 0) (#167)
by fourseven on Mon Jun 6th, 2005 at 02:22:24 PM EST

If anything, this reveals how inadequate the current tests are at evaluating human capability. Mechanistic, standardized, they are a manifestation of the shortcomings of the "educational" "system". No wonder a computer system is scoring well -- computer systems are used in preparing, collating and generating these tests. We should be asking whether existing tests are capable of measuring uniquely human potential in any useful way.

On the other hand, the accomplishments of Turney et al are an interesting step forward -- we're now a little closer to being able to use the word "like" as an element of the human-machine interface.

Overall, a great article. Thanks for the write-up.

tells you little (none / 1) (#166)
by jcarnelian on Sat Jun 4th, 2005 at 04:04:25 AM EST

The SAT is correlated with scholastic achievement, but it does not measure it directly. By analogy, you can tell a lot about a person's health status from their age, height, and waist, but producing a lump of wood with the same age, height, and waist doesn't make a healthy person. For a computer program to score well on the SAT is a decent achievement in information retrieval, but it has nothing to do with artificial intelligence.

AI vs. Information Analysis (none / 0) (#164)
by bobej on Fri Jun 3rd, 2005 at 11:21:33 AM EST
(rejamison@yahoo.com) http://rejamison.blogspot.com

This is information theory, not AI. I have to put this into the category of the AI fakers. AI is stimulus and response.

Regarding the cultural problems of testing for intelligence, obviously any test involving human languages will be biased. Period. Perhaps someone could work out a test that evolves its own unique language from first principles for each test (evening out the field), but invariably such a test would need to use a human language to instruct the test taker, thereby re-introducing bias.

So do we throw out standardized tests? Nope, they are still useful when we want to empirically measure a candidate's suitability for a task. For college and work, skewed results due to cultural influence might be appropriate (a candidate for an American company should have a good grasp of English).

Where this gets sticky is when social institutions like police or government use such tests to determine policy. Universities are borderline in my view, since it's easy to switch universities, not so easy to change your government.

No. (none / 0) (#147)
by CAIMLAS on Tue May 31st, 2005 at 01:09:56 PM EST
http://benjamin.hodgens.net

Definitively wrong. Using an arbitrary test to assess the supposed intelligence of a human is very wrong-headed to begin with. Add to that the fact that the questions on SAT exams are engineered to be mathematically optimal in many fashions. This doesn't work well - at all - when each test question merely has five possible answers; a random number generator would do "better than average", statistically speaking - particularly since most students don't finish, rush through, and end up answering off the cuff when they're not sure. You could theoretically do well on the SATs (IIRC - it might have been the ACTs) simply by answering 1 question from each section correctly and leaving the rest blank. There are a lot of ways you can cheat the system.

I've heard of many people that have simply filled out random blocks (or patterns) and have subsequently scored well above average on the SAT. I did so myself on one of the tests (I just wanted to get it over with, and I was already admitted to the school I wanted), and I got (IIRC) in the top 6%. So, a machine could theoretically do this without any problem. Randomly.


--

Socialism and communism better explained by a psychologist than a political theorist.

AI is far from human brain (2.00 / 9) (#146)
by Kitch on Tue May 31st, 2005 at 12:16:34 PM EST

You know, all this stuff is just models of how the human brain works, but nobody really knows what's going on inside it. And obviously we can't build such a model. Why? The answer is simple: because of the non-algorithmic basis of the brain. How can we linearize something that is non-linear without data loss?

Measurement of Intelligence is Useless (none / 0) (#145)
by RadiantMatrix on Tue May 31st, 2005 at 11:40:39 AM EST
(file13@theoffice.world) http://radiantmatrix.org/

The "exact measurement" of intelligence is a pipe dream, and it is likely to remain so.  It's like measuring how sexy someone is -- it depends far too much on a definition that requires subjectivity.

The question "what does it mean to be intelligent?", or more appropriately, "what does it mean to be more (or less) intelligent?" has no objective answer.  And, if we try to assign an objective answer, not only are we likely using circular reasoning but we are defining intelligence in a way that almost everyone will disagree with to some extent.

"Intelligent" is a subjective adjective, like "big", "small", or "sexy".  People who drive a Hummer think my VW Jetta is "small", but people who drive a Geo Metro think the same car (the Jetta) is "big".  Lots of people think Kate Winslet is sexy, I think she's boring and not sexy at all.  It's all subjective, and it has a lot to do with what the speaker's experiences are.  Intelligence is the same way: how many times have you heard (especially in arts) someone say "that person is either a genius or an idiot"?

I consider my wife and me both to be of above-average intelligence.  I can code well in several languages; my wife has a bit of trouble with any kind of programming - on first meeting, many geeks would think she wasn't that bright.  However, she is a highly talented classical musician as well as a budding composer; those in her field think she is very smart and I'm a bit of a twit (I can barely even play an instrument).

Who is right?  Are we both intelligent?  If so, why do we have so few common mental abilities?  If not, which one of us is "really" intelligent?

The answer is, not surprisingly, that we are both intelligent in our own way.  Any attempt to ultimately define intelligence as anything more than an abstract concept is a waste of time.  Not only that, but such a definition will ultimately lead to the repression of talented individuals -- someone we deem to be "unintelligent" will be denied opportunities, and their potential may never be realized.  The arts -- and the sciences -- are full of stories about discoveries that almost never were.  The lesson we should learn from these stories is that everyone should be given opportunity, because we have no way to know what someone is capable of.
--
I'm not going out with a "meh". I plan to live, dammit. [ZorbaTHut]

Syntax & Inane Turing Worship (none / 0) (#128)
by twestgard on Sun May 29th, 2005 at 10:24:26 PM EST
(tom@ilmechliens.com) http://www.ilmechliens.com

I can already see that there will be a period of "adolescence" in these machines, where they'll be used to try to "predict" and "deduce" things in real life. Police and prosecutors will get ahold of them, and their output will be used to determine who gets arrested and whose houses get searched. But they won't be really all that good. People will get arrested for using figurative speech like jokes or sarcasm that the computer didn't understand properly. But I guess that's not really any more random than skin color or family income, so maybe it's not worse than the current system.

But that brings me to my other point - I'm mystified by this Turing Worship. This is unscientific navel-gazing. Turing was a smart man with an amazing career coupled to a compelling and ultimately tragic personal story. Fascinating historical figure. But if the Limeys hadn't driven him to suicide, I guarantee you he wouldn't be spending all this time wondering if a particular set of tools were "intelligent" by this or that standard. Nor would he want to be seen as someone who inspired that kind of pointlessness. Turing was a very practical man. He'd be coming up with interesting ways to use these tools to help people. Taking an expired version of the SAT is about the least useful application one could devise. So the premise of this story is about the same as "Caveman bangs stick against rock until one breaks." Waste of energy, time, sticks & rocks. Caveman should have been making a stone axe. What should we be making? Not this.

Thomas Westgard
Illinois Mechanics Liens

The elephant in the living room (none / 1) (#106)
by Fen on Sun May 29th, 2005 at 01:36:57 AM EST

As usual, the elephant is not mentioned or considered. That being English. It is an ambiguous, deeply flawed language -- like every other natural language. There is an alternative -- Lojban. It can be parsed like C or Java.
----Transcend humanity.
I'm not impressed. (none / 0) (#86)
by SIGNOR SPAGHETTI on Sat May 28th, 2005 at 03:50:16 PM EST

What's next, artificial ONE? Preposterous!

--
Stop dreaming and finish your spaghetti.

great article (none / 0) (#84)
by transient0 on Sat May 28th, 2005 at 02:55:55 PM EST
(duff at (homepage domain)) http://frankduff.com

but statistical nitpick.

"Statistically identical scores" is a really sloppy term. i know it's too late to edit the article, but i would have liked to see some reference to the standard deviation, even if in a footnote.
---------
lysergically yours

I blame it on Descartes (none / 0) (#79)
by dollyknot on Sat May 28th, 2005 at 12:43:14 PM EST
http://dollyknot.com

Plato said "One should separate reason from passion"; because of that, I suspect he would take issue with Descartes's 'Cogito Ergo Sum'. We do not *think* alive - we *feel* alive (therein lies the lonely mystery of qualia :)

Instead of looking for artificial intelligence, we should be looking for artificial consciousness. We think we are capable of measuring intelligence when we do not seem to have a clear idea of what intelligence is.

Personally I think intelligence relates to *how* whereas wisdom relates to *why*.

Human beings are very much goal orientated: to eat, defecate, procreate, assimilate, and so on and so forth. These desires motivate our actions and shape how we interact with others and the environment. The fact that some people appear to be more efficient at these processes than others (I mean efficient in terms of taking more than one gives (sort of an economic power-to-weight ratio :)) will not, I suspect, lead us to true AI or AC :) I fear that it will lead to a more efficient enslavement of the ordinary people by the multinationals, more efficient killing machines with which to wipe out those who do not conform to the hegemony of consumerised brainwashed capitalism, motivated by profit and profit alone.

BTW Baldrson, very nice article, perhaps you will find an analog for AI, but beware you do not end up with Spock AKA Leonard Nimoy.


They call it an elephant's trunk, whereas it is in fact an elephant's nose - a nose by any other name would smell as sweetly.

A damn fine article (2.50 / 2) (#70)
by Scrymarch on Sat May 28th, 2005 at 05:46:47 AM EST
(scrymarched aht yahoo dt com)

Usually this comment would be redundant and bad form, but I've been very critical of Baldrson's bias and lack of disclosure in the past.  For what it's worth, this article is both fascinating and scrupulously evenhanded.  Thanks for writing it.

It's just the vibe.
Simple Methodology to Solve Analogies (2.50 / 2) (#69)
by asolipsist on Sat May 28th, 2005 at 04:43:30 AM EST
(k5 [at] wendlink [dot] com)

"Dr. Turney's group developed a technique called Latent Relational Analysis and used it to extract relational similarity from about a terabyte of natural language text. After reading a wide variety of documents, LRA achieved 56% on the 374 verbal analogy questions given in the 2002 SAT. The average college bound student score is 57%. These are statistically identical scores."
I'm not sure how impressive a feat this is. I developed a simple methodology in 15 minutes that scored 100% on the example SAT analogy questions listed at the following URL:
http://www.freesat1prep.com/sat/verbal/analogies/analogy_questions.htm
This methodology could be implemented using Google, a digital dictionary, and probably 20 lines of Perl (10 allowing for obfuscation).
1.) BIRD : NEST ::
(A) dog : doghouse
(B) squirrel : tree
(C) beaver : dam
(D) cat : litter box
(E) book : library


Methodology is as follows:
Search for the first and second question terms in Google using wildcards, like:
"bird * * nest"
Find the first highlighted phrase that's in a sentence and includes at least a noun or a verb. In this case Google finds "bird built her nest". Search for matching cases using the answer choice terms.
In this case:
"dog built her doghouse"
"squirrel built her tree"
"beaver built her dam"
"cat built her litter box"
"book built her library"
In this case Google matched 0 for each term. Replace the pronoun 'her' with other pronouns and try the search again.
After pronouns replaced the tally was:
a) 0
b) 0
c) 7
d) 0
e) 0
Since we found a positive result, choose the one with the most hits: answer C.
If still 0 results, remove any adjectives and try again. If still 0 results, change any articles and try again.
I tried this methodology for the next two questions.
2.) DALMATIAN : DOG ::
(A) oriole : bird
(B) horse : pony
(C) shark : great white
(D) ant : insect
(E) stock : savings

First highlighted phrase in a sentence:
"Dalmatian is not an ideal dog."
No results on the first pass; try again with the adjective removed. A scores 1 hit; choose A.
Question three search yields:
"Doctor from outside the hospital"
On first pass C gets 6 hits, everything else 0. Choose C.
Score: 100%.

I'm sure this method will fail on some questions, but as simple as it is, it might beat Dr. Turney's technique, "Latent Relational Analysis". I'm sure with some tweaking it could beat the hell out of it; Dr. Turney is only getting 56%. What does that say about this type of problem - is it really a 'hard' AI problem? What does it say if a 20-line Perl script can beat analysis you've decided to name with capital letters?
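
The control flow of that methodology, sketched in Python rather than the promised Perl. count_hits() below is a hypothetical stand-in for a phrase-search hit count (the Google wildcard step that finds the joining phrase is not modeled); a canned table mimics the tally above:

    # Sketch of the search-hit-tally heuristic described above.
    PRONOUNS = ["her", "his", "its", "their"]

    def count_hits(phrase):
        """Hypothetical stand-in for a web-search hit count. A real
        version would query a search engine for the exact phrase."""
        return {'"beaver built her dam"': 7}.get(phrase, 0)

    def solve_analogy(choices, joiner_words):
        """choices: list of (a, b) pairs; joiner_words: e.g. ["built", "her"],
        read off the first highlighted phrase for the stem pair."""
        for pronoun in PRONOUNS:
            joiner = " ".join(pronoun if w in PRONOUNS else w for w in joiner_words)
            hits = [count_hits(f'"{a} {joiner} {b}"') for a, b in choices]
            if max(hits) > 0:
                return max(range(len(hits)), key=hits.__getitem__)
        return None  # further passes: drop adjectives, vary articles, ...

    choices = [("dog", "doghouse"), ("squirrel", "tree"), ("beaver", "dam"),
               ("cat", "litter box"), ("book", "library")]
    print(solve_analogy(choices, ["built", "her"]))  # -> 2, i.e. answer C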

Thanks for a nice article that got me thinking... (none / 1) (#63)
by Oldest European on Fri May 27th, 2005 at 09:12:06 PM EST
http://www.kerokero.de/

After reading the article I couldn't get one question out of my mind: What is artificial intelligence and why do we call it that way?

And while I continued thinking about it, it seemed more and more obvious to me that artificial intelligence is just a euphemism for not intelligent.

All we have achieved in the field of artificial intelligence so far is basically the creation of more or less sophisticated tools.

Is a car intelligent because it has ABS?
No it isn't.

Is a computer program that can perfectly translate a text from one language into another intelligent?
No it isn't.

So what is lacking?

Self-awareness - and coming with that a will to survive.

As long as those 'artificial intelligences' don't have self-awareness, one shouldn't call them intelligent, but instead just highly sophisticated tools.

And another thing about intelligence: intelligence is self-adapting to new situations and environments!

The electronic parts of my car will never learn how to play chess, they will never learn how to write a poem or how to be a good football player.

And someone or something either is or isn't intelligent.

And if a machine gets to the point where one can call it intelligent, I will call it truly intelligent not artificially intelligent.

And in that case a good term might be computer based intelligence or maybe silicon based intelligence - in contrast to carbon based intelligence.

I think there is a good chance that we will really see truly intelligent machines one day.

And this might also prove that we humans are not half as intelligent as we think we are, because if we create such machines, we will create predators that might just be responsible for our future extinction.

Or if we are a bit more lucky, we will simply become their slaves - wouldn't that be ironic?

There are two distinct issues here. (2.00 / 2) (#60)
by jd on Fri May 27th, 2005 at 07:25:18 PM EST

First, if it is possible for a non-intelligent machine to score well in a test, the test is flawed. It should be impossible, using only predicate logic, to pass an exam. One possible split would be to specifically design tests where 25% was on logic and deductive reasoning, 25% was on lateral thinking and interpretive reasoning, 25% on conceptualizing and modelling, and 25% on semantics.

The idea here is that pure rules-based engines should score an average of 25% and a maximum of 50%. So should individuals who do but don't think. 75% should be achievable only by the application of ALL forms of intelligence, and 100% only by the mastery of ALL forms of intelligence.

This would be as true in the "hard sciences" as in the arts. If you can't apply all of your brain to the problem, then you cannot have all of the skills and therefore should not have all of the marks.

This leads on to the second issue, of "intelligent" machines. There is no reason, so far proposed, why machines could not become intelligent. However, the Turing Test is simply too vague to be a good measure of intelligence, and is useless when you want anything more than a yes/no answer or a study of non-humanlike intelligence.

The breakdown I proposed analyzes different forms of intelligence, not one form alone, and no one form can be used to compensate for the lack of another. This would allow you to test a machine's intelligence by studying its ability to reason on different levels and allow for a study of non-humanlike intelligence as it is not a relativistic system.

You would still need something akin to a Turing Test, this would not replace it, but rather it would extend it to allow you to get a measure of intelligence rather than a mere binary result.

IQ tests, as they stand, are useless for measuring intelligence, as different schools of thought use different types of test and different scales. There is no way to use the result to get a useful, understandable, measure.

The tests also tend to be very culturally-oriented, so different cultures will score differently on the same test. Americans will generally do badly on UK tests, and the British generally do badly on American tests. Who is smarter? Logically, neither, but a single test result would not enable you to prove that.

Breakthrough is relative (3.00 / 2) (#59)
by schrotie on Fri May 27th, 2005 at 07:07:49 PM EST
(schrotie at uni dash bielefeld dot de)

If one doesn't believe in the immortal divine soul or some such tale, one has to acknowledge the possibility that intelligence is not a monopoly granted us by eating from the wrong (?) tree, but is only maybe a monopoly and even that only by chance. AI will very likely be constructed some decade, century or millennium down the road - if we don't eradicate ourselves before. Currently there is no obvious reason why it should be in principle impossible to build a computer that has enough computing power to emulate dozens or millions of people in real time, synchronously.

Such a computer would seem vastly more intelligent than humans in terms of IQ, because IQ tests are usually conducted under a lean time limit. But is intelligence only processing speed? Maybe, I don't know. An interesting fact: people rating high on IQ tests have less active brains for a given task than lower-ranking people - it seems to be at least partly about optimizing certain pathways. Anyway, a specialized computer beat a human. Again. Again it used brute force (terabytes is a hell of a lot).

I don't understand the technology used for the task. Indeed I don't understand much of the story at all - even though I work in AI research, but then I'm from the bottom-up faction and don't know expert systems at all. I can't see how this helps with computer vision or any other sensor processing (it might help with sensor fusion and high-level analysis though). The technology might be a breakthrough like Hopfield networks (backpropagation), which were a huge milestone in AI research. Or like Bayesian networks, which are also rather significant (and not just for filtering your spam). But neither made machines "intelligent" all of a sudden. They are tools for specific classes of problems, and this new tech will likely be one also. Thus we have pattern matching and complex dynamic attractors (Hopfield), plausibility estimation (Bayes) and what (Etter)? Correlation of causal topologies? Tile by tile, machines learn aspects of human intelligence. Remember that a "computer" used to be a human who calculated, not so long ago (it was another age though). Humans suck at calculating when compared to even humble computers. Same for logic and increasingly many other tasks. It's kind of nice that the process is so slow. Copernicus, Darwin, Freud: they all took our illusions by formulating one theory and throwing it in our faces. The AI crowd do it step by step. Thanks for that.

And don't expect interesting conversation from your toaster any time soon.

Democracy is the recurrent suspicion that more than half of the people are right more than half of the time.
E. B. White


Latent Relational Analysis (none / 1) (#58)
by God of Lemmings on Fri May 27th, 2005 at 06:35:25 PM EST
(devanhoe at gmail)

The question before us now is whether Latent Relational Analysis' human-level performance on verbal analogies truly represents an artificial intelligence breakthrough or whether it merely represents the mismeasure of machine.
How about neither? What we have here is a way for computers to understand what we mean on a much higher level. In my opinion, this represents just another misdirected effort toward trying to create a viable intelligence; however, it does have its uses in agents, expert systems, and even better parsers.

analogy questions (2.00 / 2) (#55)
by mpalczew on Fri May 27th, 2005 at 06:23:20 PM EST
(mike at palczewski dot net) http://students.washington.edu/mpalczew

Those analogy questions are not a measure of intelligence.  I can honestly say that I and most people I know would easily get each of those questions right if we had memorized what every word in the English language actually meant.  That would take a huge amount of effort and would be quite useless except for getting good scores on the SAT, and for trying to sound smart.  The way they pull words out of their ass right now, some of the questions may as well be in a foreign language.  It's a test of memorization.
-- Death to all Fanatics!
The true horror of it all (2.80 / 10) (#53)
by LilDebbie on Fri May 27th, 2005 at 05:07:59 PM EST
(astropulp at google mail) http://astropulp.blogspot.com

The SATs are not a mismeasure of intelligence (however loosely defined) and AI is fast approaching humanlike capabilities. All the handwringing about rote memory and critical thinking results from a shared delusion and fear among humanity: we're beginning to discover that we're not as cool as we thought we were (or pretended to be).

Newsflash people: you have no soul. You have no "higher self" either, for all you secular humanists out there. Your thoughts, emotions, and behavior are simply a complex data set, some of it hard-wired, some of it acquired, some of it reasoned out (and yes, machines can do this - play with Prolog some time if you want to see for yourself). The illusion of consciousness is merely another behavior that gives structure to the vast amount of knowledge stored in your cranium and genes.

Inspiration does not come from the divine. It comes from the right data accumulating in one organism (be it an individual, a research group, or society at large) so that a new conclusion can be drawn from that set. In the early years of man when we didn't know much, this happened frequently at the individual level. Men like Archimedes and Plato made mind-blowing discoveries. People started to get full of themselves and decided that they were part of the Divine, that we are the very image of God.

We're not. At best, all of creation is a reflection of God, but why should we care? As the data set becomes more complete and complex, discoveries by individuals become less and less frequent and all progress is driven by groups, eventually giving way to all progress being done by machine. We can only pray that the machines, like us, allow the previous iteration (ours being animals) some level of freedom and/or existence.

Under that evil, cynical, dream-crushing exterior, LilDebbie's got the heart of the Dalai Lama.
- Russell Dovey -
wow! great article! (1.50 / 4) (#52)
by CodeWright on Fri May 27th, 2005 at 04:44:09 PM EST

this is VERY relevant to my work.

--
A: Because it destroys the flow of conversation.
Q: Why is top posting dumb? --clover_kicker

silly wabbit (none / 1) (#51)
by modmans2ndcoming on Fri May 27th, 2005 at 04:31:38 PM EST

The statement about relations from A->B being 1-1 is not a generalization of isomorphism. Isomorphism is already the abstract concept of equivalence.

;-)

Hmm (3.00 / 3) (#49)
by SiMac on Fri May 27th, 2005 at 03:46:16 PM EST
(simon at my website domain name) http://simonster.com/

I work for the Center for History and New Media at George Mason University. Recently, one of my colleagues created a little tool to take multiple choice tests using the information contained on Google. (See our H-BOT tool, which has a slightly different purpose, but is based on the same premise.) On the national proficiency tests in U.S. History, we got over 80%. These were fact-based questions, not analogies, but it's still a bit disturbing that school funding is being measured this way.

the true mismeasure of machine (2.90 / 10) (#48)
by Polverone on Fri May 27th, 2005 at 02:56:45 PM EST
(gfxlist@yahoo.com) http://www.sciencemadness.org

Artificial intelligence is everything that (some) humans can do well that no machines or other animals can do well. As soon as you make a machine perform reasonably well on an AI task, it's discovered that the task has no bearing on intelligence.

Let's look at some of the things that it turns out are not at all intelligent:

-SAT verbal analogies
-Playing checkers, chess, or backgammon
-Optical character recognition
-Symbolic equation manipulation

This has some interesting consequences. It turns out that Go players must be more intelligent than chess players, because there are no really good Go machines. Likewise, polo players are even more intelligent than Go players, because the best polo-playing machines are worse than the best Go-playing machines. Polo and Go players of modest ability are both far more intelligent than the man who is merely good at symbolic algebra. If we really wanted to admit the creme de la creme to institutions of higher education, we would ignore machine-accessible pseudointelligence measures like verbal analogies and logical reasoning, and instead organize a giant polo championship.
--
It's not a just, good idea; it's the law.

Even if their AI is at the Rainman level... (2.50 / 2) (#45)
by Russell Dovey on Fri May 27th, 2005 at 02:25:46 PM EST
(antipaganda@gmail.com) http://www.flickr.com/photos/80291310@N00/4079985

...that's not too bad. I'd expect the early AIs to seem like brain-damaged humans for quite a while. After that, they'll seem like weird, emotionally immature people. After that, they'll emulate humanity so well we won't know just how fucked-up crazy they really are.

"Blessed are the cracked for they let in the light." - Spike Milligan

Excellent (2.50 / 2) (#43)
by vadim on Fri May 27th, 2005 at 01:46:28 PM EST

I impatiently await the arrival of cute androids who say "Chii!" sometime in the next 20 years ;-)

More seriously, I'm not sure how much practical stuff will come out of this. IMHO, intelligence is tightly bound to the environment, and pretty much impossible to separate from it.

So, I think that we'll only get real intelligence when we start with a human-like robot and try to engineer intelligence into it; we probably will not get very far if we continue to obsess over one particular detail and ignore everything else.
--
<@chani> I *cannot* remember names. but I did memorize 214 digits of pi once.

it's possible both are true (3.00 / 6) (#40)
by Delirium on Fri May 27th, 2005 at 11:29:34 AM EST
(delirium-k5@hackish.org)

It's possible that human performance on analogy tests is well correlated with general intelligence, and yet that this performance by a machine does not indicate a generally intelligent machine.

In particular, it may be the case that humans who are able to build up the sort of mental structures needed to excel at analogy tests are in general able to solve many other reasoning problems as well. With a computer program specifically constructed for the analogy test, that may not be the case: it may be good at taking SATs and not good at the other things that, in humans, the SAT is a pretty good predictor for.

Read my diary.

Terabyte of text for a machine... (3.00 / 3) (#27)
by dimaq on Fri May 27th, 2005 at 08:13:30 AM EST
(nobody@dev.null.org)

how would you estimate the amount of text an average SAT participant has read in their entire life?

a terabyte is more like a million books... I certainly never read that much literature in my life, and I wonder if I have ever read that much text of any kind in my life.

next difference - what sort of memory (size, compared to a human) does the machine in question have, and what sort of efficiency is achieved in the storing algorithm?

i.e. what is the probability that an average person remembers something (word, phrase, semantics) they've read only once? and what is it for a machine? my bet is a machine could be programmed to remember a lot better, giving it an unfair advantage.

Bloom's Taxonomy (3.00 / 3) (#25)
by minerboy on Fri May 27th, 2005 at 06:54:48 AM EST

One common way to rate the cognitive difficulty of problems is Bloom's Taxonomy, which categorizes tasks as knowledge, comprehension, application, analysis, synthesis, and evaluation. Analogies were thought to be fairly high on this scale, but I suspect that this is a mistake, and what the success of the AI on analogies shows is that analogies have been overrated with respect to Bloom. I suspect that a lot of tasks thought to be "higher order thinking skills" can simply be broken down into knowledge and algorithms.



For me (2.66 / 3) (#24)
by whazat on Fri May 27th, 2005 at 06:53:00 AM EST

Some of the necessary but not sufficient things an AI has to do are:

Alter its code to improve the following:


  • Performance on a task

  • Increase robustness of important code

  • Regulate energy usage and heat production so that important code is more likely to be able to perform its task

So I think it is a mismeasure.

I see a distinction here. (3.00 / 4) (#21)
by A Bore on Fri May 27th, 2005 at 04:56:42 AM EST

On one hand you have the adaptable human brain which, as a result of its general intelligence, is able to do these verbal analogy tests. On the other you have a computer expressly programmed to crack a particular problem - whether it be chess, verbal analogy, or random, normal-sounding conversation - which is not solving these problems as an indicator of general intelligence, but rather through number crunching and analysis of a single problem.

It kind of misses the point. Any test aims to measure intelligence by correlation. If you had an idiot savant vegetable and scored him on a variety of tests, you would perhaps have mathematics-based ones showing him as highly intelligent, social situations showing him retarded, etc. etc. Until a computer is developed which scores across the board on a variety of different tests, even ones it has not been specifically designed for - that would be an actual measurement of AI.

Specific IQ tests are called the "next challenge for AI" because they are the most difficult to number-crunch a solution to. Well, someone has managed here. It isn't the breakthrough you present it as, any more than Deep Blue beating Kasparov was a breakthrough. It just showed that human ingenuity can eventually force computers to closely model human responses to some of the most complicated tests we can devise.
Che Guevara wears a T-shirt of me.
why do geeks get to define intelligence? (2.33 / 9) (#20)
by SIGNOR SPAGHETTI on Fri May 27th, 2005 at 04:18:02 AM EST

Intelligence is by strange coincidence those cognitive skills possessed by people in society that have or serve power. That would be the technocrats currently, because the measure of civilization has become elaborate weapons, designer erections, ersatz sugars, ipods and the market mechanisms and computer functionaries in which to flog all this glittering bullshit. I for one have yet to meet a geek that was smarter than a cocktail waitress. Anyway, ONE.

--
Stop dreaming and finish your spaghetti.

Some thoughts (3.00 / 4) (#18)
by strlen on Fri May 27th, 2005 at 02:42:47 AM EST
(strlen)

Jeff Hawkins goes into that subject somewhat in his book On Intelligence, where he argues that the proper way to think about intelligence is as the ability for predictive reasoning, i.e. seeing the pattern, rather than mere memorization (he suggests that as the solution to the Chinese Room problem [Google it, for those who don't know it -- it's very widely known]).

This is where my idea about the SATs comes in: you can cram for the SATs by studying the vocab that goes along with them, and you can cram, through examples, for the verbal analogy questions.

Now if the verbal analogy questions were randomly generated and more complex, they would be a good test of intelligence. Also, from what I've heard, they've replaced the verbal analogy section with critical thinking essays. Now, given the standard ways essays/papers are graded in college, the grading is likely going to be lax, but at least in theory writing a paper should be a more comprehensive test of analytic reasoning than simple analogy questions.

What is also ignored is that there are several kinds of intelligence. One needs mathematical reasoning to succeed in a computer science program, while one needs high verbal ability and overall good analytic (not necessarily purely mathematical) reasoning skills to succeed in a history program.
Given that people who take the SATs haven't even decided what they will be majoring in, there is no way the SAT can measure their capacity to succeed in any specific field of study (unlike the LSAT/GRE/MCAT); thus it may as well be pointless. Of course high school GPA is probably far more pointless, but that's a different story. (Note to geeks reading this: you can gain entry into much more competitive universities and avoid the SATs altogether by transferring from a community college -- at least in California (and the UC system).)

Jeff Hawkins also discusses in his book that one of the problems behind AI is that it uses behaviorism, which is an outmoded psychological model. E.g., a machine that can ace the SATs may be behaviorally equivalent to an SAT-acing student when it comes to the SATs, but can such a machine be used for other intelligence-related tasks?

In short, I do tend to agree with the idea of the existence of the g factor, and of genetics playing a role in IQ. I do believe that proper analytic tests can be designed, and so on. What I don't believe is that A) IQ is fixed throughout an individual's lifespan (even excluding obvious physiological factors such as dementia), or B) that there's one specific kind of intelligence (i.e., there are different sorts of aptitudes for different tasks). Both sides of the issue are deeply polarized: we have the Gould types (whom you seem to be referencing in the story title) on one side, fearing that any acknowledgement of genetics in intelligence would automatically mean justifying eugenics and racism, and on the other the types who take the idea of genetics having any influence to mean genetic determinism (as well as assuming that racial groups are going to be more or less homogeneous when it comes to intelligence -- ignoring both cultural (e.g., was the child trained with puzzles at an early age?) and physiological (e.g., diet, child rearing) factors).

Excuse the somewhat rambling tone of the comment; I am in the middle of procrastinating over exam study and merely wanted to express some [somewhat random] thoughts I've developed over time on this issue.

--
FATMOUSE + YOU = FATMOUSE

Maybe this just highlights a flaw in the tests (3.00 / 3) (#17)
by StephenThompson on Fri May 27th, 2005 at 02:34:59 AM EST

The PDF uses this example as an analogy: mason:stone :: carpenter:wood. This relationship is trivial because it can be analyzed purely syntactically. The semantics of the words, such as what wood is, are irrelevant; just the syntax of the dictionary definition is enough to see the relationship. Thus, if 56% of the test is only simple syntactic substitution, we shouldn't be surprised at the results. Computers are excellent at syntax; humans aren't so much. How well would the system do when no amount of syntactic analysis can find the right answer, but any bubba [of the right age group, hehe] would get the answer easily: Mounds:Almond Joy :: Loni Anderson:? a) Sybil b) Richard Simmons c) Burt Reynolds d) Mr. Goodwrench

AI is not about SATs (2.00 / 2) (#16)
by monkeymind on Fri May 27th, 2005 at 01:03:56 AM EST
(monkeymind02@fastmail.fm)

When the AI can go into a bar after the test and chat someone up, then you will have reached your goal, my son.

I believe in Karma. That means I can do bad things to people and assume they deserve it.

All this tells us is... (2.00 / 3) (#9)
by BJH on Thu May 26th, 2005 at 10:05:40 PM EST

...that the SAT is not a useful measure of intelligence.

Fascinating article (3.00 / 4) (#6)
by esrever on Thu May 26th, 2005 at 09:17:55 PM EST
(esrever_otua AT $Homepage) http://pythonhacker.is-a-geek.net

I think that one of the main reasons for the acrimony in the debate over "What is AI" is the term "AI" itself.  Peter F. Hamilton neatly recognises and defuses this debate in his latest book Pandora's Star by referring to his intelligent programs as "SI" or Sentient Intelligence (and they are, actually, sentient), and the merely highly sophisticated programs as "RI" or Restricted Intelligence (these are merely anthropomorphic).  This clearly delineates and removes the ambiguity around the word "Intelligence" which is at the root of most of the disagreement over the term "AI".

People associate Intelligence (rightly or wrongly) with sentience, and therefore denounce "AI" as a pipe-dream.  Meanwhile, many AI researchers and pundits are not much better, automatically conflating the rise of intelligent programs with a concomitant rise of sentience in said programs.  Which leads us to such ludicrously wrong-headed nonsense as "Should Intelligent Machines Have Rights" (don't have a link, but this made big news on Wired a year or so ago, IIRC).


