Boston Globe's Boston.com

A search engine that uses linguistic analysis to cut to the chase

By Bob Weinstein, Globe Correspondent

Kathleen Dahlgren is on a mission: It's not to save the world, but to "InQuizitize" it.

She's talking about InQuizit, the intelligent software she and her partner Ed Stabler designed to find exact information rather than the mountain of useless results most search engine software dishes up. InQuizit is the principal product of Santa Monica, Calif.-based InQuizit Technologies, the company Dahlgren heads.

Dahlgren is a computer scientist with a Ph.D. in linguistics, and Stabler is a professor of computational linguistics at the University of California at Los Angeles. Dahlgren estimates that she and her partner, plus a small army of linguists and programmers, have invested 85 person-years in perfecting the software. The partners started developing the software 14 years ago when they worked as senior scientists at IBM. So far, they've gone through more than $10 million in funding from investors, government, and nonprofit organizations.

Developing the software was a tedious and exhausting process because the partners had to mimic every aspect of human linguistic reasoning. But it was worth the effort because the software has enough data -- including vocabulary, grammar and world knowledge -- to interpret Time magazine.

InQuizitize is the word Dahlgren coined to describe the process her software uses to analyze and index information so you can ask it questions in plain English and get precise answers back.

Radical? You bet, because the program is the closest thing to artificial intelligence, or AI, on the market. InQuizit is not the only AI software available, but it's the only one that has introduced semantics in the processing of information, according to Dave Waltz, president of the American Association for Artificial Intelligence and an adjunct professor of computer science at Brandeis University.

Says Victoria Fromkin, professor of linguistics at UCLA and former president of the Linguistic Society of America, "InQuizit is unusual because it actually analyzes language according to linguistic principles. From a linguistic standpoint, it is the most sophisticated program of its kind. The other programs try engineering tricks, but they don't draw on what we know about the structure of human language."

How does it work? Dahlgren isn't about to over-explain the software and give away proprietary secrets -- not that we'd understand what she's saying. But she will shower you with easy-to-understand examples of how InQuizit spits back precise information in record time.

Ask a typical search engine what the stock market's high and low were for a specific day and you're asking for trouble. Be prepared for everything relating to the word "market," which could mean virtually any kind of market, and "stock," which could be livestock, inventory or shares of stock. "InQuizit interprets 'high' with high value and 'stock' means certificate of ownership as opposed to a kind of cow," she says. "And, a 'market' means the place where it sells those things, not a 'market' where you go to buy food or other products."

No search engine can reason this way, according to Dahlgren. "The software out there uses 25-year-old technology that only matches words or patterns," she says. "It does not use linguistics, meaning, or context. Because language is ambiguous and words have multiple meanings, most pattern-matching retrieval is highly inaccurate, and usually results in vast amounts of irrelevant data."
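The word-matching approach Dahlgren criticizes can be sketched in a few lines of Python. This is a toy illustration only: the three documents, the query and the function name `keyword_search` are invented here, and nothing in it reflects InQuizit's proprietary method or any particular search engine's code.

```python
# Toy corpus; the documents and their ids are invented for illustration.
DOCUMENTS = {
    "finance": "the stock market closed at a record high today",
    "farming": "the farmer sold his stock at the cattle market for a high price",
    "grocery": "the market has a high shelf to stock with canned food",
}


def keyword_search(query, documents):
    """Return ids of documents that contain every query word, ignoring meaning."""
    terms = set(query.lower().split())
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if terms <= set(text.lower().split())
    )


# All three documents match, although only one is about share prices.
matches = keyword_search("stock market high", DOCUMENTS)
print(matches)
```

Because the matcher sees only strings, the cattle auction and the grocery shelf rank alongside the share-price report.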

The problem is that most software cannot figure out what words mean, says Dahlgren. Take the question, "How much do monitors (computer monitors) cost?"

"The best standard software can do is number crunching," says Dahlgren. "It says, 'Here is a word and here is another word and they belong in this context cluster when they are near each other.'"

There is another family of standard software that throws back more information than you ever wanted, asserts Dahlgren. "They either gather up words that are similar in context or they use a technique called 'statistical enhancement' which throws away all the grammatical words, which actually tell you what words mean in context," she says. "In the 'monitor' example, you wind up getting everything related to monitor, which could mean 'advise' as a noun or 'observe' as a verb, not to mention synonyms which include 'counselor,' 'director' or 'informant,' to name a few."
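To see why discarding grammatical words loses information, consider a minimal Python sketch. The stop-word list and the two sample questions are assumptions made for this illustration; the article does not describe how any specific product implements the step.

```python
# Hypothetical stop-word list; real systems use much longer ones.
STOP_WORDS = {"a", "the", "to", "it", "does", "how", "much"}


def content_words(text):
    """Lowercase the text, strip punctuation and drop the grammatical words."""
    words = [word.strip("?.,!").lower() for word in text.split()]
    return [word for word in words if word and word not in STOP_WORDS]


# "a monitor" is a noun here; "to monitor" is a verb here.
noun_use = content_words("How much does a monitor cost?")
verb_use = content_words("How much does it cost to monitor?")

# Once "a" and "to" are gone, the two questions leave the same bag of words.
print(sorted(noun_use), sorted(verb_use))
```

The words "a" and "to" were the only clues that separated the screen from the act of observing, and they are exactly what the filter removes.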

Like it or not, search engine veterans must deal with information overload, or "thesaurus enhancing" as Dahlgren calls it, which yields one percent precision.
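Precision is the share of retrieved documents that are actually relevant. The Python sketch below shows how widening a query with synonyms can drag that share down. The synonyms are the ones Dahlgren lists for "monitor"; the five documents, the relevance judgment and the function names are invented for illustration, and the figures it produces are not a measurement of any real system.

```python
# Synonyms taken from the article's "monitor" example.
THESAURUS = {"monitor": ["counselor", "director", "informant"]}

# Invented documents; only d1 answers a question about computer monitors.
DOCUMENTS = {
    "d1": "price list for each computer monitor in stock",
    "d2": "the camp counselor organized a hike",
    "d3": "the film director won an award",
    "d4": "police paid the informant",
    "d5": "weather forecast for the weekend",
}
RELEVANT = {"d1"}  # assumed relevance judgment


def expand(terms):
    """Add every thesaurus synonym of each term to the query."""
    expanded = set(terms)
    for term in terms:
        expanded.update(THESAURUS.get(term, []))
    return expanded


def retrieve(terms):
    """Return ids of documents sharing at least one word with the query."""
    return {
        doc_id for doc_id, text in DOCUMENTS.items() if terms & set(text.split())
    }


def precision(retrieved):
    """Fraction of retrieved documents that are relevant."""
    return len(retrieved & RELEVANT) / len(retrieved) if retrieved else 0.0


plain = retrieve({"monitor"})
widened = retrieve(expand({"monitor"}))
print(precision(plain), precision(widened))
```

In this toy corpus every synonym pulls in one irrelevant document, so the single useful hit is diluted as the query grows.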

InQuizit is being used by corporate and government search engines to locate data, documentation and inventory. Next month, InQuizit will be hosting a sprawling Web site for Bible.com, an opportunity to put InQuizit through its paces. Ask it a question such as "Whose skin was cantankerous?" and it will return, "Job had boils." Or, ask it a tricky one like, "What was Job's job?" and InQuizit will tell you Job was the servant of the Lord. Most search engines would go nuts with the last question, according to Dahlgren, because they couldn't see the difference between Job the person and 'job,' something you do for a living. Or ask it, "What did Jesus cross?" and it will retrieve the answer, "The river." "Other search engines will give you a million references to the word 'cross,'" Dahlgren adds.

Waltz calls InQuizit a pleasant surprise. "They [Dahlgren and staff] have given the computer enough knowledge to understand natural language," he says. "Other systems that are trying to use statistical means to understand language are not going to be as successful."

Waltz explains that InQuizit is a significant improvement over what is currently being used because it understands words more deeply than current systems, which view documents as word salads.

InQuizit may be a step above current search engine products, but Dahlgren says it's far from perfect. You can't feed it complex chess moves and expect it to return perfect plays, for example. InQuizit can answer questions and retrieve information in context, but it cannot reason. "If you were to ask it, 'How many people died in World War II?', it wouldn't retrieve all the people who died -- the Jews, Slavs, Poles, Catholics, etc. -- and give you a total," says Dahlgren. "But, it will provide a comprehensive list of every group that lost people."

Reasoning enters the picture when the software can figure out what words mean in context. But we're getting closer every day, according to Rob McGovern, chief executive officer of Careerbuilder.com of Reston, Va., a network of career sites on the Web. "The big issue for PC users is navigation," he says. "This generation of technology assumes a vast working knowledge of computers. In order for computers to go from 40 percent of households to 100 percent, computers will have to be intuitive and be able to retrieve virtually any kind of information. You'll be able to say, 'Find me a restaurant close to the movie theater' and it will know that you mean close to your home."

Attesting to the value of Dahlgren's efforts, McGovern says the "holy grail of computing is natural language navigation. It won't be long before computers can deliver enough power to users to process all the information needed to make sophisticated distinctions."

Dahlgren insists InQuizit will be playing a vital role in the next generation of AI software. Plans are in the works to take the company public, raising enough money to invest in more computing equipment and design its own Internet search engine for the World Wide Web.

That's when things heat up. If all goes according to plan, Dahlgren is confident InQuizit could put some of the major search engines out of business -- that is, until a competitor leaps into the market with software that can achieve more dazzling linguistic feats.



 



© Copyright 1999 Boston Globe Electronic Publishing, Inc.