The tumultuous history of Dialog Systems

Posted on Jun 9, 2009 in artificial intelligence


The idea of a {en:Dialog System} is probably as old as the field of computer science itself. It is hard to know whether {en:Charles Babbage} already thought about it when he designed his {en:Difference Engine} in the 1820s and then his {en:Analytical Engine} in the 1830s, but it is clear that {en:Alan Turing} set the definition of the ultimate Dialog System when he described the {en:Turing Test} in his 1950 paper {en:Computing Machinery and Intelligence}.
Figure (from Wikipedia): The “standard interpretation” of the Turing Test, in which player C, the interrogator, is tasked with determining which of players A and B is a computer and which is a human. The interrogator may only use the responses to written questions to make the determination.

Turing predicted that machines would eventually be able to pass the test and that, by the year 2000, 30% of human judges would be fooled in a five-minute test. In 1990, futurist {en:Raymond Kurzweil} moved the date to 2020, and in 2005 he revised it to 2029.

This last prediction appears to me as uncertain as any of the prior ones, but many interesting Dialog Systems have been developed already and, thankfully, the market does not need the Turing Test to be passed to start adopting them.

The fundamental difference between Chatterbots and Dialog Management:

Before going through the history of Dialog Systems since 1950, it is important to note that two different approaches have been pursued over the past decades: simulating the appearance of a dialog (which I will call the Chatterbot approach), and modelling a real understanding of the dialog in order to generate the appropriate answers dynamically (which I will call the Dialog Management approach). The reason for this co-existence lies directly in the definition of the Turing Test, which considers only whether the answers appear valid and asks for no other proof of understanding.

In reality, the systems developed sometimes mix both approaches, but one almost always clearly predominates over the other (and, to some degree, one can claim that a Chatterbot has a Dialog Manager inside, even if it is usually based on simple pattern matching rules). Let me give a simple example to illustrate the difference:

The user asks: “Can you buy me a bottle of milk?”

In a Dialog Management approach, the computer could (and this is just an example) build a model of this kind: [type: Question; action: buy; interrogation: ability to perform action; object: bottle of milk]. It would then query this model against what would have to be a fairly complex knowledge base in order to answer, for instance, “no, because I have no money”, or possibly to ask a question in return: “It depends, can you give me some money?”.
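To make this concrete, here is a minimal sketch in Python of the reasoning step only. The model is written by hand as a plain dict that mirrors the bracketed example; producing it from the user's sentence is the hard natural language understanding part and is deliberately left out. The `knowledge` structure and the `answer` function are assumptions invented for this illustration, not a description of any real Dialog Manager.

```python
# The model from the example, written by hand as a plain dict.
request = {
    "type": "Question",
    "action": "buy",
    "interrogation": "ability to perform action",
    "object": "bottle of milk",
}

# Hypothetical knowledge base (an assumption for this sketch):
# what the system currently has, and what each action requires.
knowledge = {
    "resources": {"money": 0},
    "requires": {"buy": "money"},
}


def answer(request, knowledge):
    """Answer an 'ability to perform action' question from the knowledge base."""
    if request["interrogation"] != "ability to perform action":
        return "I can only answer questions about what I am able to do."
    needed = knowledge["requires"].get(request["action"])
    if needed is None:
        # The action is unknown to the knowledge base.
        return "I do not know how to " + request["action"] + "."
    if knowledge["resources"].get(needed, 0) > 0:
        return "Yes, I can " + request["action"] + " a " + request["object"] + "."
    # The required resource is missing, so the answer explains why.
    return "No, because I have no " + needed + "."


print(answer(request, knowledge))
```

The point of the sketch is that the answer changes when the knowledge changes: give the system some money and the same request gets a positive answer, without touching any rule about the wording of the question.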

In a Chatterbot approach, this internal processing doesn’t exist, and a predefined answer is selected through quite simple rules (“Can you*?” => list [“yes, of course”, “no, I can’t”, “no, I don’t want to”]). The answer is often selected at random from the list. It can appear to make sense, but the system has no real understanding of the question; it merely fakes the ability to hold a dialog.
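The rule above can be sketched in a few lines of Python. This is only an illustration of the pattern matching idea, not the code of any particular Chatterbot; the `RULES` table, the `FALLBACK` reply and the function name are hypothetical.

```python
import random
import re

# Hypothetical rule table: each pattern maps to a list of canned replies.
RULES = [
    (
        re.compile(r"^can you\b.*\?$", re.IGNORECASE),
        ["yes, of course", "no, I can't", "no, I don't want to"],
    ),
]

# Reply used when no pattern matches (also an assumption of this sketch).
FALLBACK = "Tell me more."


def chatterbot_reply(utterance):
    """Return a canned reply picked at random from the first matching rule."""
    text = utterance.strip()
    for pattern, replies in RULES:
        if pattern.match(text):
            # No model of the request is built: the words after
            # "can you" are never looked at.
            return random.choice(replies)
    return FALLBACK


print(chatterbot_reply("Can you buy me a bottle of milk?"))
```

Note that “Can you buy me a bottle of milk?” and “Can you fly to the moon?” draw from exactly the same three replies, which is the limitation discussed here.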

It is easy to understand that the Chatterbot approach simply doesn’t make any sense if the goal is to perform a real action and not just to provide an answer (how could a system perform any meaningful action by faking the understanding of the requests of the users?). In my view, it doesn’t make sense when it comes to textual conversation either, because the fundamental limitations of Chatterbots are too great to provide any sustainable value, even when only answering questions.

However, the “wow” effect of good demo cases is so great that, repeatedly over more than 40 years, many people have been duped by the false impression that a free dialog system could work efficiently with a Chatterbot approach. Billions of dollars have been spent in vain and, to my great despair, I predict that more will be spent in the future, until Dialog Management delivers results good enough to put an end to this shameful error in the evolution of computer science.

The history of Chatterbots:

Everything really started in the 1960s, when {en:Joseph Weizenbaum} developed {en:Eliza} at MIT, commonly referred to as the first Chatterbot. The most famous Eliza script was DOCTOR, which provided a “parody” of the responses of a non-directive psychotherapist in an initial psychiatric interview. The irony is that, although Weizenbaum found the system interesting because of the emotional reactions it provoked in people, he never really considered it a solid base for more intelligent systems. Moreover, the multitude of meaningless discussions and wrong conclusions that Eliza provoked pushed him to write the book Computer Power and Human Reason: From Judgment to Calculation, which argues that the misuse of artificial intelligence has the potential to devalue human life. But people preferred playing with Eliza to reading his book and, as in Pygmalion (from which, in another irony, he took the name Eliza), he had no control over his creation. If people wanted to believe in the potential of a Chatterbot, even without any scientific justification, they could not be stopped. More than 40 years later, as I write this post, many companies are still trying to convince people to buy systems based on pattern matching, even though many historical cases should remind us that they are doomed to fail miserably.

Six years later, in 1972, psychiatrist {en:Kenneth Colby} created {en:PARRY} at Stanford University. The basis was the same as for Eliza, but Parry simulated a paranoid schizophrenic instead of a psychotherapist. Even though Colby made a much more serious effort, the result was much the same. As you can imagine, connecting Eliza (the psychotherapist) with Parry (the schizophrenic) was inevitable, and the two met at the ICCC in 1972. As could have been guessed, connecting two stupid Chatterbots together didn’t result in anything great…

The field didn’t get more serious with {en:Racter} and The Policeman’s Beard Is Half Constructed, a book that its creators, {en:William Chamberlain} and {en:Thomas Etter}, claimed had been written entirely by Racter. The result was impressive, but the program was never released to the general public. One year later, when {en:Mindscape} released a Chatterbot version of Racter, it became clear to everyone that it was far less sophisticated than anything that could have written the fairly sophisticated prose of The Policeman’s Beard. The story is still not very clear today, but it seems obvious that Chamberlain and Etter created huge data files containing most of the text of the book, which Racter just “joined” together.

As early as 1978, Michael Mauldin was one of the first people to try to bring some reasoning into a Chatterbot when he created PET, a Chatterbot that was able to posit new information and became famous for the following dialog:

Subject: I like my friend
(later)
Subject: I like food.
PET:     I have heard that food is your friend.

Mauldin then founded Virtual Personalities (now Conversive) and created two famous Chatterbots: Sylvie (1994) and Julia (1997). The key contributions of Mauldin and Peter Plantec (the other founder) were, in addition to built-in animation and a synthetic voice, the ability to explore a virtual world (e.g. a web site) and to use the gathered information in a dialog. In this sense, Mauldin tried to open a way out of the pattern matching approach, realizing that the knowledge had to come from outside and couldn’t be a pre-formatted data file as in ELIZA. Another interesting aspect is that Mauldin is also the founder of Lycos, a search engine which was initially an extrapolation of Julia. Mauldin also coined the term {en:ChatterBot} in 1994 (as a synonym for Artificial Conversational Entity (ACE)).

In 1990, the {en:Loebner Prize} contest was created as an annual competition in artificial intelligence that awards prizes to the {en:Chatterbot} considered by the judges to be the most human-like, following the same format as the Turing Test. The Loebner Prize does not require the Dialog Systems to be based on pattern matching, so the day reasoning-based systems work, they will be able to prove their ability there. However, the contest does not reward the sophistication of the approach, only the result, which is judged by following casual chatting scripts and evaluating the relevance of the answers. As a consequence, systems that provide quick results are bound to be rewarded over more serious efforts that try to solve one small aspect at a time.

Another significant player in Chatterbot history is {en:Richard Wallace}, the creator of A.L.I.C.E. (Artificial Linguistic Internet Computer Entity). Wallace took a different approach, which was well rewarded: Alice won the Loebner Prize three times (2000, 2001 and 2004). He went back to pure pattern matching, but created an XML-based language called {en:AIML} (Artificial Intelligence Markup Language) for specifying the heuristic conversation rules. The advantage of this approach was that it was easy to create and share knowledge in an AIML file, as well as to load many AIML files together to get a “smarter” bot.

My opinion is that all these efforts on Chatterbots based on pattern matching are a monumental waste of time and money (as we will see below, on the order of billions of dollars). You don’t believe me? Try Eliza and compare it with the 2008 {en:Loebner Prize} winner, Elbot. Then tell me whether you really believe that these more than 40 years of effort were worth it. Were we really digging in the right place?

Chatterbots in the business world:

The average lifetime of commercially employed chatterbots is restricted to only 6 month.
Forrester Research

One of the most fascinating stories about how large companies believed in the potential of Chatterbots is the case of {en:Artificial Life}, a company founded in 1994 which was able to sell custom-made Chatterbot applications to companies like Credit Suisse First Boston, PricewaterhouseCoopers and UBS. The company still exists, and is actually doing quite well, but now in a totally different field (mobile gaming) because, after the Internet bubble burst, it lost more or less its entire market.

What is interesting is that the company was able to go public on the NASDAQ (ALIF) in 1998, and its stock reached a price of over $38 in February 2000, which corresponded to a market capitalization of over 1.8 billion USD. By June 2003, the stock was worth only $0.05, 760 times less than three years earlier, and the market capitalization had fallen below 2.5 million USD.

Artificial Life is not the only case but it is, to my knowledge, the biggest one ever. All this being said, I have to express my highest admiration for their executive team, and more particularly for their founder, Eberhard Schoneburg, who is still their CEO, not only for having created such amazing value in the Chatterbot domain (even if for a short time), but even more for succeeding in turning the company around and moving it to Hong Kong, which has the second-highest mobile penetration rate in the world.

Another famous example is the company Ask Jeeves (now Ask.com), which was able to convince Dell to adopt “Ask Dudley” for its online technical support in 1998. Ask Jeeves capitalized quite strongly on its natural language capabilities with this Chatterbot-based technology and was able to grow quite well until 2000, reaching $58 million in sales. From a high of $190 per share in 1999, the company’s stock began spiraling downward, falling to just $0.86 per share by 2002. Stuck with a technology that simply didn’t have what was necessary to perform properly, Ask Jeeves found a way out by purchasing a search engine company called Teoma Technologies. In 2005, the company announced plans to phase out Jeeves. On February 27, 2006, the character disappeared from Ask.com.

However, both of these cases showed pretty strong sales compared to the average case in the world of Chatterbots where, in many cases, only grant funding can justify the cost of implementation:

“Most of the German bots were build with funds from grants.”
A Trend from Germany: Library Chatbots in Digital Reference

Chatterbot technology is inherently stunted in its evolution and its effectiveness by its fundamental basis, pattern matching. The appearance of some form of Artificial Intelligence, completely faked as it is, may make for a sexy sale, but history has shown that this smoke-and-mirrors approach, which offers customers a new “feature” of very limited value, if any at all, leads down a short road to failure.

The history of Dialog Management:

Dialog Systems based on reasoning, in contrast to Chatterbots, try to do much less, but with much more control. As a result, their implementations are usually focused on a specific domain requiring specific actions. Even though Dialog Management has not been exposed to the market as widely as Chatterbots, it also has an interesting history.

The first reference I found to Dialog Management actually being used dates from 1986, in the IBM article “Dialog management for gestural interfaces”. Of course, many efforts had been made before, but not really on a separate module defined as a Dialog Manager (or at least, not to my knowledge).

{en: Carnegie Mellon University} (CMU) is probably one of the research centers that has been the most active in Dialog Management during the last 20 years, especially since the AGENDA dialog manager of Wu & Rudnicky in 1999. In 2003, Bohus & Rudnicky created RavenClaw, which is now the standard Dialog Manager of the Olympus Dialog System Framework, the CMU architecture for spoken dialog systems.

This architecture already shows very impressive results, not only within the scope of the Dialog Manager, but throughout the entire flow of the Dialog System (speech recognition, natural language processing, dialog management, output generation and text to speech). I am personally particularly impressed by the RoomLine application which, in my opinion, already shows great business potential that the market has not yet exploited at all.

Dialog Management in the Business world:

While Chatterbots found their place in casual textual chatting, Dialog Managers tended to penetrate the voice environment. But first, a standard was needed: AT&T, IBM, Lucent, and Motorola formed the {en:VoiceXML} Forum in March 1999 in order to develop a standard markup language for specifying voice dialogs. They published the VoiceXML 0.9 standard the same year and version 1.0 in 2000, followed by version 2.0 in 2003.

As a result, the field has been heavily pushed in the direction of speech recognition, and some very large companies were created, like Nuance, the worldwide leader. The company was founded in 1992 and now has a market capitalization of 3.5 billion USD. Even though most of its products are related to speech recognition and document management, its product line based on dialog management is a valuable and growing part of its revenue.

The future:

Nobody knows how long Chatterbots based on pattern matching will keep their place in the market, or how many more cases will be necessary for the market to finally understand the limitations of this approach.

On the other hand, the hype around Dialog Management is still to come, and it is to be hoped that it will reach the same heights as Chatterbots did. We can already see its early potential being realized in the impressive work done at CMU and it is, in my opinion, just a matter of time until these technologies hit the market properly.

The interesting thing about the hype driven by Chatterbots is that it reveals a clear desire, especially in our current market context, to have machines understand us. The key lies not just in spewing out a response, but in having computers “understand” customers and interpret that understanding based on learned “experience”. This, in my view, is where Dialog Management is taking us. With this approach we can see the basis for evolution, even organic evolution: learning and adapting to “experience” while enabling technology to deliver the “understanding”, guidance and results we are all looking for.
