AI tries to learn English

This is the 11th installment in a series on the stupidity of artificial intelligence. Read Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9 and Part 10.

From the very beginning, language has been central to AI’s effort to imitate the human mind.

As I mentioned in the last installment, “mind” per se is a nebulous and slippery concept. Without some sort of structure, artificial intelligence has nothing to work with, nothing to bite into.

Here language comes to the rescue. Languages involve sounds, letters or other symbols, collections of words and phrases, grammatical rules and constructions. Plenty of structure, of the sort a digital computer can play with.

Thus, one might expect that human language would provide an ideal starting point for AI to take on the human mind in general. Logic and mathematical reasoning, which also have lots of structure, would provide a second flank. I’ll talk about that in a later article.

Artificial intelligence has made astonishing progress in dealing with human languages, also referred to as “natural languages”, in contrast to artificial languages such as those invented for programming computers.

Today people are already conversing with computers. AI chatbots with friendly voices are waiting to help us with banking, travel booking, e-commerce, complaints to the government and medical emergencies. They even give us psychiatric care.

At the same time, there are more or less usable apps, combining voice recognition, voice synthesis and automatic translation, that can serve as simultaneous interpreters, allowing people with different languages to carry on a conversation.

It would be a big mistake, however, to conclude from these examples that AI has succeeded in simulating the language processes that actually take place in the human mind. Quite the contrary. What AI has been able to accomplish up to now could best be characterized as a useful form of cheating. I shall explain what I mean.

In fact, “meaning” is the key to the whole problem of artificial intelligence. Meaning is where the dog sleeps.

Take the problem of machine translation (MT) between two natural languages. This is one of the first problems taken up by the developers of artificial intelligence – a clear-cut, practical task that poses many of the fundamental issues still confronting AI today.

AI has not yet succeeded in simulating the language processes that actually take place in the human mind. Image: Facebook

In this context, it is fascinating to read the papers delivered at the historic “Conference on Mechanical Translation” held at the Massachusetts Institute of Technology in June 1952.

The list of the participants – 18 in all – includes individuals who subsequently played major roles in the development of computer technology, AI and modern linguistics. Defense-related institutions were well represented, given the obvious relevance of the subject. (Think of Russian-English translation.)

By 1952 people were already experimenting with rudimentary computer translation programs, and a great deal of creative thinking was going on, particularly on the part of language experts. The linguists clearly recognized the fundamental nature of the challenge posed by machine translation. There were many interesting ideas.

Most participants saw 100% automated translation as an extremely distant prospect, if possible at all. In the meantime, obtaining usable and reliable results would require human post-editing of the machine output, and possibly also human pre-editing – for example, to rephrase complex grammatical constructions that the computer could not digest.

Note: reliability and precision were important requirements for the applications MT sponsors had in mind at the time. Similarly today. However helpful Google Translate and similar systems might be in informal contexts, it would be folly to employ them in diplomatic negotiations, for example, or anywhere else where precision is important and errors or misunderstandings could have serious consequences.

Particularly interesting and insightful, I find, were the contributions by Yehoshua Bar-Hillel, chief organizer of the MIT conference and one of the most influential figures in the early period of machine translation.

Yehoshua Bar-Hillel in a file photo. Photo: Wikipedia

Bar-Hillel later made major contributions to linguistics and the philosophy of language. His writings exemplify how wrestling with the problem of machine translation stimulated creative thinking in a variety of directions.

From the very beginning, Bar-Hillel emphasized “brain-machine partnership” as the only practical approach. He argued against unrealistic expectations that computers could fully replace humans in the translation of natural languages.

Later, amid growing hype about machine translation and artificial intelligence generally, his warnings became more emphatic.

In 1960 Bar-Hillel authored a controversial article entitled “A Demonstration of the Nonfeasibility of Fully Automatic High Quality Translation.” Among other things the article prophesies the disaster referred to as “AI Winter” – the drastic cutback of AI funding, brought on by exaggerated promises concerning machine translation, which began six years later.

In the opening paragraph, he speaks of “the reluctance of many MT workers to recognize that the idea of inventing a method for fully automatic high-quality translation (FAHQT) is just a dream which will not come true in the foreseeable future. By not realizing the practical futility of this aim … they have misled themselves and the agencies which sponsored their research into not being satisfied with a partly automated translation system, whose principles are well understood today, and instead to wait for the real thing which was believed, and made to believe, to be just around the corner.”

Bar-Hillel’s “Demonstration” is so relevant to the subject of this series that I cannot resist quoting briefly from it here.

To make his point he poses the problem of how a machine translation system would deal with the phrase “The box was in the pen,” as it appears in the passage: “Little John was looking for his toy box. Finally he found it. The box was in the pen. John was very happy.”

The immediate problem, of course, is that the English word “pen” has more than one meaning. It can signify a utensil used for writing – but it can also refer to an enclosure, such as one in which children can play: a “playpen.” To make a correct translation into a second language, the computer obviously has to determine which of the two meanings is intended.

A writing pen versus a playpen. Image: Wikipedia

Bar-Hillel writes: “I now claim that no existing or imaginable program will enable an electronic computer to determine that the word pen, in the given sentence within the given context, has the second of the above meanings, whereas every reader with a sufficient knowledge of English will do this ‘automatically’….

“Why is it that a machine, with a memory capacity sufficient to deal with a whole paragraph at a time, and a syntactico-semantic program that goes, if necessary, beyond the boundaries of single sentences up to a whole paragraph … is still powerless to determine the meaning of pen in our sample sentence within the given paragraph? The explanation is extremely simple, and it is nothing short of amazing that, to my knowledge, this point has never been made before, in the context of MT.”

To paraphrase Bar-Hillel’s argument: An intelligent human reader will naturally understand that the word “pen” in the above passage refers to an enclosure – a playpen for children to play in – rather than a pen to write with. Why? The first sentence identifies “box” as a box holding toys. Since the reader is aware that such a box is much larger than a writing pen, the first interpretation is automatically excluded, without the reader even having to think about it consciously.
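The size argument can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the table `TYPICAL_SIZE_CM`, its rough numbers, and the helper `senses_that_can_contain` are hypothetical and do not come from Bar-Hillel’s article or from any real translation system. The sketch only shows that the choice is easy once the relevant fact about sizes has been supplied by hand from outside the text.

```python
# Hypothetical world knowledge: rough typical sizes in centimetres.
# The numbers are illustrative assumptions, not measured data.
TYPICAL_SIZE_CM = {
    "toy box": 60,
    "writing pen": 15,
    "playpen": 120,
}

# The two candidate senses of the word "pen" discussed in the text.
PEN_SENSES = ["writing pen", "playpen"]


def senses_that_can_contain(contained, candidate_senses, sizes=TYPICAL_SIZE_CM):
    """Return the candidate senses that are larger than the contained object."""
    return [sense for sense in candidate_senses if sizes[sense] > sizes[contained]]


# "The box was in the pen": keep only the senses big enough to hold a toy box.
print(senses_that_can_contain("toy box", PEN_SENSES))  # ['playpen']
```

Note that nothing in the four-sentence passage about Little John states any of these sizes; the program works only because a human typed them in beforehand.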

There are evidently countless analogous cases, where a correct choice between alternative meanings of a word or phrase depends on knowledge that is not contained in the text, including the neighboring sentences and even paragraphs.

Bar-Hillel remarks: “Whenever I offered this argument to one of my colleagues working on MT, their first reaction was: ‘But why not envisage a system which will put this knowledge at the disposal of the translation machine?’”

In reply, noting that “the idea of a machine with encyclopedic knowledge has popped up also on other occasions,” he explained:

“The number of facts we human beings know is, in a certain very pregnant sense, infinite. Knowing, for instance, that at a certain moment there are exactly eight chairs in a certain room, we also know that there are more than five chairs, less than 9, 10, 11, 12, and so on ad infinitum, chairs in that room.

“We know all these additional facts by inferences which we are able to perform, at least in this particular case, instantaneously, and it is clear that they are not, in any serious sense, stored in our memory.”
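The distinction between storing facts and inferring them can be made concrete with a small sketch in Python. The function name `derived_facts` and the wording of the generated statements are assumptions chosen for illustration. Only one fact is stored, the number eight; the further statements are produced on demand, and the sequence never ends, so no finite table could hold all of them.

```python
from itertools import count, islice


def derived_facts(chairs):
    """Lazily yield true statements of the form 'fewer than n chairs'."""
    for n in count(chairs + 1):  # chairs + 1, chairs + 2, ... without end
        yield f"fewer than {n} chairs"


stored_fact = 8  # the single fact actually held in memory

# Take only the first three of the unboundedly many derivable statements.
first_three = list(islice(derived_facts(stored_fact), 3))
print(first_three)
# ['fewer than 9 chairs', 'fewer than 10 chairs', 'fewer than 11 chairs']

# "More than five chairs" is likewise checked by inference, not looked up.
print(stored_fact > 5)  # True
```

The sketch covers only this one arithmetic case, where the rule of inference is trivial to state in advance.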

This is only the beginning of the story, however. Stay tuned for the next installment of this series.

Jonathan Tennenbaum received his PhD in mathematics from the University of California in 1973 at age 22. Also a physicist, linguist and pianist, he’s a former editor of FUSION magazine. He lives in Berlin and travels frequently to Asia and elsewhere, consulting on economics, science and technology.
