It's a good sign when the creator of a piece of software ends up using it. On a recent trip to Japan, Franz Och, who doesn't speak Japanese, was able to decipher restaurant menus and even read local news -- using his mobile phone, which provided him with the translations within seconds.
Och spent the last six years developing Google Translate, a translation program, at Google headquarters in Mountain View, California. "And so far I've never really used it myself," he admits. But then the 38-year-old research scientist has a change of heart and adds, "I am very happy with what we have achieved."
Och, a German citizen, is the behind-the-scenes star of a segment of the software industry that has taken on a challenge no less daunting than tearing down global language barriers. In his job at Google, Och wrestles with multi-clause sentences, the subjunctive and auxiliary verbs, and he does so in a way that is an affront to any linguist. His machine translation program is based on sheer computing power, not linguistic know-how.
The system already commands 52 languages, and the databases for 296 other languages are in development. They include such exotic tongues as Sardinian, West Frisian and Zulu.
Google Translate translates entire websites, theses and even love letters in next to no time, often delivering surprisingly usable results. For Google, the benefits are obvious: With such a useful application, which also happens to be free, even more Web surfers can be lured to the company's website.
"Machine translation has reached a new quality level," Och enthuses. "It is much more heavily used all over the place; the software now has an impact in the real world."
"What Google is doing here is very impressive," says Alon Lavie of Carnegie Mellon University in Pittsburgh. The computer scientist sees the entire industry in motion. The market for translation software is growing rapidly, says Lavie. "These are extremely exciting times."
The age of machine translation has begun. Programs like Google Translate are pointing the way to a future in which anyone will be able to speak in foreign tongues at the press of a button. The ultimate goal of scientists who develop translation programs is an electronic version of the Babel fish, the fictitious species British author Douglas Adams concocted in his science fiction classic "The Hitchhiker's Guide to the Galaxy." In the book, he describes a leech-like creature, which simultaneously translates any language when it is inserted into a person's ear. Arthur Dent, the novel's protagonist, can even understand the crude poetry of the Vogons.
Developers haven't come that far yet in real life. However, there are already iPhone apps like "Jibbigo," which translates spoken English into Spanish at lightning speed. Alex Waibel, a computer scientist at the University of Karlsruhe in southwestern Germany, and also at Carnegie Mellon University, created the software. Waibel already uses computers to simultaneously translate many of his lectures, and he has also tested the technology with parliamentary debates.
A translator straight out of a computer lab was long seen as an audacious dream. How was the machine to know that in English, for example, the expression "breaking records" doesn't usually mean destroying vinyl LPs? Or that the German sentence "wir treffen uns im Schloss" means "we meet in the castle," and not "we meet in the lock" (the German word "Schloss" means both castle and lock)?
For a long time, computer scientists tried to cram the necessary world knowledge into the programs using a complex system of rules. But even with straightforward texts, the software often produced complete nonsense. According to Swamy Viswanathan of the US company Language Weaver, the attempt to force the English language, with all of its nuances, into a set of rules is a "nightmare." "Words often have several meanings, and the number of combinations is endless," says Viswanathan.
This prompted the experts at Language Weaver to pursue a different concept early on. They fed countless texts from the Internet that already existed in multiple languages into their systems. The specialists' reasoning was that almost every sentence and every phrase has already been translated many times over, and that pure statistics would suffice to decipher a linguistic construct.
For example, to figure out the German sentence "wir treffen uns im Schloss," the program searches its database for texts in which the words "treffen" ("meet") and "Schloss" ("castle" or "lock") appear in close proximity to one another. Then it goes through the translations of these texts, where it frequently finds the word "castle." As a result, the computer spits out the phrase "we meet in the castle" and not "we meet in the lock."
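The counting idea described above can be illustrated with a few lines of Python. This is only a toy sketch and not the actual Language Weaver or Google system: the five sentence pairs are invented for illustration, the function name `choose_translation` is hypothetical, and "close proximity" is simplified to mean "in the same sentence."

```python
from collections import Counter

# Hypothetical toy parallel corpus: (German sentence, English translation).
# A real system would draw on millions of translated texts from the Web.
PARALLEL_CORPUS = [
    ("wir treffen uns im schloss", "we meet in the castle"),
    ("sie treffen den könig im schloss", "they meet the king in the castle"),
    ("wir treffen die gäste vor dem schloss", "we meet the guests in front of the castle"),
    ("das schloss an der tür ist kaputt", "the lock on the door is broken"),
    ("er öffnet das schloss mit dem schlüssel", "he opens the lock with the key"),
]


def choose_translation(ambiguous, context, candidates, corpus):
    """Pick the candidate translation seen most often in sentence pairs
    whose German side contains both the ambiguous word and the context word."""
    counts = Counter()
    for source, target in corpus:
        source_words = source.split()
        # Assumption: "close proximity" means "appears in the same sentence".
        if ambiguous in source_words and context in source_words:
            target_words = target.split()
            for candidate in candidates:
                if candidate in target_words:
                    counts[candidate] += 1
    if not counts:
        return None, counts  # no evidence in the corpus either way
    return counts.most_common(1)[0][0], counts


best, counts = choose_translation(
    "schloss", "treffen", ["castle", "lock"], PARALLEL_CORPUS
)
print(best, dict(counts))  # castle {'castle': 3}
```

Swap the context word "treffen" for "tür" ("door") and the same function returns "lock," because in this toy corpus "Schloss" near "Tür" was only ever translated that way. The program knows nothing about castles or doors; it simply counts what human translators did before.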