Technology that translates, and unites
A cellphone may let a U.S. soldier 'speak' in Pashto or Dari. A browser can pick up on linguistic nuance.
Jud Guitteau
This is the second in a two-part series on making the Web more worldwide. The first article is available here.
The "digital divide" between those who can afford an Internet connection and those who can't is sprouting an evil twin: a "language divide."
The roots of the Internet lie in United States military and university research projects, conducted in English. That language is still preferred online for international commerce and science.
But the scene is shifting rapidly. Tens of millions of new Internet users do not speak or read English and seek content in their own languages. China alone has 400 million people online – more than the entire US population – and the vast majority read only Chinese.
"There's a Chinese Internet that we don't interact with very much. There's an Arabic Internet that we don't interact with very much," says Ethan Zuckerman, cofounder of Global Voices, a community of more than 300 bloggers and translators around the world who seek out voices that are not ordinarily heard in the mainstream news media.
Those who can only read in their native language are "missing this extraordinary opportunity to get much, much better at understanding what people around the world are thinking and saying and feeling," he says.
In May, the organization that regulates Internet domain names made history when it permitted three countries – Egypt, Saudi Arabia, and the United Arab Emirates – to display their Web addresses (those ending in .eg for Egypt, .sa for Saudi Arabia, .ae for the UAE) in their native Arabic characters rather than in Latin letters. Many other countries, including China, are expected to follow suit.
We're "way beyond" English as the language of the Internet, contends Mr. Zuckerman, whose Global Voices website is translated into 15 languages by more than 200 volunteers. "The Internet is for everybody these days."
The Web giant Google sees a commercial opportunity in making more of the Web readable, whether it's helping an English speaker read a Web page in Urdu or a Basque speaker read an English-only website. Google Translate (translate.google.com) now offers quick, computer-generated translations between 57 languages, including Urdu (spoken by 60 million to 90 million people in parts of India and Pakistan) and Basque (with more than 600,000 speakers in Spain and France).
Google's Web browser, Chrome, sports a tool bar that offers to translate any Web page into a user's own language.
"The last few years there's been a blossoming of languages on Google Translate," says company spokesman Nate Tyler. "Our goal is to make it as good as it can be. At this point it is not as good as a human translator. It's hard to know when it can ever be."
Traditionally, efforts to undertake computerized translation centered on devising rules that the computer would follow, such as "If you see this word or phrase, it means this word or phrase in the other language."
But all sorts of problems creep in, many of which demand customized solutions. If a headline says "Clemson Tigers beat Georgia Bulldogs," for example, does it mean one sports team "actually physically beat" the other? asks Prem Natarajan, vice president of speech and language technology at Raytheon BBN Technologies in Cambridge, Mass. (For that matter, are we talking about real "tigers" and "bulldogs" or human athletes?)
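To make the rule-based idea concrete, here is a minimal sketch in Python. It is a toy built for illustration, not how any system named in this article works: the rule tables, the Spanish target words, and the function name translate are all assumptions. It swaps words one at a time, so the headline comes out with the literal "struck" sense of "beat" until a customized sports rule is added, and it ignores word order and grammar entirely.

```python
# Toy rule-based translator. All names and table entries are hypothetical
# and chosen only for illustration (English -> Spanish).
RULES = {
    "tigers": "tigres",
    "bulldogs": "bulldogs",
    "beat": "golpearon",  # literal sense: "struck"
}

# A customized rule for sports headlines: "beat" in the sense of "defeated".
SPORTS_RULES = {"beat": "vencieron a"}


def translate(sentence, rules, extra_rules=None):
    """Replace each word that has a rule; leave unknown words unchanged."""
    table = dict(rules)
    if extra_rules:
        table.update(extra_rules)  # customized rules override general ones
    return " ".join(table.get(word.lower(), word) for word in sentence.split())


headline = "Clemson Tigers beat Georgia Bulldogs"
print(translate(headline, RULES))                # general rules only
print(translate(headline, RULES, SPORTS_RULES))  # with the sports rule
```

Every new ambiguity would need another hand-written rule of this kind, which is the customization burden Mr. Natarajan describes.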









