Demo Site

Tuesday, July 20, 2010

Translating Wikipedia

Translation is key to the mission of making information useful to everyone. For example, Wikipedia is a phenomenal source of knowledge, especially for speakers of common languages such as English, German and French, where hundreds of thousands, or even millions, of articles are available. For many smaller languages, however, Wikipedia doesn’t yet have anywhere near the same amount of content.

To make Wikipedia more helpful to speakers of smaller languages, they’re working with volunteers, translators and Wikipedians across India, the Middle East and Africa to translate more than 16 million words for Wikipedia into Arabic, Gujarati, Hindi, Kannada, Swahili, Tamil and Telugu. They began these efforts in 2008, starting with translating Wikipedia articles into Hindi, a language spoken by tens of millions of Internet users. At that time the Hindi Wikipedia had only 3.4 million words across 21,000 articles, whereas the English Wikipedia had 1.3 billion words across 2.5 million articles.

They selected articles using the following criteria.

  • They used Google search data to determine the most popular English Wikipedia articles read in India.
  • Using Google Trends, Wikipedia found the articles that were consistently read over time, not just temporarily popular.
  • Finally, using Translator Toolkit, they translated articles that either did not exist or were placeholder articles, or “stubs,” in Hindi Wikipedia.
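
As a rough illustration only, the three criteria above could be combined into a simple filter. The sketch below is in Python. The `Article` fields, the view threshold and the status labels are all assumptions made for this example, and the sample data is invented. It does not describe the actual tooling or data used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Article:
    title: str
    monthly_views: List[int]  # hypothetical view counts from India, one per month
    hindi_status: str         # hypothetical labels: "missing", "stub" or "full"


def is_consistently_read(views: List[int], min_views: int) -> bool:
    """True when every month meets the threshold, so one spike does not qualify."""
    return bool(views) and min(views) >= min_views


def select_for_translation(
    articles: List[Article],
    min_views: int = 1000,          # assumed threshold, purely illustrative
    limit: Optional[int] = None,
) -> List[Article]:
    """Keep steadily read articles that are missing or stubs in Hindi."""
    candidates = [
        a for a in articles
        if is_consistently_read(a.monthly_views, min_views)
        and a.hindi_status in ("missing", "stub")
    ]
    # Most-read articles first, ranked by total views over the period.
    candidates.sort(key=lambda a: sum(a.monthly_views), reverse=True)
    return candidates[:limit]


if __name__ == "__main__":
    sample = [
        Article("Steady and missing", [5000, 4000, 4500], "missing"),
        Article("One-off spike", [90000, 10, 20], "missing"),
        Article("Already translated", [2000, 2000, 2000], "full"),
        Article("Steady stub", [1500, 1600, 1700], "stub"),
    ]
    for article in select_for_translation(sample):
        print(article.title)
```

Requiring every month to clear the threshold is one simple way to separate steady readership from a temporary spike. An average or median over a longer window would be a reasonable alternative.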

 

Figure: Number of non-stub Wikipedia articles by Internet users, normalized (English = 1)

Wikipedia also found that many Internet users have used their tools to translate more than 100 million words of Wikipedia content into languages worldwide. If you speak another language, we hope you’ll join Wikipedia in bringing this content to other languages and cultures with Translator Toolkit.
