Microsoft Research and Wikipedia Team Up to Enhance Multilingual Content

WikiBhasha tool will help simplify and speed up the process of creating multilingual content in Wikipedias.

Microsoft Research today announced the launch of the beta version of WikiBhasha, a multilingual content creation tool for Wikipedia. The WikiBhasha tool enables contributors to Wikipedia to find content from other Wikipedia articles, translate the content into other languages, and then either compose new articles or enhance existing articles in multilingual Wikipedias. The WikiBhasha beta is available as an open source MediaWiki extension, under the Apache License 2.0 at http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/WikiBhasha, and as a user gadget in Wikipedia. The tool is also available as an installable bookmarklet at http://www.wikibhasha.org, which is hosted on the Windows Azure platform from Microsoft Corp. The name WikiBhasha derives from the well-known term “wiki,” denoting collaboration, and “bhasha,” which means “language” in Hindi and Sanskrit.

WikiBhasha will support content creation in more than 30 languages. The beta version of WikiBhasha will enable easy content creation in non-English Wikipedias by leveraging the large volume of English Wikipedia content as the source of information. Initially, the Wikimedia Foundation and Microsoft Research will also work closely with the Wikipedia user communities focusing on content creation in Arabic, German, Hindi, Japanese, Portuguese and Spanish.

“We’re always happy to see work on improving multilingual collaboration between wikis,” said Danese Cooper, CTO of the Wikimedia Foundation. “Microsoft Research is doing some interesting work with WikiBhasha, and we’re very pleased that it chose to share its client code in open source as well.”

By making it easier for the Wikipedia community to create multilingual content, Wikipedia and Microsoft Research hope to inspire a new wave of multilingual content creation.

“The WikiBhasha beta holds the promise of enabling easy creation of content in multiple languages, and also of generating a large body of parallel language data for researchers to work on to further machine translation technology,” said P. Anandan, managing director, Microsoft Research India. “Creating quality content in multiple languages can be greatly improved and accelerated with the active participation of the Wikipedia communities.”

The WikiBhasha beta is a browser-based tool that works on Wikipedia sites. It features a simple, intuitive user interface (UI) layer that stays on the target-language Wikipedia for the entire content creation process. This UI layer integrates content discovery with linguistic and collaborative services, keeping the user focused on content creation in the target Wikipedia. A simple three-step process guides the user through discovering and sourcing content from English Wikipedia articles, composing a local-language Wikipedia article, and publishing it in the target Wikipedia. Although a typical session enhances an existing target-language Wikipedia article, new articles may also be created by following a similar process. The WikiBhasha beta currently works on Windows Internet Explorer (7.0 and 8.0) on Windows XP, Windows Vista and Windows 7, and on Firefox (3.5 or above) on Linux Fedora (11 and 12), Windows XP, Windows Vista and Windows 7.

WikiBhasha, which is supported by Microsoft’s Machine Translation system and Microsoft’s Collaborative Translations Framework, was conceptualized by the Multilingual Systems Group at Microsoft Research India. The Multilingual Systems Group explores multilingual and cross-language technologies that work seamlessly across languages, and creates resources to aid computational linguistics research. The tool was developed in collaboration with the Natural Language Processing Group at Microsoft Research Redmond.

A video tutorial that introduces users to the features of the WikiBhasha beta can be found at http://www.wikibhasha.org.

 
