On 30 June, the Wall Street Journal announced twelve finalists for the 2010 Asian Innovation Awards. The awards are Asia’s premier honour for individuals and companies who come up with innovative ideas, methods or technologies. Engkoo was selected out of hundreds of new technologies reviewed by the publication. The competition is diverse and wide ranging, and includes innovations in the areas of biomedical engineering, environmental efficiency and software, spanning all of Asia.
Engkoo is a new kind of language assistance technology helping Chinese people master English as a native speaker might. It unifies human translation mined from the web, machine translation and a language learning experience in one easy-to-use search and explore interface. By continuously discovering and analysing high quality translation knowledge on the internet, Engkoo can be used to close the expanding gap between English and Chinese.
The Engkoo technology is particularly relevant in China today. Scenarios range from assisting information workers who increasingly need to communicate globally, to addressing the challenges of finding fluent sources of English learning materials in schools, to helping solve the problem of “Chinglish” (poor English-Chinese translations) found on many public displays and materials.
Notable features of Engkoo include:
1) A state-of-the-art Chinese-English statistical machine translation engine which works behind the scenes when a sentence or paragraph of text is submitted for translation. The corresponding Chinese and English words are highlighted as the mouse moves over them, and each can be individually selected to get a deeper understanding through an automatic seamless dictionary lookup experience.
2) An innovative method for comparing two similar words. This is relevant because the differences in use between near-identical words are often quite subtle for English learners. Engkoo addresses this with a new kind of combinable interface that was recognised by an international research conference as a novel human-computer interaction technique; the Chinese press and online community echo this sentiment.
3) The ability to explore example sentences by breaking them down by category/domain. Users can learn at their own speed by selecting easy/medium/difficult English, as well as English that is ‘written’, ‘oral’ or ‘technical’. These classifications are performed automatically through a novel machine learning technique (an automatic classification of data by computer algorithms) and then applied on a massive scale.
4) A new kind of phonetic-based ‘fuzzy’ search adapted to the local pronunciation habits of Chinese speakers, enabling them to find the difficult-to-spell words they are looking for and then learn the proper spelling. Through a user study, the Engkoo team discovered that Chinese ESL (English as a Second Language) users often search for words as they sound, such as those they hear from foreign colleagues or from music or television. This behaviour exposes a major limitation of other language learning services: the misspelled words often cannot be found, and the learning process ends there. An example is searching for ‘shampin’, which reflects how mainland Chinese speakers commonly pronounce ‘champagne’; Engkoo maps such a query to the intended word.
5) The ability to see related translated words or phrases in bilingual sample sentences in real time as the user hovers his or her mouse over the text. The alignment information not only clearly exposes the structural differences between translated sentence pairs, but also provides instant translations. The technique used here leverages a unique approach to exposing machine learning based alignment.
6) The ability to learn and explore fluent/native English by finding statistically nearby words or ‘collocations’, which are difficult if not impossible to discover by oneself without reading huge volumes of English text. In particular, this system works thanks to a novel technique of leveraging ‘part of speech’ wild cards. For example, users can find prepositions that typically follow the word ‘terrific’ by simply searching for ‘terrific prep’. In this example, they could find sentences such as “I think it looks terrific on you”. These sentences are statistically significant because they are mined from the web, capturing human translation, and on a massive scale: 10 million bilingual pairs are currently in the system and the number is growing daily. Another example is that the user can complete a sentence by finding common words to fill the search template, such as finding the best adjectives in the sentence ‘she is a adj. lady’. This would result in examples such as “She is a charming young lady”.
7) The Text-to-Speech (TTS) feature in Engkoo, which can convert input text into natural sounding speech, is one of the best-received features. The technology is capable of synthesising the sound evolution, the ups and downs of intonation change and stressed or unstressed points of any given sentence. Besides being phonetically accurate, the prosody (rhythm of spoken speech), which is very difficult for a non-native English speaker to produce, is rated as very close to that of a native English speaker. The TTS interface is also well designed to facilitate easy playback and downloading to a user’s MP3 player for later listening and practising.
8) An ever-growing lexicon, currently composed of millions of terms and sample sentences, is available to users. To achieve this, Engkoo uses novel web-mining technology to extract high quality human translation knowledge from the web. This essentially creates a dynamic dictionary from discovered translated Internet content. What makes this useful is that it’s ‘real’ English, relevant and endlessly expanding. The Engkoo web crawlers work every day to extract translation knowledge, discovering thousands of new words and sample sentences; in effect using the web as a sponge to capture language as it changes and grows.
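The ‘part of speech’ wild card search described in feature 6 can be illustrated with a small sketch. This is not Engkoo's implementation: the tiny hand-written lexicon, the sample sentence list and the function names (`tokenize`, `match_at`, `search`) are all assumptions made for illustration, and a real system would rely on a trained part-of-speech tagger and a web-scale index. The sketch also assumes that a wild card such as ‘adj’ may stand for one or more consecutive words of that type, which is what the ‘charming young lady’ example suggests.

```python
# Toy part-of-speech lexicon (assumption: a real system would use a tagger).
POS_LEXICON = {
    "on": "prep", "in": "prep", "at": "prep", "for": "prep",
    "charming": "adj", "young": "adj", "terrific": "adj",
}

# Query tokens treated as wild cards rather than literal words (hypothetical set).
WILDCARDS = {"prep", "adj"}

# Tiny stand-in for the mined sentence collection.
SENTENCES = [
    "I think it looks terrific on you",
    "She is a charming young lady",
    "That was a terrific idea",
]


def tokenize(text):
    """Lowercase and split on whitespace, trimming simple punctuation."""
    return [t.strip(".,!?\"'").lower() for t in text.split()]


def match_at(query, words, i):
    """Return the end index if `query` matches `words` starting at i, else None."""
    if not query:
        return i
    if i >= len(words):
        return None
    head, rest = query[0], query[1:]
    if head in WILDCARDS:
        # A wild card consumes one or more consecutive words with that tag.
        j = i
        while j < len(words) and POS_LEXICON.get(words[j]) == head:
            j += 1
            end = match_at(rest, words, j)
            if end is not None:
                return end
        return None
    if words[i] == head:
        return match_at(rest, words, i + 1)
    return None


def search(query, sentences):
    """Return (sentence, matched phrase) pairs for the first match in each sentence."""
    q = tokenize(query)
    hits = []
    for sentence in sentences:
        words = tokenize(sentence)
        for start in range(len(words)):
            end = match_at(q, words, start)
            if end is not None:
                hits.append((sentence, " ".join(words[start:end])))
                break
    return hits


if __name__ == "__main__":
    print(search("terrific prep", SENTENCES))
    print(search("she is a adj. lady", SENTENCES))
```

In this sketch, ‘terrific prep’ matches only the sentence in which a preposition directly follows ‘terrific’, and ‘she is a adj. lady’ matches the sentence in which one or more adjectives fill the gap. Ranking the matches by frequency, as a collocation service would, is left out.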
Since its first appearance in the Chinese market on 1 May 2009 as a product of Microsoft Bing, Engkoo’s achievements include:
1) The official adoption by the Shanghai government as a tool to help correct poor English translations on public signs and other materials for the ongoing 2010 Shanghai Expo event. The Government reported that over 10,000 signs have been republished with correct English.
2) Adoption by several leading Chinese universities for their English learning curriculums and related programmes, notably by the Department of Foreign Languages in Tsinghua University in Beijing. Feedback from the professors on using Engkoo as a learning device has been very positive, and it has been recommended to other English teaching leaders in China. Another example is the Education Information Technology Center of China’s Ministry of Education in Huazhong Normal University, which also uses and promotes Engkoo as an effective method of learning English in China’s schools.
3) Traffic growth of 500% to the Engkoo web application over the past year, indicating escalating interest from the Chinese public. It is a sticky service, with the majority of daily users being return users.
4) Top ranking for the underlying technology of the Engkoo Chinese-English machine translation engine in the recent China Workshop on Machine Translation (CWMT) and the NIST MT evaluation workshop.
5) The rating of the Text-to-Speech (TTS) technology used in Engkoo as the best in intelligibility in the international TTS contest, Blizzard Challenge 2010, in both English and Chinese. The underlying TTS technology has also been successfully applied to 26 languages and shipped within the 2010 Microsoft Exchange Server product.
Within Microsoft, the Engkoo project has won both regional engineering excellence and research awards. The project is a good demonstration of cross-group, multidisciplinary collaboration, because it combines contributions from both product and research groups. In China, the Bing and MSN product teams contributed engineering experience and market expertise to make Engkoo a competitive product. The research groups of Microsoft Research Asia, including Natural Language Computing (in collaboration with Microsoft Translator in Redmond), Speech, Human Computer Interaction and Web Data Management, all contributed to make Engkoo a state-of-the-art technology. The coordination between research and product was carried out by the Innovation Engineering Group, which led the development of the project.
A driving force in making Engkoo a reality was the decision by the senior leadership team to sanction deployment-driven research (DDR) as one of the available strategies for the Microsoft Research Asia lab. The DDR strategy provides a way for technologies under active research and development to be released directly to the public; this allows for bigger, scenario-driven projects.
Engkoo Video Demo:
http://cid-a7a7ea4bf16b905a.skydrive.live.com/self.aspx/.Public/engkoo-screencast.wmv
Engkoo Web Address:
www.engkoo.com