The end of free Machine Translation APIs

Last June, Adam Feldman (API Product Manager at Google) announced that Google was pulling the plug on its Google Translate API, causing a lot of concern and some protests in the developer and localization communities. You can read the announcement here.

Then in August, Jeff Chin (International Product Manager at Google) reversed that decision and announced that Google would offer the Translate API as a paid service instead of free of charge. You can read the announcement here.

Here is Google’s pricing model:
$20 per million characters of text translated.

In September, Vikram Dendi (Director of Product Management at Microsoft) announced something very similar, but not many people took notice. You can read the announcement here.

Here’s Microsoft’s pricing model:
No cost up to 4M characters a month. Then $10 per million characters.

Unlike Google, Microsoft will only charge you once you pass the threshold of 4M characters a month, and its service will then cost half as much ($10 per million characters instead of $20).
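To make the difference concrete, here is a minimal sketch of the two pricing models using only the figures quoted above. The function and constant names are illustrative, and it assumes that Microsoft bills only the characters beyond the free 4M (the announcement wording could also be read differently), so treat the numbers as a rough guide rather than an official calculator.

```python
# Prices are the ones quoted in this post; all names here are illustrative.
GOOGLE_USD_PER_MILLION = 20.0
MICROSOFT_USD_PER_MILLION = 10.0
MICROSOFT_FREE_CHARS = 4_000_000


def google_monthly_cost(chars: int) -> float:
    """Every translated character is billed at $20 per million."""
    return chars / 1_000_000 * GOOGLE_USD_PER_MILLION


def microsoft_monthly_cost(chars: int) -> float:
    """Assumption: only characters beyond the free 4M are billed, at $10 per million."""
    billable = max(0, chars - MICROSOFT_FREE_CHARS)
    return billable / 1_000_000 * MICROSOFT_USD_PER_MILLION


# Compare both services at a few monthly volumes.
for chars in (1_000_000, 4_000_000, 10_000_000):
    print(chars, google_monthly_cost(chars), microsoft_monthly_cost(chars))
```

Under these assumptions, a publisher translating 10M characters a month would pay $200 with Google and $60 with Microsoft.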

Quality of Google and Bing Machine Translation services

Now that the technology is mature, the quality of Google’s and Bing’s Statistical Machine Translation systems depends heavily on the quality of the parallel text found on the web and crawled by their MT engines. Before the advent of Google and Bing Translate, parallel text found on the web was, more often than not, produced by professional translators and was therefore of good quality.

Now, translating content professionally is expensive. Depending on the domain and the language pair, professional translation can cost as much as $0.50 per word for a language such as Japanese, and between $0.18 and $0.21 per word for European languages.

During the recent financial crunch of 2008, many web publishers needed to cut costs. It’s no surprise that they started to abuse the free Google Translate and Bing Translate APIs to translate content and then publish it as is, with no professional review.

This is a common technique that SEO companies have been applying to bring more users to a website and then steer them toward premium content (professionally translated content).

The problem is that no algorithm is (yet) capable of determining whether content has been translated by a Machine Translation system or by a professional translator. Only trained human translators who speak the language can do that.

Today, both Microsoft’s and Google’s Machine Translation engines are crawling and processing web content that may have been published without any human proofreading after being translated using the very same Google or Microsoft translation API.

In other words, these two companies are “polluting their own drinking water”.

I hope that by starting to charge for their Machine Translation services, both Google and Microsoft can decrease, or at least control, the amount of sub-standard translations published on the web, so that in turn their MT engines can produce more reliable translations. Feeding their engines with United Nations and European Union bilingual documents is not enough to produce high-quality translation.

Size doesn’t matter without quality

In recent years, many publishers have started to build their own corpora of bilingual texts to feed their Machine Translation engines. It’s a given that an ad-hoc Machine Translation database fed only with high-quality, human-translated and proofread bilingual text in a specific domain can produce higher quality than Bing and Google Translate.

Unfortunately, at some point these publishers may start to pollute their MT systems with content that has been machine translated and not carefully reviewed by professional translators.

We have seen this happen in the past, for example when the hype was all about Translation Memories rather than MT engines, as it appears to be today. Some companies saw their Translation Memories grow bigger and bigger with little or no control over the quality of the content they were fed, thus polluting their TMs and making them almost unusable.
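One way to picture the kind of quality control that was missing is a simple gate in front of the corpus. The sketch below is purely hypothetical: the Segment record, its origin and reviewed fields, and the filter_segments function are invented for illustration and do not correspond to any real TM or MT tool. It only shows the idea of admitting a bilingual segment once a professional has reviewed it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    """A hypothetical bilingual segment waiting to enter a TM or MT corpus."""
    source: str
    target: str
    origin: str     # illustrative labels: "human" or "mt"
    reviewed: bool  # True once a professional translator has proofread it


def filter_segments(segments: List[Segment]) -> Tuple[List[Segment], List[Segment]]:
    """Split segments into (accepted, rejected): only reviewed ones are accepted."""
    accepted = [s for s in segments if s.reviewed]
    rejected = [s for s in segments if not s.reviewed]
    return accepted, rejected


candidates = [
    Segment("Hello", "Ciao", origin="human", reviewed=True),
    Segment("Thank you", "Grazie", origin="mt", reviewed=True),
    Segment("Good night", "Buona notte", origin="mt", reviewed=False),
]
accepted, rejected = filter_segments(candidates)
```

Here the unreviewed machine-translated segment is held back, while the reviewed ones, whatever their origin, are allowed in.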

Francesco Pugliano

 
