The translation engines behind the online translation services translate words, phrases and texts in real time between more than 60 languages, in over 4,000 language combinations.
The translation is powered by Google Translate, Microsoft Translator, Babylon Translator and other machine translation engines.
These online translation providers use the statistical machine translation approach, which generates translations using statistical methods based on bilingual texts. Statistical machine translation analyzes existing pairs of source and target texts (translated by human translators) to find patterns, which it then uses to build rules for translating between those languages. The more bilingual text is available, the more accurate the translation becomes. Where such corpora are available, impressive results can be achieved when translating texts of a similar kind.
The quality of a machine translation depends directly on the quality of the source text. The guidelines below will not solve every problem of machine translation, but a few simple tips can help you obtain better results.
Avoid misprints and spelling errors
The machine translator cannot correct errors or recognize misspelled words. Check the spelling of your text and make sure it is error-free before translating.
Bear in mind punctuation marks
A missing or, conversely, redundant punctuation mark can prevent an electronic translator from correctly understanding the syntactic structure of a sentence. Always put a period (.) at the end of a sentence. In Spanish questions, use both question marks (¿ ?).
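The punctuation advice above can be checked automatically before text is sent to a translator. The following is a minimal sketch, not part of any translation service: the function name punctuation_warnings and the warning messages are hypothetical, and the check assumes it receives one sentence at a time.

```python
def punctuation_warnings(sentence, spanish=False):
    """Return a list of warnings about punctuation in a single sentence."""
    warnings = []
    text = sentence.strip()
    if not text:
        return warnings

    # Ignore closing quotes and brackets when looking for the final mark.
    core = text.rstrip("\"')]»”’")
    if not core.endswith((".", "?", "!", "…")):
        warnings.append("missing terminal punctuation")

    if spanish:
        # Spanish questions and exclamations need an opening and a closing mark.
        if text.count("¿") != text.count("?"):
            warnings.append("unbalanced Spanish question marks")
        if text.count("¡") != text.count("!"):
            warnings.append("unbalanced Spanish exclamation marks")
    return warnings


if __name__ == "__main__":
    print(punctuation_warnings("Where is the station"))
    print(punctuation_warnings("Dónde está la estación?", spanish=True))
    print(punctuation_warnings("¿Dónde está la estación?", spanish=True))
```

A check like this only counts marks; it does not know whether a comma is in the right place, so it complements proofreading rather than replacing it.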
Place diacritics correctly
Spanish: "sí" would be translated as "yes", but "si" (without the accent) would be translated as "if".
French: "où" would be translated as "where", but "ou" (without the accent) would be translated as "or".
German: "wurde" would be translated as "became", but "würde" (with the umlaut) would be translated as "would become".
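The word pairs above differ only by a diacritic, which is why removing accents (or typing without them) changes the meaning. The sketch below uses Python's standard unicodedata module to show that each pair collapses to the same string once diacritics are stripped. The helper names strip_diacritics and differ_only_by_diacritics are hypothetical and serve only as an illustration.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks (accents, umlauts) from a string."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def differ_only_by_diacritics(a, b):
    """True if two different words become identical once accents are removed."""
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    return a != b and strip_diacritics(a) == strip_diacritics(b)


if __name__ == "__main__":
    # Pairs taken from the examples in the text.
    for accented, plain in [("sí", "si"), ("où", "ou"), ("würde", "wurde")]:
        print(accented, plain, differ_only_by_diacritics(accented, plain))
```

If a text has passed through a tool that strips accents, a comparison like this can flag words whose meaning may have changed, so that they can be restored before translating.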
Observe the case of letters
A lowercase letter can legitimately become a capital one (for example, at the beginning of a sentence or in a heading), and MT systems are developed with this in mind. However, there are languages in which capitalizing the first letter of a word changes its part of speech. In German, for example, all nouns are capitalized, both at the beginning and in the middle of a sentence. Compare these translations: "wie funktioniert das ubersetzen mit dem "clipboard"?" is translated as "How it works translate with “clipboard”?", whereas "Wie funktioniert das Übersetzen mit dem "clipboard"?" is translated as "How does the clipboard translation work?"
Try to use simple constructions with direct word order
For example, the subject or subject group comes first in the sentence (I, you, he, my cat, my boss, my girlfriend's son). The predicate, expressed by a verb (want, know, like), comes second. Modifiers, expressed by various parts of speech, follow.
Avoid omitting function words
The English sentence "Your e-mail address is the address other people use to send e-mail messages to you" is translated into Russian as a barely comprehensible text: "Ваш адрес электронной почты — адрес другое использование людей, чтобы послать почтовые сообщения Вам." Once the omitted conjunction "that" is restored ("Your e-mail address is the address that other people use to send e-mail messages to you"), the translation is correct: "Ваш адрес электронной почты — адрес, который другие люди используют, чтобы послать почтовые сообщения Вам."
Use only standard abbreviations
An incorrect translation of the abbreviation itself is only part of the problem. The bigger issue is that even a single untranslated word can prevent the translator from analyzing the sentence correctly.
How rule-based machine translation works
Machine translation (MT) systems work with natural language, a data set that is infinitely varied, ambiguous and structurally complex. To translate adequately, a rule-based MT system must encode knowledge of hundreds of syntactic patterns, variations, and exceptions, as well as the relationships among these patterns. The system must also include a dictionary and specific semantic knowledge about the usage of tens of thousands of words.
The system must accurately identify the part of speech and grammatical characteristics of each word in context, classifying it as a noun, verb, or adjective, each of which may have many possible translations. Translation also requires a vast store of knowledge about the world, the intent of the communication, and the subject matter. All this information is stored in the system as a set of rules and relationships.
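To make the idea of "a dictionary plus rules" concrete, here is a toy sketch of a rule-based translator from English to Spanish. It is an illustration only: the three-entry LEXICON, the function translate_rule_based, and the single reordering rule (adjective before noun becomes noun before adjective) are all assumptions made for this example, and a real system encodes vastly more knowledge.

```python
# Toy lexicon: source word -> (part of speech, target word). Hypothetical data.
LEXICON = {
    "the": ("DET", "el"),
    "red": ("ADJ", "rojo"),
    "car": ("NOUN", "coche"),
}


def translate_rule_based(sentence):
    """Translate word by word, then apply one syntactic reordering rule."""
    tagged = []
    for word in sentence.lower().split():
        # Unknown words keep their form and get the tag UNK.
        pos, target = LEXICON.get(word, ("UNK", word))
        tagged.append((pos, target))

    # Rule: English ADJ NOUN becomes Spanish NOUN ADJ.
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][0] == "ADJ" and tagged[i + 1][0] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2
        else:
            i += 1
    return " ".join(target for _, target in tagged)


if __name__ == "__main__":
    print(translate_rule_based("the red car"))        # el coche rojo
    print(translate_rule_based("the red clipboard"))  # el rojo clipboard
```

The second call shows why misspellings and non-standard abbreviations hurt: the word "clipboard" is not in the lexicon, so it is not recognized as a noun, and the reordering rule never fires.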
How statistical machine translation works
Statistical machine translation uses a computer to compare two documents – one in the original language and one translated by a human. It finds patterns and links between the two and uses them to create its own future translations.
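The idea of finding "patterns and links" between a text and its human translation can be sketched with a classic word-alignment model, IBM Model 1, trained on a three-sentence parallel corpus. This is a simplified illustration, not the method used by any particular provider. The corpus and the function names train_ibm_model1 and best_translation are assumptions made for the example.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word translation probabilities t[(target, source)] with EM."""
    t = defaultdict(lambda: 1.0)  # uniform (unnormalized) starting point
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each target word over the source words
        # of the same sentence pair, in proportion to the current t.
        for source, target in pairs:
            for tw in target:
                norm = sum(t[(tw, sw)] for sw in source)
                for sw in source:
                    fraction = t[(tw, sw)] / norm
                    count[(tw, sw)] += fraction
                    total[sw] += fraction
        # M-step: renormalize the collected counts into probabilities.
        t = defaultdict(float)
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t


def best_translation(t, source_word):
    """Return the most probable target word for a source word, or None."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get) if candidates else None


# Tiny English-German parallel corpus (human translations), for illustration.
CORPUS = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]

if __name__ == "__main__":
    model = train_ibm_model1(CORPUS)
    for word in ["the", "house", "book", "a"]:
        print(word, "->", best_translation(model, word))
```

No dictionary or grammar rule is given to the program. It learns that "house" corresponds to "Haus" only because the other candidate, "das", is better explained by "the", which appears with it in a second sentence pair. Adding more sentence pairs sharpens these estimates, which is the sense in which more bilingual text improves the translation.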
Statistical machine translation was re-introduced in 1991 by researchers at IBM's Thomas J. Watson Research Center and has contributed to a significant resurgence of interest in machine translation in recent years. Nowadays it is by far the most widely studied machine translation method.
Google dreams of a world where hundreds of languages can be simultaneously translated by machines which compare texts using statistics rather than applying grammatical rules. Google has used documents from the European Commission and United Nations to feed its machines.
Franz Och, who runs Google’s translation team, said that early efforts impressed people with experience of machine-run translation systems. Och said: “The more we feed into the system the better it gets.”
Afrikaans, Albanian, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Catalan, Chinese (simplified), Chinese (traditional), Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Kannada, Korean, Lao, Latin, Latvian, Lithuanian, Macedonian, Malay, Maltese, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Welsh, Yiddish
Arabic, Bulgarian, Catalan, Chinese (simplified), Chinese (traditional), Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Haitian Creole, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malay, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Thai, Turkish, Ukrainian, Urdu, Vietnamese
Arabic, Bulgarian, Chinese (simplified), Chinese (traditional), Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hausa, Hebrew, Hindi, Hungarian, Italian, Japanese, Korean, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Thai, Turkish, Ukrainian