Fine-Tuning AI: Machine Translation Post Editing
Machine translation (MT) has made breaking down language barriers increasingly efficient. MT is like a supercharged, polyglot robot that can deliver pages of translation at lightning speed. But there’s a catch: it’s not quite perfect yet. Closing the gap between fast output and a message that reads as native and engages audiences takes a human touch, known in the field as machine translation post-editing (MTPE).
MTPE and the Gap Between Automation and Precision
Over the past year, ChatGPT and other Generative AI-driven large language models have ignited substantial discussions about their potential impact on various aspects of life and business. These conversations included predictions about the transformative influence of AI in fields such as translation. However, well before ChatGPT made its debut, AI had already established its presence in the language services industry through the evolution of Neural Machine Translation (NMT).
Until 2015, MT software translated text using language rules and statistical models that analyzed the source text and produced output in the desired language. Leading MT models delivered a moderate level of accuracy and performed best with simple, formal text. These early models depended on curated data sets and needed regular tuning and dictionary maintenance, and even then they had significant accuracy issues.
In 2015, NMT became the preferred model for most language services companies. NMT uses artificial neural networks trained on large parallel corpora, which consist of paired sentences in the source and target languages. Unlike previous approaches to MT, NMT learns end to end, which produces higher-quality translations and supports more complex content. It’s faster, more accurate, easier to manage, and cheaper than previous iterations of MT.
The strength of NMT is that it can quickly translate large volumes of simple content that doesn’t require a high level of accuracy. However, both NMT and generative AI have limitations. Each requires large general training datasets as well as domain-specific ones. Both struggle to understand and convey nuanced or context-dependent information, and both can introduce gender and social biases into their results.
Machine translation post-editing (MTPE) pairs MT with a human editor: the machine produces a first draft, and the editor reviews and improves it. While raw MT output works well in a number of use cases, it often lacks the cultural sensitivity, tone, and readability needed to make a translation sound natural to a native speaker. As such, post-editing by a human translator is recommended in nearly all cases when MT is used. The editor refines the MT output by correcting errors, improving the fluency of the text, and ensuring it faithfully reflects the source material. Overall, the MTPE process is designed to leverage the speed and efficiency of machine translation while still providing the human touch essential for accurate and effective communication.
MT vs. Human Translation
MTPE yields the best results on content that uses specific terminology, follows a consistent structure, and belongs to a domain and language pair for which large bilingual datasets are available. Content types that currently perform well include technical manuals, product descriptions, e-commerce content, and low-priority internal corporate communications.
There are still many types of content that are better handled by humans. Linguists are equipped to translate complex content that requires a high level of creativity, accuracy, and precision, such as marketing copy, legal briefings, or medical documentation. Human translation is also needed for text with humor or idiomatic expressions. Humans best convey tone and emotion and effectively render the intended style. They consider the cultural nuances of the target language, ensuring that the translation is appropriate for the intended audience.
Content that is best handled by human translators includes:
- Marketing and advertising
- High-profile website copy
- Branded material
- Public-facing content
- Journalistic copy
- Literary texts
- Exhibition didactics
- Unique or new subject areas
- Content where accuracy is crucial
- Language combinations with limited available training data
- Specialized content with limited domain-specific training data
- Sensitive or confidential information
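The guidance above can be read as a simple routing rule. The Python sketch below is only an illustration of that reading, not an established industry procedure: the `Job` class, its field names, the `recommend_workflow` function, and the "needs review" fallback for unlisted content types are all hypothetical. The content types themselves are taken from the lists in this section.

```python
from dataclasses import dataclass

# Content types the article says perform well with MTPE.
MTPE_FRIENDLY_TYPES = {
    "technical manual",
    "product description",
    "e-commerce content",
    "internal communication",
}

# Content types the article says are best handled by human translators.
HUMAN_TYPES = {
    "marketing",
    "advertising",
    "website copy",
    "branded material",
    "public-facing content",
    "journalistic copy",
    "literary text",
    "exhibition didactics",
}


@dataclass
class Job:
    """Hypothetical description of a translation request."""
    content_type: str
    accuracy_critical: bool = False
    confidential: bool = False
    new_subject_area: bool = False
    limited_training_data: bool = False  # language pair or domain


def recommend_workflow(job: Job) -> str:
    """Return 'human translation', 'MTPE', or 'needs review'."""
    # Risk flags from the list override the content type.
    if (job.accuracy_critical or job.confidential
            or job.new_subject_area or job.limited_training_data):
        return "human translation"
    if job.content_type in HUMAN_TYPES:
        return "human translation"
    if job.content_type in MTPE_FRIENDLY_TYPES:
        return "MTPE"
    # Assumption: anything not covered by the lists goes to a person to decide.
    return "needs review"


if __name__ == "__main__":
    print(recommend_workflow(Job("technical manual")))
    print(recommend_workflow(Job("technical manual", confidential=True)))
    print(recommend_workflow(Job("marketing")))
```

In this sketch the risk flags are checked first, so a technical manual that contains confidential information is still routed to a human translator.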
Post-Editing and the Perfect Fit
Light post-editing is the fastest, most cost-effective version of MTPE: the editor corrects errors but makes minimal stylistic edits. The end goal is a text that is accurate and easy to understand. Full post-editing is a more thorough process, in which the editor also improves style, tone, and flow. This approach further enhances readability and local resonance. A project-specific approach can also be taken, prioritizing certain segments over others, depending on business needs.
Effective MTPE: The Role of the Human Editor
The effectiveness of MTPE relies heavily on the skills and experience of the human editor. An ideal linguist must be fluent in both the source and target languages, well-versed in the subject matter, and familiar with the idiosyncrasies of machine-generated translation. A good MTPE linguist typically employs a range of tools, including translation memory, glossaries, and quality assurance features within Computer-Assisted Translation (CAT) tools, to ensure consistency and accuracy throughout the text.
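To make the glossary-based quality assurance step more concrete, here is a minimal Python sketch of a terminology check. It is an illustration only and does not reproduce how any particular CAT tool works: the function name `find_glossary_violations`, the sample glossary, and the sample segments are hypothetical. It uses plain case-insensitive substring matching, so it ignores inflected forms that a real tool would need to handle.

```python
def find_glossary_violations(source, target, glossary):
    """Return (source_term, approved_target_term) pairs for every glossary
    term that occurs in the source segment while its approved translation
    is missing from the target segment."""
    source_folded = source.casefold()
    target_folded = target.casefold()
    violations = []
    for source_term, target_term in glossary.items():
        term_in_source = source_term.casefold() in source_folded
        translation_in_target = target_term.casefold() in target_folded
        if term_in_source and not translation_in_target:
            violations.append((source_term, target_term))
    return violations


if __name__ == "__main__":
    # Hypothetical English-to-French glossary and segment pair.
    glossary = {
        "translation memory": "mémoire de traduction",
        "post-editing": "post-édition",
    }
    source = "Post-editing relies on a translation memory."
    target = "La post-édition s'appuie sur une base de traduction."
    for source_term, expected in find_glossary_violations(source, target, glossary):
        print(f"'{source_term}' should be translated as '{expected}'")
```

A flagged segment is not necessarily wrong; the check simply points the editor to places where the approved term may have been replaced by a synonym.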
A Well-Defined Process
A successful MTPE process requires a clear roadmap, quality control measures, and a feedback loop for continuous improvement. Any post-delivery feedback must be reviewed, validated, and incorporated into future work. By establishing these steps, the overall quality of translations can be enhanced, leading to better outcomes and increased efficiency over time.
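One possible input to such a feedback loop is a measure of how much the editor had to change the raw MT output. The article does not prescribe a metric, so the Python sketch below is an assumption: it uses the standard library's `difflib.SequenceMatcher` on whitespace-separated words as a rough proxy for editing effort, and the function name `post_edit_change_ratio` is hypothetical.

```python
from difflib import SequenceMatcher


def post_edit_change_ratio(raw_mt, post_edited):
    """Rough word-level change ratio between raw MT and the edited text.

    0.0 means the editor changed nothing; 1.0 means no words were kept
    in matching order. This is a proxy for effort, not a quality score.
    """
    raw_tokens = raw_mt.split()
    edited_tokens = post_edited.split()
    if not raw_tokens and not edited_tokens:
        return 0.0
    similarity = SequenceMatcher(None, raw_tokens, edited_tokens).ratio()
    return 1.0 - similarity


if __name__ == "__main__":
    raw = "The device must be plug before use."
    edited = "The device must be plugged in before use."
    print(round(post_edit_change_ratio(raw, edited), 2))
```

Segments or content types that consistently show a high ratio may be candidates for engine tuning, glossary updates, or routing to human translation.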
MTPE: MT and Human Translation
By leveraging the strengths of both MTPE and human translation where appropriate, businesses can unlock the full potential of their translation strategy and effectively overcome language barriers.