# 218 How Large Language Models Replace Neural Machine Translation with Unbabel’s João Graca

In this section, the conversation introduces the guest, João Graça, Chief Technology Officer and Co-founder of language operations platform Unbabel.

Introduction and Motivation for Founding Unbabel

  • João discusses the motivation behind co-founding Unbabel, stemming from a previous startup failure. The idea was born out of observing language barriers faced by Airbnb users.
  • The team aimed to address language struggles by combining machine translation with human correction to ensure quality. This approach led to the inception of Unbabel.
  • Initially focusing on translation services, Unbabel encountered challenges due to diverse project requirements. This prompted a shift towards specializing in customer service as a vertical market.

Transition to Language Operations Platform

  • Unbabel made a strategic decision to concentrate on customer service as its primary vertical, streamlining services for platforms like Zendesk and Salesforce.
  • After establishing dominance in customer service, Unbabel expanded into other verticals through M&A, acquiring companies to broaden its product offerings.

Impact of Large Language Models on Machine Translation

This segment delves into the influence of large language models (LLMs) on machine translation and highlights Unbabel's contributions in this domain.

Evolution of Research Contributions

  • João traces the company's academic roots, emphasizing the importance of research in addressing unsolved challenges within machine translation. Early initiatives included sponsoring research projects on text anonymization for translation.

Expansion and Research Focus

In this section, the speaker discusses the early days of the company, focusing on team expansion and research collaboration with academia to address research problems effectively.

Team Expansion and Research Collaboration

  • The initial team aimed to collaborate with academia to work on research problems while leveraging resources for their own projects.
  • André Martins, a professor at IST, played a significant role in the early team setup.
  • Emphasis was placed on research from the beginning, with a high ratio of PhDs in NLP compared to other areas.

Contributions to Machine Translation (MT) and Quality Estimation (QE)

This part delves into the company's contributions to machine translation (MT), particularly focusing on enhancing tools like Moses for specific domains and later transitioning to neural MT models. Additionally, it highlights their efforts in quality estimation (QE) tool development.

Machine Translation Contributions

  • Initially worked on improving Moses for specific domains in MT.
  • Transitioned quickly to neural MT models when they emerged, contributing early to the development of the Marian NMT framework.
  • Started contributing significantly to QE tool frameworks due to the lack of open-source tools available for users.

Open Source Approach and Community Contribution

This segment focuses on the company's belief in open-source tools for research purposes while restricting commercial use. It emphasizes community contribution as essential for mutual learning and progress.

Open Source Philosophy

  • Advocates for open-source tools in research but limits commercial use to avoid providing an advantage to competitors.
  • Believes that sharing tools benefits both the community and the company by facilitating learning from others' work.

Evolution of Natural Language Processing (NLP)

Discusses how NLP has evolved over time due to advancements in technology, making previously challenging tasks more accessible through large language models (LLMs).

Evolution of NLP

  • Despite technological advancements enhancing NLP tools, fundamental challenges remain that require proper validation sets and methods for solving various problems.
  • LLMs have revolutionized tasks such as transcreation by simplifying cultural adaptation through prompt-based approaches.

Role of Retrieval-Augmented Generation (RAG) in Model Customization

Explores the significance of RAG in customizing language models by supplying declarative information during decoding.

Customization with RAG

  • RAG is a powerful technique that enables model customization by injecting declarative information at decoding time.
  • It allows users to specify translation styles, terminologies, and style guides for the model to adhere to during output generation.
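The RAG-style customization described above can be sketched as a prompt-assembly step that injects only the glossary entries relevant to the source text. The function, glossary, and prompt wording below are illustrative assumptions, not Unbabel's actual implementation:

```python
# Hypothetical EN->DE terminology; a real system would retrieve this
# from a customer-specific glossary store.
GLOSSARY = {"ticket": "Ticket", "refund": "Erstattung"}

def build_prompt(source: str, style_guide: str) -> str:
    # Keep only glossary entries that actually occur in the source text.
    relevant = {s: t for s, t in GLOSSARY.items() if s in source.lower()}
    terms = "\n".join(f"- '{s}' must be translated as '{t}'"
                      for s, t in relevant.items())
    return (
        "Translate the following text from English to German.\n"
        f"Style guide: {style_guide}\n"
        f"Terminology:\n{terms}\n"
        f"Text: {source}"
    )

prompt = build_prompt("Please open a ticket for the refund.",
                      "formal tone, address the reader as 'Sie'")
print(prompt)
```

The assembled prompt then goes to the LLM, which is expected to follow the declared terminology and style constraints during generation.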

Discussion on Challenges in Production with Large Language Models

The conversation delves into the challenges faced when implementing large language models in production environments, highlighting engineering obstacles and decision-making processes.

Challenges in Implementing Large Language Models

  • Companies encounter significant engineering hurdles when launching large language models in production due to various complexities.
  • An efficient vector database is crucial for providing context in translation tasks, since it must support very fast similarity search over past requests.
  • Adapting existing pipelines from sentence-by-sentence translation to handling paragraphs or entire documents necessitates infrastructure changes.
  • Running large models in production poses challenges such as managing costs, scaling based on demand, and balancing latency versus expenses.
  • Choosing between using the ChatGPT API for model deployment or managing model operations internally involves trade-offs between cost-effectiveness and operational control.
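The similarity-search requirement above can be illustrated with a toy in-memory retriever. A production system would use a real vector database and a learned sentence encoder; the character-count `embed()` here is only a stand-in:

```python
import math

def embed(text: str) -> dict:
    # Toy embedding: character-frequency vector (stand-in for a real
    # sentence encoder such as a transformer model).
    vec = {}
    for ch in text.lower():
        vec[ch] = vec.get(ch, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Translation memory of past (source, translation) pairs.
memory = [
    ("How do I reset my password?", "Wie setze ich mein Passwort zurück?"),
    ("Where is my order?", "Wo ist meine Bestellung?"),
]

def nearest(query: str):
    # Return the stored pair whose source is most similar to the query.
    return max(memory, key=lambda pair: cosine(embed(query), embed(pair[0])))

src, tgt = nearest("How can I reset the password?")
print(src)
```

The retrieved pair can then be injected into the prompt as context, which is why ultra-efficient similarity search matters at production scale.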

Advantages of Large Language Models Over Traditional Machine Translation Approaches

The discussion focuses on the benefits of adopting large language models like GPT or Claude over traditional machine translation methods, emphasizing improved results, cost-efficiency, and enhanced customer privacy.

Benefits of Large Language Models

  • Embracing state-of-the-art cloud-based models like GPT or Claude can lead to superior outcomes compared to traditional approaches.
  • Companies can achieve better results with large language models while also potentially reducing costs associated with word translations.
  • Leveraging the advantages of large language models can provide a competitive edge by addressing customer privacy concerns effectively.

Future Prospects of Large Language Models vs. Neural Machine Translation

The conversation explores the evolving landscape of machine translation technologies, highlighting the dominance of large language models over neural machine translation systems and predicting continued advancements in LLM capabilities.

Evolution from Neural Machine Translation to Large Language Models

  • Traditional machine translation approaches are gradually being phased out in favor of large language models due to their superior performance and ongoing enhancements.
  • Large language models are considered the future standard for machine translation due to continuous improvements driven by extensive research efforts.
  • Despite the progress with LLM technology, challenges remain regarding optimizing model size for efficiency without compromising effectiveness across multiple languages.

Cost Considerations and Decision-Making in Model Selection

The dialogue touches upon pricing dynamics between traditional NMT systems and emerging LLM solutions, shedding light on factors influencing cost competitiveness within the industry.

Pricing Dynamics and Model Selection

  • Contrary to expectations, large language models are becoming more cost-effective compared to traditional NMT systems like Google Translate.
  • Cost reductions in LLM adoption may stem from optimized pricing strategies or external sponsorships driving down model expenses.

Detailed Background of COMET-QE

In this section, the speaker provides a detailed background on COMET-QE and its development process.

Development with Large Language Models

  • Two years ago, the team started working with large language models (LLMs), building on COMET-QE and its earlier neural machine translation (NMT) models.
  • Investors questioned the impact of LLMs on the language industry, leading to a strategic evaluation of their advantages in simplifying tasks like automatic spelling correction and quality enhancement.
  • Leveraging expertise, an exceptional research team, linguistic curation skills, and valuable data insights enabled the team to enhance translation data curation through Quality Estimation (QE).

Enhancing Multilingual Capabilities

This part delves into improving multilingual capabilities within large language models.

Multilingual Training Process

  • Initial focus involves utilizing open-source LLMs for their existing word representations but enhancing multilingual performance through continuous training on diverse datasets.
  • Continuous training on multilingual data sets results in improved model performance across various languages beyond English, showcasing enhanced representation learning capabilities.
  • The model's ability to excel in different languages is attributed to its refined data representation learning process during training phases.

Instruction Tuning for Enhanced Performance

This segment explores instruction tuning as a key factor in optimizing model performance.

Instruction Tuning Process

  • Instruction tuning involves curated instructions for specific tasks like translation from A to B, aiding in improving model proficiency through targeted prompts and glossaries.
  • Training models on multiple tasks simultaneously enhances translation quality by enabling abstraction learning and adherence to specific prompts, elevating overall model performance.

Detailed Discussion on Multilingual Language Models

In this section, the conversation delves into the development of multilingual language models and the advantages they offer over non-multilingual models.

Building Multilingual Language Models

  • Training a model to be intentionally multilingual from scratch can provide advantages over simply continuing to train an existing large model.
  • Understanding the difference between a non-multilingual language model and a multilingual one is crucial for grasping their respective capabilities.

Performance of Multilingual Models

  • Multilingual models excel in answering questions across different languages, but non-multilingual models may struggle with performance decay as languages vary.
  • The importance of having parallel data and an effective tokenizer for aligning representations between languages is highlighted.

Translation Capabilities

  • Effective translation abilities are showcased through examples where questions posed in German are accurately translated into English by the model.
  • The discussion touches on how translations sometimes feel like they have been generated in English rather than originating directly in another language.

Differentiating Open Source and Proprietary Versions of Language Models

This segment explores the distinctions between open-source versions of language models available for public use and proprietary versions utilized internally for enhanced performance.

Utilization of Language Models

  • While open-source versions are accessible for general use, proprietary workflows incorporate additional data and optimizations to enhance performance.

Commercial Use Restrictions

  • Open-source versions are not intended for commercial use, limiting their application to academic research rather than commercial ventures.

Enhanced Performance Through Internal Versions

In this section, the speaker discusses the evolving nature of models and the continuous improvement in zero-shot learning.

The Evolution of Models

  • Zero-shot learning evolves over time, with what is true today not necessarily being accurate in the future.
  • Emphasis on continuous learning and improvement rather than getting stuck on a specific model.

This part focuses on the operational simplicity and advantages of LLMs beyond zero-shot capabilities.

Operational Simplicity of LLMs

  • LLMs offer more than just zero-shot capabilities, providing operational simplicity and efficiency.
  • The disruption factor lies in using a single model for various tasks, enhancing ease of use.

Here, the speaker elaborates on comparing different models and their accessibility for testing.

Model Comparison and Testing

  • Users can compare models by inputting text into an online tool to assess quality metrics such as the COMET score.
  • Accessibility is highlighted through open access for testing different models globally via an API.
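A toy harness for comparing candidate outputs from several models under one metric. The word-overlap `score()` below is only a stand-in for a learned metric such as COMET, which is neural and trained on human quality judgments:

```python
def score(reference: str, hypothesis: str) -> float:
    # Toy quality metric: word-overlap F1 against a reference.
    ref, hyp = set(reference.split()), set(hypothesis.split())
    if not ref or not hyp:
        return 0.0
    overlap = len(ref & hyp)
    p, r = overlap / len(hyp), overlap / len(ref)
    return 2 * p * r / (p + r) if p + r else 0.0

reference = "obrigado pela sua ajuda"
candidates = {
    "model_a": "obrigado pela sua ajuda",
    "model_b": "obrigado pela ajuda",
}
# Rank models by their score on this input, best first.
ranked = sorted(candidates, key=lambda m: score(reference, candidates[m]),
                reverse=True)
print(ranked[0])
```

An online comparison tool works the same way at scale: run the same inputs through each model, score every output with one metric, and rank.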

This segment delves into challenges faced by TowerLLM, including syntactic errors and language nuances.

Challenges Faced by TowerLLM

  • TowerLLM encounters difficulties with specific error types, syntax challenges, and certain language combinations.
  • Issues such as incorrect translations or wrong tone indicate that further model training is needed.

The discussion shifts towards refining contextual understanding for improved model performance.

Refining Contextual Understanding

  • Ongoing efforts focus on optimizing context usage to enhance model performance gradually.

In this section, the discussion revolves around automatic post-editing and quality estimation in machine translation, particularly focusing on identifying errors and improving results through agentic machine translation.

Automatic Post-Editing and Quality Estimation

  • The process involves automatic post-editing with quality estimation to identify errors in translations.
  • Agentic machine translation is discussed as a method where the system produces a translation, identifies errors via quality estimation, and prompts for corrections.
  • Notable improvements have been observed in results using this approach.
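The translate-check-correct loop described above might look like the following sketch. Here `translate`, `qe_score`, and `correct` are deterministic stand-ins for real MT, QE, and correction models, and the threshold value is an arbitrary assumption:

```python
THRESHOLD = 0.8  # minimum acceptable QE score (illustrative value)

def translate(source: str) -> str:
    # Stand-in MT system that produces a known terminology error.
    return source.replace("ticket", "billet")

def qe_score(source: str, hypothesis: str) -> float:
    # Stand-in QE model: penalize the known wrong term.
    return 0.5 if "billet" in hypothesis else 0.95

def correct(source: str, hypothesis: str) -> str:
    # Stand-in correction step, prompted with the detected error.
    return hypothesis.replace("billet", "ticket de support")

def translate_with_post_edit(source: str, max_rounds: int = 2) -> str:
    # Translate, then loop: score with QE and request a correction
    # whenever quality falls below the threshold.
    hyp = translate(source)
    for _ in range(max_rounds):
        if qe_score(source, hyp) >= THRESHOLD:
            break
        hyp = correct(source, hyp)
    return hyp

out = translate_with_post_edit("please open a ticket")
print(out)
```

The gain reported in the conversation comes from exactly this structure: QE flags low-quality segments, and only those segments are routed back through a correction step.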

This part of the conversation delves into the concept of agentic machine translation and its potential impact on future translation processes compared to existing pipeline methods.

Agentic Machine Translation vs. Pipeline Approach

  • Agentic machine translation is considered as a potential future direction in contrast to the current pipeline approach.
  • The pipeline method involves calling different modules sequentially for translation tasks.
  • Agentic MT allows setting goals for the system to decide which model to call based on the desired outcome, showing promise for more efficient translations.

Here, the focus shifts towards new approaches like xTOWER for correcting translation errors through free-text explanations, aiming to guide improved translations effectively.

xTOWER: Correcting Translation Errors

  • xTOWER is designed to provide free-text explanations of translation errors to facilitate generating corrected translations.
  • The tool aims to offer descriptive explanations of errors detected by QE (Quality Estimation) systems.
  • It presents an innovative way of guiding translators or automated systems towards more accurate translations.

This segment discusses ongoing debates within the team regarding error detection models and their implications on bias, prompting considerations about using generative models versus discriminative models for error identification.

Error Detection Models Debate

  • Internal discussions revolve around whether using one model for error detection and correction introduces bias compared to employing separate models.
  • Considerations are made regarding utilizing generative models like GPT-style LLMs versus discriminative models such as xCOMET for error identification.

Detailed Discussion on Language Models and AI Progress

In this section, the conversation delves into the impact of language models (LLMs) on low-resource languages, decision-making processes for adding new languages to models, resource allocation, talent acquisition in the AI field, and the future progression of LLMs.

Impact of LLMs on Low-Resource Languages

  • Contrary to concerns, advancements in LLMs are beneficial for low-resource languages as they require less data compared to previous models.

Decision-Making Process for Adding Languages

  • Language additions to models are based on customer demand; effort is required to open a new language due to data needs and community involvement.

Strategic Approach to Market Expansion

  • Adding languages is a strategic decision based on market targeting rather than aiming for a high number of languages like Google; resources are allocated efficiently based on specific needs.

Talent Acquisition and Resource Management

  • The company's strong presence in academia facilitates talent acquisition; reputation among NLP and translation experts aids in attracting skilled individuals.

Future Progression of LLMs and AI

Channel: Slator
Video description

Unbabel CTO João Graça on the transformative potential and practical applications of large language models and the backstory to the launch of TowerLLM.

TIMESTAMPS
00:00:00 Intro
00:00:34 Background and Motivation Behind Unbabel
00:04:13 Research Contributions
00:07:10 NLP and LLM Impact
00:09:12 RAG Approach
00:11:04 Adapting Production Processes
00:12:42 Evaluating Model Usage
00:13:56 Evolution from Neural MT to LLMs
00:15:44 Comparing Price
00:16:43 Why Unbabel Decided to Build TowerLLM
00:18:49 TowerLLM Development Process
00:23:07 Multilingual Model Performance
00:25:25 Model Usage and Commercial Restrictions
00:26:24 Quality Testing Process
00:29:20 TowerLLM Challenges
00:30:19 Future of Translation Technology
00:32:09 Areas of Application for LLMs
00:34:49 Understanding xTOWER
00:37:23 AI Pipelines
00:38:59 Language Coverage
00:40:43 Hiring Environment
00:41:51 Acceleration of LLMs and AI Progress