# 224 On AI for Deep Content Understanding with Zetta Cloud Chief Strategist George Bara

Name: # 224 On AI for Deep Content Understanding with Zetta Cloud Chief Strategist George Bara
Uploaded: 2024-09-12T13:31:06.000Z
Duration: 1 h 26 min 19 s

Introduction to AI and Data Training

Overview of AI Technology

The speaker discusses the expectation that revolutionary technology should immediately make sense of data, but emphasizes the need for training, extensive data, and human expertise.

Guest Introduction

George Bara, Chief Strategist at Zetta Cloud, is introduced. Zetta Cloud specializes in AI solutions for text analytics and multilingual data processing.

George's Background and Career Journey

Personal Insights

George shares his current location in Romania and mentions wearing a Romanian-themed shirt due to the European Championship hype.

Professional Path

He recounts his career, which began with ten years as a programmer before he moved into the language industry through roles at SDL and RWS.

Founding Zetta Cloud

Transition to Localization Industry

George explains how he entered the localization business unexpectedly while working with Language Weaver, which was later acquired by SDL.

Motivation Behind Zetta Cloud

He highlights that both he and his co-founder were technically inclined but felt limited by traditional localization applications; they aimed to explore broader uses of language technology.

AI Evolution in Language Processing

Early Days of AI in Language Technology

George reflects that when he started at Language Weaver, terms like "AI" were not commonly used; the team focused on machine learning without fully understanding its implications.

Current Focus at Zetta Cloud

He notes that despite advancements in large language models, their focus remains on building smaller language models tailored for specific applications within natural language processing.

Reflections on Localization Industry

Unique Perspective on Career Exit

Unlike many who remain entrenched in localization careers, George successfully transitioned out after finding more exciting opportunities outside this field.

Future Directions

Understanding AI Solutions and Client Expectations

Overview of Zetta Cloud's Approach to AI

Zetta Cloud has been developing end-to-end solutions for understanding unstructured content for over a decade, and was using Transformers even before the advent of large language models (LLMs).

Expert machine learning systems are often more effective at specific tasks than generalist LLMs, which has led Zetta Cloud to focus on small, robust AI engines that require less hardware and offer greater flexibility.

Focus on Privacy and Data Processing

Zetta Cloud works with organizations that prioritize privacy, particularly in the public sector. One recent project involved multimodal data processing that integrates various data sources into a unified system.

Blockchain technology is integrated to protect the integrity of processed information, making it verifiable and immutable. This is crucial in legal contexts where data may be scrutinized.
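The integrity property described above can be illustrated with a hash chain, the basic tamper-evidence mechanism that blockchains build on. This is a minimal sketch under stated assumptions, not Zetta Cloud's implementation: the record fields and the function names (`record_hash`, `build_chain`, `verify_chain`) are hypothetical, and a real blockchain adds distribution and consensus on top.

```python
import hashlib
import json


def record_hash(payload: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the record before it."""
    body = json.dumps(payload, sort_keys=True) + prev_hash
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    """Link processed records so each entry commits to all earlier ones."""
    chain, prev = [], "0" * 64  # placeholder hash for the first entry
    for payload in payloads:
        digest = record_hash(payload, prev)
        chain.append({"payload": payload, "prev": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: list) -> bool:
    """Recompute every hash; an edited record no longer matches its hash."""
    prev = "0" * 64
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if record_hash(entry["payload"], prev) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

Because every hash depends on the previous one, changing any processed record after the fact invalidates verification from that point on, which is what makes the stored results auditable.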

Data Fusion and Process Automation

Projects often involve data triage, which entails structuring diverse data formats and languages to enhance accessibility for users.

In addition to data fusion, Zetta Cloud also works on process automation using robotic process automation tools, with an emphasis on making sense of complex datasets.

Tailored Solutions Over Large Models

Instead of relying on large models that attempt to handle everything, Zetta Cloud opts for smaller models designed for specific tasks. This approach makes quality control and adaptation easier.

Navigating Client Expectations in an AI-Hyped Market

Engaging with clients who have high expectations due to current AI hype can be challenging; educating them about realistic capabilities is essential.

The sales cycle typically spans around a year, because the vendor must first counter the hype before demonstrating the technology's effectiveness through proofs of concept (PoCs).

Understanding Client Readiness and Needs

Clients vary significantly in their familiarity with AI; some advanced public sector clients possess clear requirements while others have unrealistic expectations shaped by mainstream narratives like ChatGPT.

Understanding AI Project Implementation

Challenges in Implementing AI Projects

Smaller projects often consume more time than expected due to varying organizational capabilities in adopting and purchasing AI technologies.

Keeping up with rapid developments in AI, such as new foundation models, is challenging; practitioners develop a gut feeling for what works through experimentation.

The abundance of information can lead to confusion; many new frameworks may not be suitable for specific customer needs after extensive research.

Organizations tend to follow their established technology stacks, evaluating whether new solutions are worth the investment based on customer satisfaction and ROI.

Major efforts to retrain existing systems on new frameworks are only pursued when there is a clear need or dissatisfaction from clients.

Perception of Translation in the AI Landscape

There is an ongoing project focused on machine translation that leverages past experiences with computer-assisted translation and terminology management.

Clients are increasingly excited about large language models (LLMs) as translation engines, leading to a simplified view of translation needs across different content types.

The desire for LLMs as a one-size-fits-all solution poses risks; organizations may overlook the necessity for tailored approaches in translation strategies.

The Double-edged Sword of AI Hype

While LLM hype raises awareness about potential applications, it also leads to unrealistic expectations regarding speed and quality of problem-solving capabilities.

Many organizations mistakenly believe they can quickly resolve issues using LLM technology without understanding its limitations; this misconception can harm the industry’s reputation.

Limitations of Large Language Models

The perception that LLMs can serve as silver bullets for various NLP tasks oversimplifies their capabilities; they excel at generating text but struggle with precise data extraction compared to traditional NLP tools.

Understanding Custom AI Solutions

The Need for Customization in AI

Many existing AI solutions do not meet user expectations, leading to a demand for tailored data extraction and transformation tasks that are industry-proven and high-quality.

Specific applications like sentiment analysis are crucial in customer support, enabling automatic classification of user feedback and reviews.

Key NLP Tasks

Named entity recognition is likened to a person reading documents to extract keywords; despite being seen as outdated, it remains vital in current NLP applications.

Modularized deployments allow organizations to replace or adapt components easily, contrasting with the complexity of replacing entire large language models (LLMs).
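A modular deployment of this kind can be sketched as a registry of independent engines behind a common interface. The `Pipeline` class and the two toy engines below are hypothetical stand-ins for illustration only; real sentiment and entity engines would be trained models.

```python
from typing import Callable, Dict, List

# Assumption: every engine is a function from text to some result.
Engine = Callable[[str], object]


def keyword_sentiment(text: str) -> str:
    """Toy sentiment engine based on a tiny hand-written word list."""
    positive = {"great", "good", "love"}
    negative = {"bad", "broken", "slow"}
    words = set(text.lower().split())
    score = len(words & positive) - len(words & negative)
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"


def capitalized_entities(text: str) -> List[str]:
    """Toy NER stand-in: capitalized tokens after the first word."""
    tokens = text.split()
    return [t.strip(".,") for t in tokens[1:] if t[:1].isupper()]


class Pipeline:
    """Holds named engines so each one can be replaced on its own."""

    def __init__(self) -> None:
        self.engines: Dict[str, Engine] = {}

    def register(self, name: str, engine: Engine) -> None:
        self.engines[name] = engine  # re-registering a name swaps the engine

    def run(self, text: str) -> Dict[str, object]:
        return {name: engine(text) for name, engine in self.engines.items()}


pipeline = Pipeline()
pipeline.register("sentiment", keyword_sentiment)
pipeline.register("entities", capitalized_entities)
result = pipeline.run("The new Zetta dashboard is great")
```

Swapping the sentiment engine is a single `register` call and leaves the entity engine untouched, which is the contrast with replacing one monolithic model.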

Product vs. Solution Building

The focus is primarily on product development rather than customized solutions due to the long-term maintenance challenges associated with bespoke systems.

Custom engagements can be complicated to maintain over time if they aren't integrated into a cohesive product stack.

Introduction of AI Factory

The "AI Factory" is introduced as a suite of natural language processing engines capable of various tasks such as speech-to-text and machine translation, all customizable without coding.

This platform operates fully on-premise, supporting 100 languages and allowing users control over their data-driven engines.

Challenges in No-Code Customization

While no-code platforms are beneficial, customization remains complex; the company is gradually expanding no-code capabilities across different analytics engines.

Customizing an AI engine involves extensive data collection, annotation, experimentation, and production deployment, and each of these steps is a project in its own right.

Empowering Customers for Self-Customization

The platform simplifies model building using existing training data through automated experimentation processes that optimize parameters efficiently.
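Automated experimentation of this kind can be sketched as a search over parameter combinations, scored on held-out data. This is an illustrative sketch and not the platform's actual procedure: the toy word-vote classifier, the parameter names `min_count` and `lowercase`, and the sample data are all assumptions.

```python
from collections import Counter, defaultdict
from itertools import product


def train(examples, min_count, lowercase):
    """Toy trainer: remember which label each word co-occurs with most."""
    counts = defaultdict(Counter)
    for text, label in examples:
        for word in (text.lower() if lowercase else text).split():
            counts[word][label] += 1
    # Keep only words seen at least min_count times.
    vocab = {w: c.most_common(1)[0][0]
             for w, c in counts.items() if sum(c.values()) >= min_count}
    default = Counter(label for _, label in examples).most_common(1)[0][0]

    def predict(text):
        words = (text.lower() if lowercase else text).split()
        votes = Counter(vocab[w] for w in words if w in vocab)
        return votes.most_common(1)[0][0] if votes else default

    return predict


def accuracy(predict, examples):
    return sum(predict(text) == label for text, label in examples) / len(examples)


def run_experiments(train_set, dev_set, grid):
    """Try every parameter combination; keep the best on held-out data."""
    best_params, best_score = None, -1.0
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = accuracy(train(train_set, **params), dev_set)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


train_set = [("Invoice total due", "invoice"), ("invoice number attached", "invoice"),
             ("Contract signed today", "contract"), ("contract renewal terms", "contract")]
dev_set = [("invoice overdue", "invoice"), ("CONTRACT draft", "contract")]
grid = {"min_count": [1, 2], "lowercase": [True, False]}
best_params, best_score = run_experiments(train_set, dev_set, grid)
```

The user supplies only labeled examples; the loop picks the settings, so no knowledge of the individual parameters is needed.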

Creating User-Friendly AI Solutions

Simplifying AI for Non-Experts

The company aims to develop AI tools that are easy to use for customers lacking in-house resources or AI expertise, enabling business users to create their own AI engines.

Users can build these engines with minimal effort, utilizing existing data from past processes without needing advanced technical skills.

Balancing Complexity and Usability

The focus is on avoiding the complexity of large machine learning platforms, which could overwhelm users; thus, customization options are limited but user-friendly.

The design allows for quick deployment of AI engines into production without requiring coding knowledge or understanding hyperparameters.

Understanding AI Limitations

Customers often struggle with the concept that not all prediction errors are technical issues; distinguishing between software malfunctions and inherent limitations of AI is crucial.

An example illustrates how an AI classifier may fail to recognize new document classes, emphasizing that such outcomes are expected rather than errors.

Achieving High Accuracy with User Control

According to the speaker, users can reach 95-97% accuracy with the tool, which surpasses human performance on specific NLP tasks.

Similarities drawn with ChatGPT highlight how users adapt by changing prompts when faced with inaccuracies instead of seeking refunds.

Challenges in LLM Performance

Issues with Generative Models

A humorous anecdote about a generative model's inability to correct itself after making a mistake underscores the challenges of relying on such technologies for critical tasks.

Deterministic vs. Generative Approaches

Unlike generative models that may produce incorrect outputs under pressure, the discussed system ensures it only provides results when confidence scores meet a certain threshold.

This approach allows users to control data extraction processes effectively and maintain consistency across repeated tasks.
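The confidence-threshold behaviour described above can be sketched as follows. The function name `extract_with_threshold`, the score format, and the default threshold of 0.9 are assumptions for illustration, not details from the discussion.

```python
from typing import Dict, Optional


def extract_with_threshold(scores: Dict[str, float],
                           threshold: float = 0.9) -> Optional[str]:
    """Return the top-scoring label only if its confidence clears the threshold.

    Returning None means "no answer", so a low-confidence guess is never
    passed downstream as if it were a result.
    """
    if not scores:
        return None
    label, confidence = max(scores.items(), key=lambda item: item[1])
    return label if confidence >= threshold else None


confident = extract_with_threshold({"invoice": 0.97, "contract": 0.03})
uncertain = extract_with_threshold({"invoice": 0.55, "contract": 0.45})
```

The same input and threshold always yield the same output, and a document from a class the engine has never seen tends to fall below the threshold rather than being forced into a known class.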

Future Directions in AI Development

There is a growing trend towards making generative models more deterministic through retrieval-augmented generation (RAG) techniques; however, skepticism remains about their long-term viability.

Multimodal AI and Its Challenges

Expectations and Limitations of Multimodal Models

Discussion on the potential threshold reached by current models, particularly regarding future iterations like GPT-5. The speaker expresses high expectations for multimodal capabilities.

The speaker claims that ChatGPT uses Google Translate for translations and the open-source Tesseract engine for optical character recognition (OCR), which illustrates how such products integrate various existing technologies.

Observations on advancements in OCR technology, which now combines generative components to reconstruct missing words or improve handwriting quality.

Acknowledgment that while significant efforts are being made to customize large language models (LLMs) for specific industries, such as finance, the progress has plateaued due to ongoing training needs and data requirements.

Buyer Behavior Across Regions

Inquiry into differences in buyer behavior across regions like the US, Europe, and Asia; no significant differences noted but high expectations from technology observed globally.

Distinction between "AI educated customers" who understand how to set realistic expectations versus those new to AI who may have inflated hopes without proper structure or examples.

Challenges in AI Procurement

Emphasis on the necessity for organizations to know how to effectively procure AI solutions rather than just focusing on development; highlights a gap in understanding among potential buyers.

Concerns about unrealistic budgets and high expectations leading to difficulties in selling effective solutions; stresses the importance of informed purchasing decisions.

The Importance of Data Security

Cybersecurity as a Key Selling Point

Discussion on security concerns among clients, with many companies hesitant to move their data to cloud-based systems due to fears of exposure and hacking threats.

Cybersecurity is identified as a primary focus area within client discussions; there is a growing market for cybersecurity solutions amid increasing incidents of data breaches.

On-Premise Solutions as Differentiators

The company’s strategy emphasizes building software with strong security measures tailored for on-premise deployments or private clouds, catering specifically to organizations wary of cloud vulnerabilities.

National AI Strategy Involvement

Role in Romania's National AI Strategy

Understanding AI Regulation and Strategy in Europe

Insights from Collaboration with Local Universities

The speaker had the opportunity to collaborate with the local Technical University in Cluj-Napoca to draft components of the National AI strategy, focusing on regulation.

This collaboration provided insights into how major companies use AI and into their terms and conditions, highlighting the importance of understanding industry practices.

The Importance of Regulation in AI Development

The speaker reviewed multiple National AI strategies, noting that while some countries are pragmatic about funding and benefiting society through AI, others focus solely on regulation without fostering an ecosystem.

Acknowledging that a purely commercial go-to-market strategy may not suffice, the speaker emphasized the need for partnerships with educational institutions to enhance research efforts.

Building Partnerships for Research Funding

The company has signed three contracts focused on pure AI research at both national and European levels, indicating a strategic shift towards collaboration.

As an AI vendor, it is crucial to be embedded within government initiatives since regulations can significantly impact operations in Europe.

Navigating Funding Opportunities in Europe

Unlike the US, where venture capital is more accessible based on ideas alone, Europe tends to be risk-averse; thus, government support through grants is essential for launching risky business ideas.

There are numerous EU grants available for AI research at both national and local levels; being connected to these opportunities is vital for success.

Strategies for Success as an AI Vendor in Europe

Understanding regulatory frameworks and funding avenues is critical for European AI vendors aiming to thrive in this landscape.

Engaging with industry trends and potential customer needs will help businesses align their offerings effectively within the European market.

Conclusion