# 271 How aiOla Turns Natural, Multilingual Speech into Workflow-Ready Data
Exploring AI and Speech Technology
Introduction to the Podcast
- The host expresses a desire to help less technologically advanced supermarket chains leverage their data for better decision-making and ROI.
- Amir Haramaty, co-founder and president of aiOla, is introduced as the guest. aiOla specializes in deep-tech voice, speech, and conversational AI.
- The discussion will focus on challenges faced by standard speech technology in real business environments across various industries.
Amir's Background and Company Overview
- Amir describes himself as a "serial problem solver" rather than a "serial entrepreneur," emphasizing his focus on solving real problems with effective teams and technology.
- He has extensive experience in AI, having worked with Fortune 1000 companies through previous ventures that served as an AI platform for management consulting firms.
Insights on AI Challenges
- Amir shares insights from his experiences: despite advancements in AI tools, many enterprises struggle due to unstructured data that remains uncaptured.
- He emphasizes that the current challenge is not just about AI but fundamentally about data quality and accessibility; most data is still not utilized effectively.
Addressing Data Entry Issues
- Data entry is identified as a significant pain point; people dislike it, leading to poor quality and quantity of captured data.
- Speech is proposed as a solution since it allows for faster communication—capturing three times more data in one-third of the time compared to traditional methods.
Overcoming Real-world Challenges
- The conversation highlights the complexities of implementing speech technology in diverse environments (e.g., production floors or busy airports), where language variations and acoustic challenges exist.
- Amir outlines three main challenges: achieving high accuracy in varied settings, separating relevant information from noise, and transforming unstructured speech into structured data.
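The third challenge, turning free-form speech into structured data, can be illustrated with a minimal sketch. It assumes a transcript has already been produced by some speech recognizer. The `InspectionRecord` schema, its field names, and the regular expressions are hypothetical and only show the idea; they are not a description of aiOla's actual method.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class InspectionRecord:
    """Hypothetical target schema for one spoken inspection report."""
    station: Optional[str] = None
    temperature_c: Optional[float] = None
    status: Optional[str] = None


# Hypothetical patterns: each one pulls a single schema field out of the transcript.
FIELD_PATTERNS = {
    "station": re.compile(r"\bstation\s+(\w+)", re.IGNORECASE),
    "temperature_c": re.compile(
        r"\btemperature\s+(?:is\s+)?(-?\d+(?:\.\d+)?)", re.IGNORECASE
    ),
    "status": re.compile(r"\b(pass|fail)\b", re.IGNORECASE),
}


def transcript_to_record(transcript: str) -> InspectionRecord:
    """Keep only the words that map to a schema field; ignore everything else."""
    record = InspectionRecord()
    station = FIELD_PATTERNS["station"].search(transcript)
    if station:
        record.station = station.group(1)
    temperature = FIELD_PATTERNS["temperature_c"].search(transcript)
    if temperature:
        record.temperature_c = float(temperature.group(1))
    status = FIELD_PATTERNS["status"].search(transcript)
    if status:
        record.status = status.group(1).lower()
    return record


if __name__ == "__main__":
    spoken = "uh okay station 4 temperature is 3.5 degrees looks good pass"
    print(transcript_to_record(spoken))
```

Filler words such as "uh okay" and "looks good" are simply dropped because no pattern claims them, which is a crude stand-in for separating relevant information from noise.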
aiOla's Vision and Development
Genesis of aiOla
- The vision for aiOla revolves around transforming speech into data, schema, and workflows.
- The founder sought validation from a leading expert, Professor Joseph Keshet, known for his work with Amazon's Alexa and Apple's Siri.
- After those discussions, Professor Keshet expressed interest in joining as chief scientist, marking the inception of aiOla.
Approach to Language Models
- Unlike general-purpose large language models (LLMs), aiOla focuses on building small, workflow-specific language models tailored to particular processes.
- This approach avoids the inefficiency of "boiling the ocean" by directly addressing specific use cases without unnecessary complexity.
Real-world Application Example
- A case study involves a chicken nugget manufacturer needing to comply with strict procedures in Thai—a challenging tonal language.
- aiOla aims to develop a model that can accurately process Thai keywords without retraining existing data, achieving near-100% accuracy in specific settings.
Efficiency Gains
- The implementation of this model reduced a two-hour process to just 34 minutes, significantly improving efficiency in production environments.
- Additionally, it generates structured data from previously unstructured sources, enabling deeper insights and trend identification beyond human capabilities.
Challenges in AI Implementation
AI Pilot Failures
- A report from MIT highlighted that 95% of AI pilots fail to demonstrate value or ROI due to various challenges within enterprises.
Importance of ROI Demonstration
- There is an ongoing discussion about the AI bubble; however, distinguishing between effective solutions and pretenders is crucial for future success.
Case Study: Sales Organization Efficiency
- An example involving a Fortune 50 sales organization showed how data entry time was reduced from 7.5 minutes to 1.5 minutes per entry using aiOla’s technology.
- This improvement not only enhanced data quality but also freed up significant time for salespeople—allowing them more opportunities within their existing schedules.
Adoption of Speech Recognition Technology in Diverse Industries
The Role of Automatic Speech Recognition and Natural Language Understanding
- The integration of automatic speech recognition with natural language understanding is crucial for generating measurable value in various industries.
- Rapid adoption of these technologies is evident, particularly in frontline environments that are often overlooked by tech developers.
Case Studies Highlighting Impact
- There are approximately two billion frontline workers globally, indicating a vast market potential for technology applications.
- In a jewelry manufacturing context, visual inspection time was reduced from 175 seconds to just 30 seconds through speech technology, significantly increasing productivity.
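The time reductions quoted in this summary (a two-hour procedure cut to 34 minutes, data entry cut from 7.5 to 1.5 minutes, visual inspection cut from 175 to 30 seconds) are easier to compare as percentages and speedup factors. The sketch below is plain arithmetic on those quoted figures. The helper name `time_reduction` is an illustrative assumption.

```python
def time_reduction(before: float, after: float) -> tuple[float, float, float]:
    """Return (time saved, percent reduction, speedup factor)."""
    saved = before - after
    percent = saved / before * 100
    speedup = before / after
    return saved, percent, speedup


# Figures as quoted in the episode summary; units differ per case.
cases = {
    "Production procedure (minutes)": (120, 34),
    "Sales data entry (minutes)": (7.5, 1.5),
    "Visual inspection (seconds)": (175, 30),
}

for name, (before, after) in cases.items():
    saved, percent, speedup = time_reduction(before, after)
    print(f"{name}: saved {saved:g}, {percent:.1f}% less time, {speedup:.2f}x faster")
```

On these numbers, the three cases work out to roughly 72%, 80%, and 83% less time respectively.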
Human Factor and Adoption Challenges
- Successful implementation relies not only on technology but also on user adoption and the human element involved in the process.
- A quote from an airline baggage handler illustrates the transformative impact of this technology: "This is changing my life," highlighting personal benefits alongside efficiency gains.
Addressing Jargon in Specialized Environments
- Handling industry-specific jargon is a unique selling point (USP), as demonstrated by examples from both luggage handling and automotive sectors.
- Each industry typically has a limited vocabulary of relevant jargon (2,000 to 30,000 words), which machine learning models can learn and handle effectively.
Customizing Language Models for Specific Needs
- A case study involving a European airport revealed challenges due to users speaking "Turkish German," necessitating tailored language models that accommodate non-standard dialects.
- The approach focuses on identifying key jargon rather than adhering strictly to linguistic norms, allowing for effective communication even within niche contexts.
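The idea of spotting key jargon rather than parsing every word can be sketched with approximate string matching. This is only an illustration on text tokens, using the standard-library `difflib.get_close_matches`. The vocabulary, the `spot_jargon` helper, and the 0.8 similarity cutoff are assumptions, and a production system would more likely work on audio or phonetic features than on spelled-out words.

```python
import re
from difflib import get_close_matches

# Hypothetical ground-handling vocabulary; a real list would hold thousands of terms.
JARGON = ["pushback", "towbar", "chock", "beltloader"]


def spot_jargon(transcript: str, vocabulary=JARGON, cutoff: float = 0.8) -> list[str]:
    """Return known jargon terms found in the transcript, tolerating small variations."""
    found = []
    for token in re.findall(r"\w+", transcript.lower()):
        # Keep the single best vocabulary match at or above the similarity cutoff.
        match = get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        if match and match[0] not in found:
            found.append(match[0])
    return found


if __name__ == "__main__":
    print(spot_jargon("Towbarr damaged at pushback"))
```

Words that resemble no vocabulary entry are ignored, so the output depends only on the small domain list and not on how grammatical the rest of the utterance is.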
Scalability and Future Applications
- Identifying impactful use cases with minimal obstacles is essential for scaling solutions across different industries and locations.
- Safety remains paramount in aviation; thus, transforming every employee into a safety advocate through effective communication tools could enhance overall operational safety.
Incident Reporting and AI Solutions
The Challenge of Language Models
- Incident reporting and resolution times have significantly improved, showcasing a practical use case for AI in business operations.
- Major language model providers offer basic APIs but often fall short when it comes to industry-specific jargon and keywords, limiting their effectiveness.
- Tailoring AI solutions to specific workflows is essential; without automation, scalability becomes an issue.
Return on Investment (ROI)
- The focus is not solely on selling to CIOs; CFOs and CEOs are key stakeholders due to the potential for increased efficiency with fewer resources.
- A successful project with a Canadian grocery chain demonstrated significant time savings—110,000 hours saved across 600 stores—leading to a 5x ROI.
Real-world Applications
- The CEO of a $25 billion company reported that prior board meetings focused on failed AI initiatives until they implemented effective temperature monitoring for perishable goods.
- Proactive alerts from the system prevent spoilage, illustrating how leveraging data can enhance decision-making in low-margin industries.
Sales Strategies in Enterprise Markets
Navigating Enterprise Sales Cycles
- The enterprise sales cycle can be lengthy; having clear ROI metrics aids in discussions with potential clients.
- Transitioning from direct clients to channel clients requires understanding client needs and demonstrating the art of the possible through whiteboard sessions.
Collaborations and Partnerships
- Partnering with professional services firms helps bridge domain expertise gaps while scaling solutions effectively.
- A notable partnership with UST aims to replicate successful implementations at scale by integrating existing engagements into new projects.
Industry Recognition
- Collaboration with major tech companies like Salesforce and Nvidia enhances credibility; Nvidia's GTC keynote highlighted the transformative impact of spoken data technology.
Collaboration and Innovation in AI
Partnership with Accenture
- The speaker discusses a collaboration agreement with Accenture, enabling the scaling of their automatic speech recognition and natural language understanding capabilities.
- Emphasizes that many players are interested in leveraging these technologies for data entry into their AI platforms.
Understanding UST's Role
- The speaker explains the role of UST (a consulting firm), highlighting its position as a partner rather than just an internal product developer.
- Notes that UST is significantly smaller than other firms, with only 30,000 employees, but is agile and innovative in digital transformation.
Strategic Insights on AI Adoption
- Highlights the importance of early adoption in technology, referencing Nvidia's success as an example of identifying opportunities early.
- Discusses how collaboration with UST provides mutual benefits: differentiation for UST and scalability for their own company.
Addressing Data Challenges
- The partnership allows both companies to tackle fundamental data challenges in AI projects effectively.
- Joint clients benefit from enhanced value through this collaboration, aligning strategies for digital transformation.
Multilingual Capabilities and Data Processing
Language Translation Relevance
- The conversation shifts to multilingual capabilities, questioning whether translation fits into their roadmap.
- Clarifies that they do not claim to understand 100 languages but focus on processes within those languages.
Case Study: United Airlines
- Shares an example involving United Airlines, which recognizes data as its most significant asset rather than its fleet size.
- Describes how 42,000 employees use an app that translates relevant safety and compliance information across multiple languages accurately.
Focused Approach to Automation
- Stresses that their service is not about raw translation but about process-specific automation tailored for workflows.
- Concludes by emphasizing the precision of their approach over general translation services.
Hiring AI Talent in a Challenging Environment
The Difficulty of Hiring AI Engineers
- Hiring AI talent is increasingly challenging, with a noted shortage of skilled engineers, particularly in DevOps roles.
- There is a need to balance research and practical solutions; many candidates from academia are eager to dive deeper into their fields.
Achievements and Community Building
- Despite being a small team, the organization has consistently won competitions, showcasing unique technology and strong performance metrics.
- A community-driven approach has been effective for recruitment, leveraging personal networks to attract talent who might otherwise choose larger companies.
Cultural Impact on Recruitment
- Emphasizing culture over strategy is crucial; creating an impactful mission helps attract individuals who want to contribute to something significant.
- Real-world examples (like airport baggage handling) resonate well with potential hires, illustrating the practical applications of their work.
Future Roadmap and Product Development
Balancing Research with Productization
- The challenge lies in transitioning from deep research to scalable products that can be effectively utilized by clients across various sectors.
Exploring New Growth Strategies
- The company aims to validate its technology through client feedback while exploring product-led growth (PLG), allowing developers easy access to tools without extensive onboarding.
Innovations in Coding and Communication
- Future innovations may include voice coding capabilities, enhancing user experience by enabling coding through speech recognition.
Potential of AI Agents
- Industry leaders predict that AI agents represent a multi-trillion-dollar opportunity; however, agents still need to be prompted effectively, and prompting remains rooted in natural forms of communication such as speech.
Data Generation and Utilization
- High-quality data generation will enhance the capabilities of AI agents. Collaborations with platforms like Nvidia aim at developing voice-driven workflows that leverage this data for improved agent performance.