Why humans are AI's biggest bottleneck (and what's coming in 2026) | Alexander Embiricos (OpenAI)
Codex: The Future of Coding Agents
Introduction to Codex
- Codex is described as OpenAI's coding agent, likened to a highly intelligent intern that requires specific prompts to function effectively.
- A notable example shared involves a developer who faced persistent bugs; Codex resolved these issues after running for an hour.
Advancements in Codex Functionality
- There are emerging capabilities where Codex can manage its own training runs and write code for key infrastructure.
- An impressive achievement highlighted is the rapid development of the Sora Android app, completed in just 18 days, with public release occurring 10 days later.
Goals and Proactivity of Codex
- A primary objective for Codex is achieving proactivity, enabling it to perform tasks autonomously rather than merely responding to commands.
- The discussion emphasizes that models like Codex are more effective when they can utilize computers through coding.
Community Feedback and Product Development
- The team actively monitors community feedback on platforms like Reddit, aiming to enhance user experience by accelerating productivity rather than complicating processes.
Insights on AGI and Team Dynamics
- The conversation touches on the current limitations posed by human typing speed and multitasking as barriers to advancing towards Artificial General Intelligence (AGI).
- Alexander Emiros, product lead for Codex, shares insights into building products at OpenAI and highlights the success of the Sora app in becoming a top application within a month.
Growth and Future Focus Areas
- Discussion includes the significant growth rate of Codex (20x), focusing not only on writing code but also improving code review processes.
Conclusion and Additional Resources
- Listeners are encouraged to subscribe for further insights into AI developments and tools available through various partnerships mentioned during the podcast.
OpenAI Insights with Alexander
Transition to OpenAI
- Alexander expresses gratitude for being on the podcast and shares his excitement about discussing his experiences at OpenAI.
- He reflects on his previous roles, including a startup founder and product manager at Dropbox, highlighting the differences in operational speed and ambition at OpenAI.
Speed and Ambition at OpenAI
- Alexander notes that the pace of work at OpenAI is significantly faster than what he experienced in startups, challenging his previous perceptions of speed in tech development.
- He cites the rapid growth of Codeex as an example of this accelerated pace, emphasizing how quickly they scaled their technology.
Organizational Structure and Learning
- The high impact required at OpenAI necessitates a more ruthless approach to time management compared to traditional startups where pivoting is common.
- Alexander discusses how the organizational structure promotes quick learning through empirical methods rather than rigid planning, allowing teams to adapt rapidly.
Bottom-Up Approach
- He describes OpenAI's truly bottoms-up approach, contrasting it with other companies that claim similar structures but may not implement them effectively.
- This method encourages experimentation due to uncertainty about future capabilities and user needs, fostering a culture of rapid iteration.
Embracing Ambiguity
- Alexander highlights the importance of humility in navigating unknown outcomes within AI development, advocating for quick trials over perfectionism.
- He likens their strategy to "ready, fire, aim," suggesting that while there is some aiming involved, it remains vague due to unpredictable user interactions with new technologies.
AI Development and the Future of Code
Aiming for the Future in AI
- The discussion emphasizes the importance of long-term thinking in AI development, particularly regarding alignment issues that require foresight.
- There is a contrast between strategic vision and tactical execution; while future goals are somewhat vague, product development relies on empirical methods.
The Role of Talent in Innovation
- The speaker highlights that successful bottom-up approaches depend heavily on hiring top talent, which is crucial for executing innovative ideas effectively.
- Observations about OpenAI's culture reveal a high level of individual drive and autonomy among team members, which contributes to its success.
Introduction to Codex
- Codex is introduced as an open coding agent designed to assist developers by integrating with IDEs like VS Code or functioning as a terminal tool. It helps answer questions about code, write code, run tests, and execute tasks within software development cycles.
- The vision for Codex extends beyond mere coding assistance; it aims to act as a proactive teammate throughout all phases of software development—from ideation to deployment and maintenance.
Proactivity in AI Assistance
- Current limitations are noted where Codex functions more like an intern who requires guidance rather than autonomously contributing across various tasks without prompts. This reflects the need for better contextual understanding in AI tools.
- The goal is to develop Codex into a proactive entity that can identify opportunities for assistance without needing constant user input, thereby enhancing productivity significantly compared to current usage patterns where users prompt models infrequently.
Vision Beyond Traditional Tools
- Unlike existing tools such as Cursor or Cloud Code that primarily offer autocomplete features, the ambition for Codex is to create an intelligent teammate capable of engaging deeply with coding processes and collaborating effectively with human developers.
Codex: Enhancing Developer Productivity
The Role of AI in Development
- Codex aims to empower developers by providing tools that enhance productivity, allowing them to work faster without needing to constantly think about invoking AI.
- The goal is for Codex to integrate seamlessly into a developer's workflow, enabling it to perform tasks automatically.
Growth and Adoption of Codex
- Since the launch of GPT-5 in August, Codex has experienced explosive growth, with usage increasing over 20 times.
- Currently, Codex serves trillions of tokens weekly and is recognized as the most utilized coding model within OpenAI's API.
Team Structure and Product Development
- The integration of product and research teams allows for rapid iteration on models and harnesses, facilitating more effective experimentation.
- Major API customers are beginning to adopt Codex models due to their effectiveness in solving complex coding issues.
Challenges with Initial Versions
- Early versions like Codex Cloud faced challenges related to environment configuration and user interaction; they required users to set up complex systems before getting value from the tool.
- A key insight was recognizing that initial interactions should be intuitive and straightforward for users.
Feedback Loop and User Interaction
- Users typically discover Codex through IDE extensions or command-line interfaces (CLI), which allow interactive use within a sandbox environment.
- This sandbox setup enables safe execution of commands while providing feedback loops that help improve user experience over time.
Integration of AI in Development
The Role of Dogfooding at OpenAI
- Engineers at OpenAI integrate AI into their development processes, facilitating a smoother transition to new technologies.
- OpenAI extensively uses its own products (dogfooding), which provides unique insights that differ from general market feedback.
- The internal use of AI tools allows for a better understanding of how different audiences interact with the product.
Advancements in Codex and Model Improvements
- Questions arise about what factors have contributed to Codex's acceleration, including data quality and model advancements.
- Recent updates include the release of GPD 5.11 CEX Max, which is approximately 30% faster than previous models and enhances reasoning capabilities.
- Codex Max is designed to tackle complex coding challenges effectively, as highlighted by community engagement on bug reporting.
Understanding the Agent Framework
- The focus has shifted from merely training superior models to conceptualizing an "agent" that encompasses various functionalities.
- An effective agent consists of a smart reasoning model, an API for interaction, and a harness for operational execution.
Continuous Operation and Compaction Feature
- Users report extended operation times for models like GP5.1 CX Max, running continuously for hours or even days.
- The compaction feature enables models to manage context windows efficiently by preparing for transitions between contexts.
Optimizing Coding Tools and Safety Measures
- Different coding products utilize varied tool harnesses; optimizing one specific approach can lead to faster development cycles.
- Codex operates within a secure sandbox environment while using shell commands, enhancing safety during operations.
Future Competitiveness in AI Development
- The discussion raises questions about competition in AI development: will it be an ongoing race or can one entity dominate?
- Emphasizes building AI as a collaborative teammate capable of diverse tasks beyond just coding assistance.
AI Super Assistants: The Future of Technology?
The Need for AI Teammates
- The rapid deployment of new technologies makes it challenging for humans to keep up, necessitating the development of AI teammates or super assistants that can autonomously provide help.
- A successful AI product would allow users to interact with an assistant without needing to learn how to use it; the assistant should intuitively understand user needs.
Chat as an Effective Interface
- Chat interfaces are effective when users are unsure about what they need, similar to communicating with a teammate on platforms like Teams or Slack.
- For domain experts, such as coders, there should be deeper tools available while still maintaining a chat interface for general inquiries.
Building a Winning Product
- OpenAI aims to create a universally accessible tool (like ChatGPT), allowing users to become comfortable asking for assistance without needing extensive knowledge of its features.
- The original concept for ChatGPT was as a "super assistant," highlighting the dual approach between consumer and business applications.
Developing Functional Capabilities
- There is potential for AI assistants to handle various tasks beyond coding, such as scheduling meetings and managing communications across platforms like Slack.
- While advancements may occur sooner than expected, building an effective super assistant requires it to perform actionable tasks using computers.
Coding Agents and User Experience
- Models are more effective when they can utilize computers directly; writing code emerges as the best method for these models to operate effectively.
- Users may not realize they are interacting with a coding agent; their focus will be on task completion rather than technical details.
Composability in Code Writing
- Coding is seen as a core competency necessary for any intelligent agent, including ChatGPT. This capability allows agents to solve problems through code efficiently.
- Understanding how agents can leverage code composability will shape future developments in creating intuitive and powerful AI systems.
Agent Configuration and Team Dynamics
Understanding Agent Customization for Teams
- The effectiveness of an agent is influenced by the context in which it operates, including team guidelines and specific requirements for deterministic behavior.
- Different sub-teams may have unique prompts for analyzing crashes, highlighting the need for agents to be configurable to meet diverse user needs.
- A key goal is to enable agents to remember frequently performed tasks, reducing the need for repetitive script writing and enhancing efficiency across teams.
- The metaphor of an agent as a teammate emphasizes collaboration, allowing sharing of learned experiences among users within the same organization.
Perspectives on Coding Agents
- While coding agents are seen as valuable tools, there remains skepticism about their capabilities outside coding tasks; improvements are anticipated as they integrate more coding functionalities.
- Building products for software engineers can lead to innovative uses of technology due to their creativity and familiarity with coding processes.
Job Impact and Future Considerations
- The discussion touches on concerns regarding job displacement in engineering; however, the focus is on how intelligent agents can enhance human capabilities rather than replace them.
- As code becomes more ubiquitous, there will be increased demand for individuals skilled in utilizing these technologies effectively.
Enhancing User Experience with Agents
- Product teams should prioritize creating tools that empower users rather than complicate their workflows; this includes making interactions with coding agents enjoyable and productive.
- Writing code is often a rewarding aspect of software engineering; thus, minimizing less enjoyable tasks like reviewing AI-generated code is crucial for maintaining job satisfaction.
Features to Improve Code Review Processes
- Implementing features that build confidence in AI-written code can enhance user experience; prioritizing visual representations over raw code during reviews can facilitate better understanding.
Spec-Driven Development and Future of Coding
The Rise of Higher Abstraction Levels in Coding
- Discussion on the vision of moving beyond traditional coding, with a focus on spec-driven development where AI assists in writing code.
- Current trends show that coding agents are evolving towards prompt-to-patch methodologies, indicating a shift in how developers interact with code.
- Spec-driven development is seen as an interesting concept, though its practicality is questioned due to varying preferences for writing specifications among developers.
Chatter-Driven Development
- A humorous take on "chatter-driven development," where team communication and social media interactions drive project progress without formal specs.
- Emphasis on flexibility; some tasks may not require detailed specifications, allowing for more spontaneous problem-solving.
Hypothetical Future Scenarios
- Introduction of a hypothetical future where mobile apps facilitate idea generation through user interaction (swiping left or right).
- Concept likened to Tinder meets TikTok for coding ideas, suggesting a casual yet effective way to manage project ideas.
Proactive Engineering Through AI
- The envisioned agent would monitor market signals and user feedback to suggest features or fixes proactively.
- This approach aims to create a seamless integration between human input and AI suggestions, enhancing productivity.
Successful AI Products and Their Impact
- GitHub Copilot's early use of Codex highlights the success of AI tools in programming environments.
- Autocomplete features in IDEs are noted as one of the most successful applications of AI due to their ability to enhance developer efficiency without being overly intrusive.
Contextual Assistance Beyond Code
- The potential for contextual assistance from AI tools during web browsing is discussed, aiming to support users beyond just coding tasks.
- The idea that modern teammates deal with various aspects beyond code suggests a broader role for AI in assisting developers throughout their workflow.
Chatter Driven Development and Codex Impact
Understanding Chatter Driven Development
- The concept of chatter driven development is introduced, likening it to an internal agent called Goose used by Block, which proactively assists engineers by monitoring their activities.
- An engineer at Block utilizes Goose to observe meetings and automatically perform tasks such as sending emails or drafting Slack messages, showcasing early implementation of this idea.
Productivity Bottlenecks
- Discussion on potential productivity bottlenecks in using tools like Goose; the main concern is ensuring that actions taken are appropriate and beneficial.
- Codex integration with Slack allows users (including non-engineers) to quickly ask questions about bugs or metrics, highlighting its utility beyond just coding tasks.
Code Validation Challenges
- The current bottleneck in software development is identified as validating code functionality and conducting thorough code reviews rather than writing code itself.
- As building becomes easier, companies face challenges in determining what to build and managing extensive code review processes.
Role of Jira Product Discovery
- Introduction of Jira Product Discovery emphasizes that the hardest part of product development lies not in building but in managing stakeholders and planning effectively.
- Jira helps teams capture insights and prioritize ideas while tracking progress from strategy to delivery, aiming for improved efficiency.
Empowerment Through Codex
- The impact of Codex on product management roles is discussed; PMs feel more empowered due to enhanced capabilities provided by tools like Codex.
- The idea of "compressing the talent stack" suggests that boundaries between roles are less rigid now, allowing team members to take on more responsibilities efficiently.
Enhanced Prototyping Capabilities
- PM work has become easier with Codex; tasks such as answering questions or prototyping can be done faster than traditional methods.
- There’s a notable increase in throwaway code being generated for quick analyses or prototypes, demonstrating a shift towards more agile development practices.
Design Team Innovations
- Designers leverage tools like Codex for rapid prototyping; they create animations through intuitive interfaces instead of traditional programming methods.
- Collaboration between designers and PM roles has intensified, leading to innovative approaches within teams at OpenAI.
Acceleration in Software Development with Codex
Rapid Prototyping and Development Process
- The development process involves quick brainstorming sessions where designers create prototypes using "vibe coding," allowing for rapid iteration without extensive discussions.
- The integration of Codex CLI and Rust can be challenging, but designers may either implement their code or collaborate with engineers to finalize pull requests (PRs).
- The Sora Android app was developed in just 18 days, showcasing the high internal usage of Codex and its impact on accelerating project timelines.
Efficiency Gains from Using Codex
- Engineers utilized Codex to analyze existing iOS applications, generating actionable plans for Android implementation, which significantly reduced development time.
- The Sora app achieved a remarkable launch timeline: two weeks for employee testing and four weeks total to public release, demonstrating unprecedented speed in app development.
Notable Achievements
- The Sora app became the number one application in the App Store shortly after its release, highlighting the effectiveness of a small engineering team working under tight deadlines.
- Atlas is another significant project that benefited from Codex; it required complex systems to build a browser but saw substantial reductions in development time due to increased efficiency.
Transitioning Roles in Software Development
- With advancements like Codex, there’s potential for non-engineers (e.g., product managers or designers) to contribute more directly to software building processes as technical boundaries blur.
- As abstraction layers evolve, understanding specific details will still be necessary but may not require deep technical knowledge akin to assembly language proficiency.
Future Implications of Coding Abstraction
- There is an expectation that as coding tools become more user-friendly through natural language processing and other advancements, broader participation in software development will increase.
The Future of Coding and AI Integration
Flexibility in Natural Language and Gradual Transition
- Natural language is inherently flexible, allowing engineers to discuss various topics like plans, specifications, products, or ideas. This flexibility suggests a gradual shift towards higher abstraction levels in coding practices.
- The transition will not be abrupt; instead, it will involve setting up coding agents that excel at tasks such as previewing builds and running tests before fully automating the coding process.
- Initially, human oversight will be necessary to curate which systems or components the coding agent interacts with effectively.
Implications of Rapid Development
- The speed of building software raises questions about the importance of distribution and the value of ideas in a fast-paced environment.
- Despite rapid development capabilities, execution remains challenging. Ideas alone do not guarantee success; effective execution is crucial for coherent product development.
- As building becomes easier, other aspects like market entry and profitability gain significance. Understanding customer needs may become more critical than technical skills alone.
Customer-Centric Approach in AI Startups
- A deep understanding of specific customer problems is essential for new companies leveraging AI tools. Those who can identify underserved markets are likely to succeed.
- Vertical AI startups focusing on niche problems may outperform general solutions by providing tailored services that integrate seamlessly into existing workflows.
Measuring Progress with Codex
- Evaluating progress with tools like Codex involves monitoring user adoption metrics rather than solely focusing on feature depth. Retention rates are key indicators of success.
- Engaging directly with users through platforms like Reddit and Twitter helps gather feedback on product performance and user experience.
Importance of User Feedback
- Continuous monitoring of social media sentiments allows teams to address complaints effectively since diverse use cases can lead to varied experiences with the tool.
- Early retention statistics remain vital as they reflect how well users adapt to new technologies amidst an evolving landscape where usage patterns are still being established.
Discussion on Codex and Atlas
Insights on Reddit and User Feedback
- The speaker notes a shift in sentiment regarding Codex, observing that discussions on Reddit tend to be more negative yet grounded in reality.
- Twitter interactions are described as more personal, but Reddit's upvoting system provides clearer signals about user opinions and what matters to the community.
Experience with Atlas
- The speaker shares their experience with Atlas, expressing dissatisfaction with its AI-only search feature, preferring traditional search engines like Google for certain tasks.
- They reflect on the iterative process of product development at OpenAI, emphasizing the importance of shipping products quickly to gather user feedback.
Contextual Assistance Concept
- The speaker discusses their background in developing a contextual desktop assistant, highlighting the frustration users face when needing to provide context for assistance.
- They view Codex as a contextual assistant focused initially on coding tasks but believe it can evolve into broader applications by understanding user intent.
Browser Development Rationale
- The rationale behind building a web browser is explained: it allows for better contextual understanding without relying on external software limitations or slow methods like screenshots.
- A comparison is made between video game mechanics and contextual actions; knowing user intent enables more effective assistance.
Future Vision for AI Assistance
- The vision includes agents assisting users seamlessly throughout their day without overwhelming them with notifications, enhancing workflow instead.
- Users will have control over how they interact with AI tools through dedicated browsers while maintaining clear boundaries between different types of browsing experiences.
Broader Applications of Codex
- While primarily used for coding tasks currently, there are unexpected use cases emerging outside engineering fields where Codex is gaining traction.
Exploring the Capabilities of Codex
The Future of Coding with AI
- Discussion on the potential growth of tech-oriented ecosystems and the importance of focusing on coding tasks for team productivity.
- Inquiry about Codex's compatibility with various codebases, including SAP, and its effectiveness in different programming environments.
- Emphasis on using Codex for challenging coding problems rather than trivial tasks to evaluate its capabilities effectively.
- Suggestion to test Codex with real-world issues, such as debugging complex problems, to gauge its performance accurately.
Best Practices for Using Codex
- Advice to start with manageable yet challenging tasks when testing Codex, avoiding overly ambitious projects initially.
- Overview of supported programming languages by Codex, indicating it aligns well with commonly used languages unless dealing with niche or proprietary ones.
- Recommendation for new users to experiment with multiple approaches simultaneously while using Codex to build familiarity and trust in its capabilities.
Skills Development in an AI-Dominated Landscape
- Insight into how collaboration and communication skills remain vital even as AI tools become more prevalent in software development.
- Discussion on the evolving role of coders; emphasis on being proactive and productive while leveraging advanced coding agents like Codex.
- Exploration of whether learning traditional coding is still valuable amidst rising AI capabilities; highlights the need for foundational knowledge in computer science principles.
- Encouragement for aspiring developers to focus on practical experience rather than just academic assignments to enhance their employability.
The Importance of Systems Engineering
- Reflection on hiring practices that prioritize candidates' ability to utilize modern tools effectively, reducing barriers between junior and senior developers due to advancements in AI assistance.
- Assertion that understanding system architecture remains crucial despite improvements in AI coding agents; human oversight will continue to be necessary for creating robust software systems.
Discussion on Coding Agents and AGI Timeline
The Role of Human Oversight in Coding Agents
- An engineer working on the Atlas project utilized Codex to verify its own work, demonstrating the complexity involved in ensuring accuracy within coding agents.
- Despite advancements, human oversight remains crucial at various stages to configure coding agents effectively, emphasizing the need for reasoning over mere typing speed or algorithm knowledge.
- Understanding what makes a software engineering team effective is essential; this includes being able to reason about different systems rather than just technical skills.
Advancing Knowledge Frontiers with Coding Agents
- Engaging with cutting-edge knowledge areas can be beneficial as it often necessitates leveraging coding agents to enhance workflow efficiency.
- Codex plays a significant role in managing training runs and infrastructure, catching configuration mistakes that could hinder progress.
Automation and Self-Sufficiency of Coding Agents
- The concept of Codex being "on call" during its training implies an ability to alert humans or autonomously fix issues that arise during operations.
- Training runs require constant monitoring (referred to as "babysitting"), highlighting the importance of having systems that can evaluate performance metrics continuously.
Future Perspectives on AGI Development
- The discussion shifts towards predicting timelines for achieving Artificial General Intelligence (AGI), focusing on productivity acceleration curves as indicators of progress.
- Current limitations include human typing and multitasking speeds, which affect how quickly AI can validate its outputs and reduce bottlenecks in workflows.
Productivity Gains Through Agent Integration
- To unlock greater productivity, systems must evolve so that agents operate more independently without requiring constant human prompts or validations.
- Early adopters are expected to experience significant productivity boosts starting next year, while larger organizations may take longer due to complex existing systems needing gradual updates.
- The timeline suggests a gradual transition where increased agent self-sufficiency will lead toward reaching AGI capabilities over time.
AI Efficiency and Human Limitations
The Role of AI in Coding
- Discusses the distinction between improving coding efficiency through AI and ensuring the quality of the output, highlighting that human review is a critical limiting factor.
- Emphasizes the potential for unlocking more capabilities with current AI technology by learning to use it more effectively.
Team Expansion and Hiring Needs
- Mentions that the Codeex team is growing and expresses a need for engineers, salespeople, and product managers to join their efforts.
- Provides contact information for interested candidates, suggesting they visit a jobs page or reach out via social media.
Candidate Qualities and Expectations
- Outlines expectations for applicants: they should be technical individuals who actively engage with AI tools.
- Suggests that candidates reflect on what working at OpenAI on Codeex would entail over six months as a self-filtering mechanism.
Passion for Future Technologies
- Highlights the importance of having passionate individuals who have already contemplated future developments in AI agents, regardless of differing opinions on direction.
Lightning Round Questions
Book Recommendations
- Asks Alexander about book recommendations; he mentions enjoying science fiction, particularly "The Culture" by Ian Banks due to its optimistic portrayal of an AI-driven future.
Recent Media Enjoyment
- Shares his enjoyment of the anime "Jiu-Jitsu Kaisen," noting its positive protagonist compared to older anime characters who often faced darker themes.
Exploring Optimism and Technology
The Importance of Belief in Progress
- The speaker emphasizes the necessity of belief in achieving goals, suggesting that one must believe in possibilities to manifest them into reality.
Personal Interests and Discoveries
- The speaker shares a personal interest in combustion engines and cars, noting their initial aspiration to work on US aircraft before transitioning to software development.
Insights on Tesla's Software
- A recent experience with a Tesla is discussed, highlighting the inspiring nature of its software, particularly its self-driving feature which empowers users while maintaining human control.
- The speaker appreciates how Tesla allows drivers to adjust settings without disengaging the self-driving mode, showcasing a balance between automation and user agency.
Company Values: Kindness and Candor
- The speaker reflects on their startup's core value of being "kind and candid," explaining how this principle evolved from recognizing the need for honest communication among founders.
- They discuss the challenge of balancing kindness with candor, emphasizing that candidness should be framed as an act of kindness towards others.
Personal Connections and Heritage
- When asked about their last name, Emiricos, the speaker identifies more with a Greek poet than a shipping magnate due to familial ties to an island called Andros.
- They describe Andros as beautiful yet sparsely populated, sharing insights about their family's connection to both poetry and heritage.
Engaging with the Community
- The speaker invites listeners to engage with them through social media platforms like Twitter or r/codeex for feedback on coding agents.
- They encourage interested individuals to apply for positions related to coding agents at their company, emphasizing openness to community input for improvement.
Closing Thoughts
- The conversation concludes with appreciation for meeting people involved in AI development who embody optimism and kindness—qualities deemed essential for shaping future technologies.