AI in Pharmaceutical R&D with Kim Branson

Getting to Know Kim Branson: A Journey into AI and Machine Learning

Early Interests and Academic Background

  • Kim Branson shares that he had no initial interest in biology until university, where he discovered molecular biology and bacterial pathogenesis.
  • He describes his childhood as more focused on math and physics, and says he became fascinated by the structure of biological systems through X-ray crystallography.
  • The allure of understanding how things work drew him into the field, leading to a passion for structural biology.

Transition to Computational Drug Design

  • Branson mentions his shift towards computational drug design during his PhD, despite skepticism from peers about its effectiveness at the time.
  • He worked with notable figures in early computational drug design whose work contributed to the development of Relenza, a drug designed using computational methods.

Experience in Startups vs. Large Pharma

  • After working at Vertex Pharmaceuticals, he reflects on the differences between startups and large pharmaceutical companies regarding innovation pace and resource availability.
  • In startups, he enjoyed the freedom to experiment with large datasets without many constraints; however, larger companies offer more capital but come with bureaucratic challenges.

Joining GSK: A New Chapter

  • Branson discusses his initial resistance to joining GSK, which stemmed from preconceived notions about big pharma's stagnation, and explains that he was persuaded after meeting key individuals there.
  • He recognized GSK's commitment to transformation and innovation within their organization, which changed his perspective on working for a large company.

Insights on Organizational Dynamics

  • Branson emphasizes that large companies often struggle with reinvention but notes that GSK was genuinely attempting significant internal change.

Insights on Machine Learning in Drug Discovery

The Importance of Machine Learning and Data in Drug Development

  • The speaker highlights the significance of machine learning in drug discovery, emphasizing the need for advanced data analysis as large genetic databases and functional genomics become available.
  • They discuss the foresight of integrating machine learning into their strategy to handle an anticipated explosion of data from gene editing technologies like CRISPR.
  • Reflecting on their career path, they mention considering starting another company but ultimately decided to stay, noting that they have surpassed expectations by remaining with the organization for five years.

Leadership Challenges in Larger Organizations

  • The speaker addresses the challenge of leading larger organizations while fostering innovation and collaboration among diverse teams with varying backgrounds.
  • They invoke Amdahl's Law as an analogy, suggesting that just as computation must scale, so must communication within a company to ensure everyone is aligned with strategic goals.
  • Emphasizing effective communication, they note that explaining complex concepts takes time due to differing levels of experience and skepticism among team members.

Navigating Innovation Dilemmas

  • The discussion touches on the classic innovator's dilemma, where some team members are skeptical of new approaches while others are enthusiastic supporters.
  • The speaker stresses the importance of messaging and convincing stakeholders about new initiatives while also recognizing when to focus on building capabilities rather than just talking about them.

Implementation Phase of Technology

  • They describe a phase where they've built a robust internal capability before fully integrating it into existing processes, highlighting a strategic approach to technology adoption.
  • Communication remains crucial during this installation phase as geographical distribution adds complexity; messages take time to resonate across different teams.

Future Prospects: AI's Role in Drug Design

  • As AI becomes more prevalent, questions arise regarding its impact on drug design. The speaker notes that while some envision AI creating drugs without testing, this may still be decades away.
  • They clarify that their group works across various stages of drug development, focusing first on identifying appropriate targets for treatment based on extensive data analysis.

Understanding Genetic Variants and Machine Learning in Disease Modulation

The Role of AI in Genetic Research

  • AI models can identify continuous traits, facilitating genome-wide association studies (GWAS) to understand genetic variants.
  • Determining the biological function of genetic variants involves identifying the cell types affected and understanding their mechanisms, such as messenger RNA production or splicing changes.
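
To make the idea of a GWAS on a continuous trait concrete, here is a minimal sketch of the test run for a single variant: a linear regression of the trait on genotype dosage (0, 1, or 2 copies of the alternate allele). The function name and the toy data are assumptions made for illustration and are not taken from the episode; a real study would also adjust for covariates such as ancestry and correct for multiple testing.

```python
import math


def variant_association(genotypes, trait):
    """Regress a continuous trait on genotype dosage for one variant.

    Returns (beta, t_stat): the per-allele effect size and its t statistic.
    Hypothetical helper for illustration only.
    """
    n = len(genotypes)
    if n != len(trait) or n < 3:
        raise ValueError("need matching inputs with at least 3 samples")

    mean_g = sum(genotypes) / n
    mean_t = sum(trait) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((t - mean_t) ** 2 for t in trait)
    sxy = sum((g - mean_g) * (t - mean_t) for g, t in zip(genotypes, trait))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype and trait must both vary")

    beta = sxy / sxx                   # least-squares slope (effect per allele)
    r = sxy / math.sqrt(sxx * syy)     # Pearson correlation
    if abs(r) >= 1.0:                  # perfect fit: t statistic is unbounded
        return beta, math.copysign(math.inf, r)
    t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    return beta, t_stat


# Toy data: the trait rises by roughly one unit per extra allele.
dosage = [0, 0, 1, 1, 2, 2]
values = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
beta, t_stat = variant_association(dosage, values)
print(f"beta={beta:.2f}, t={t_stat:.1f}")
```

The sign of beta is what the next section calls the directionality of a variant's effect: whether more copies of the allele push the trait up or down.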

Predictive Methods for Genetic Variants

  • Machine learning methods are employed to predict the directionality of genetic variant effects, aiding in disease interpretation and potential treatment pathways.
  • Various cellular imaging techniques and active learning systems enhance drug discovery by allowing real-time experimentation on biological models rather than relying solely on small molecule testing.

Active Learning Systems in Biological Research

  • Researchers utilize TALEN technology to modulate gene expression continuously, enabling precise control over protein levels during experiments.
  • An active learning system integrates data from genetics, literature, and experimental results to optimize hypotheses and guide further research directions.
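
The loop behind an active learning system can be sketched in a few lines: the current model picks the most informative untested condition, the experiment is run, and the result updates the model. The example below is a toy under stated assumptions, not a description of the system discussed in the episode. It assumes a single monotone response (no effect below some hidden expression level, an effect above it), and `run_experiment` is a hypothetical stand-in for a wet-lab measurement.

```python
def run_active_learning(pool, run_experiment, n_rounds):
    """Locate a hidden response threshold using few experiments.

    pool: candidate expression levels (floats) that could be tested.
    run_experiment: hypothetical oracle, returns True if a response is seen.
    Returns (estimated_threshold, number_of_experiments_run).
    """
    untested = sorted(set(pool))
    if len(untested) < 2:
        raise ValueError("need at least two candidate levels")

    # Seed the model with the two extreme conditions.
    results = {x: run_experiment(x) for x in (untested[0], untested[-1])}
    lowest, highest = untested[0], untested[-1]
    untested = untested[1:-1]
    if results[lowest] or not results[highest]:
        raise ValueError("expected no response at the lowest level "
                         "and a response at the highest")

    def boundary():
        # Current hypothesis: threshold lies midway between the highest
        # non-responding level and the lowest responding level.
        highest_neg = max(x for x, hit in results.items() if not hit)
        lowest_pos = min(x for x, hit in results.items() if hit)
        return (highest_neg + lowest_pos) / 2

    for _ in range(n_rounds):
        if not untested:
            break
        guess = boundary()
        # Uncertainty sampling: test the level closest to the boundary,
        # where the model is least sure of the outcome.
        nxt = min(untested, key=lambda x: abs(x - guess))
        untested.remove(nxt)
        results[nxt] = run_experiment(nxt)

    return boundary(), len(results)


levels = [i / 100 for i in range(101)]
estimate, n_run = run_active_learning(levels, lambda x: x >= 0.37, n_rounds=12)
print(f"threshold ~ {estimate:.3f} after {n_run} of {len(levels)} experiments")
```

The point of the design is the selection step: choosing the next experiment where the model is most uncertain is what lets the loop settle on an answer without testing every condition.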

Advancements in Computational Pathology

  • Machine learning enhances computational pathology by accurately assessing target expression levels in tissues, particularly useful in oncology.
  • This technology allows for detailed analysis of cell types responding to treatments based on trial data, improving patient stratification for therapies.

Efficiency Gains Through Machine Learning

  • Compared to traditional methods a decade ago, machine learning significantly reduces the number of required experiments by automating complex analyses that previously relied on manual scoring.
  • Advanced measurement technologies enable more comprehensive data collection during clinical trials, leading to better identification of effective treatment groups.

Cost and Complexity Considerations

  • The complexity of modern genetic research necessitates sophisticated tools; traditional approaches would be prohibitively time-consuming without automation.

Understanding the Role of Measurement Technologies in Medicine

The Evolution of Measurement Technologies

  • Advances in measurement technologies have made it possible to conduct genome sequencing, RNA sequencing, and single-cell analyses at a lower cost.
  • Historical examples, such as the use of Swan-Ganz catheters in cardiology, illustrate how innovative measurement techniques can lead to significant medical discoveries.
  • Early experimentation often involved risky methods; for instance, an Australian doctor (Barry Marshall) famously infected himself with Helicobacter pylori to demonstrate its role in stomach disease.

Data Complexity and Machine Learning

  • As measurement technology improves, the volume and complexity of data increase, making it challenging to interpret without machine learning tools.
  • Without machine learning, understanding fluctuations in expression changes between healthy individuals and disease patients would be nearly impossible.

Insights for Startups in Drug Discovery

  • For startups focused on drug discovery or AI applications, having unique data is crucial. Founders should aim to generate their own relevant datasets rather than relying solely on existing ones.
  • A common pitfall is attempting to build solutions without access to the right data; generating new data becomes essential when existing datasets are insufficient.

Competitive Advantage through Data Generation

  • The ability to generate unique datasets can serve as a competitive advantage. Companies should focus on creating proprietary data while also leveraging publicly available information.
  • Ideally, partnerships could involve clients providing data that companies then analyze and return insights from—creating a mutually beneficial relationship.

Importance of Simple Algorithms

  • More data combined with simple algorithms can yield effective results; complex models are not always necessary for success.
  • When evaluating potential companies or projects, it's important to assess what unique data they possess and their capacity for generating additional relevant datasets.

Key Considerations for Machine Learning Applications

  • Cleanliness of the dataset (e.g., minimal batch effects), control over sampling aspects, and understanding method behavior under various scenarios are critical factors for successful ML applications.

Classifier Complexity and Illusion of Progress

Understanding Classifier Complexity

  • The discussion begins with the concept of classifier complexity, referencing older machine learning methods such as linear discriminant analysis and support vector machines (SVMs). It highlights that simpler models often yield satisfactory results on toy datasets.
  • Emphasizes the importance of defining performance criteria before development. Engaging with stakeholders to determine acceptable outcomes can prevent misalignment in expectations.

Algorithm Expectations

  • When evaluating algorithms, robustness and reliability are crucial. Point estimates should be accompanied by confidence measures to ensure validity.
  • Critiques many machine learning papers for presenting marginal improvements without substantial evidence or practical significance, stressing that a 10% improvement may not justify costs.
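
One common way to attach a confidence measure to a point estimate is a percentile bootstrap over the test set. The sketch below is an illustration under stated assumptions rather than a method described in the episode; the function name, the 95% level, and the toy labels are all chosen for the example.

```python
import random


def bootstrap_accuracy_ci(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Return (accuracy, lower, upper) using a percentile bootstrap.

    Hypothetical helper: resamples test examples with replacement and
    reports the central (1 - alpha) interval of the resampled accuracies.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")

    rng = random.Random(seed)          # fixed seed for reproducibility
    n = len(y_true)
    correct = [int(t == p) for t, p in zip(y_true, y_pred)]
    point = sum(correct) / n

    stats = []
    for _ in range(n_resamples):
        resample = [correct[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(resample) / n)
    stats.sort()

    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return point, stats[lo_idx], stats[hi_idx]


# Toy test set: 100 examples, 80 predicted correctly.
labels = [1] * 80 + [0] * 20
preds = [1] * 100
acc, lower, upper = bootstrap_accuracy_ci(labels, preds)
print(f"accuracy = {acc:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

If two models' intervals overlap heavily on the same test set, a reported improvement between them may be within noise, which is the concern raised above about marginal gains.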

Integration and Usability

  • Discusses the need for clear precision metrics and engineering quality in algorithms to facilitate integration into existing systems. Consideration must be given to both users and operators during implementation.
  • Highlights the necessity for flexible deployment options (cloud vs on-premises), which can ease integration challenges across different industries.

Personal Projects and Innovations

Recent Developments

  • The speaker shares a personal project involving automating email report generation using language models, showcasing hands-on engagement with AI technologies.
  • Explores the debate over long context windows versus specialized models for task planning, indicating ongoing interest in optimizing AI capabilities.

Future Directions in Search Technology

  • Reflects on how search paradigms have shifted from document retrieval to direct question answering, driven by advances in reasoning over multiple documents.

Looking Ahead: Five Years from Now

Anticipated Changes in Computational Methods

Understanding Immunotherapy and Data in Drug Discovery

The Role of Immunotherapy

  • Discussion on the understanding of immunotherapy, emphasizing GSK's focus as an immune programming company. Vaccines are highlighted as tools for programming the immune system.
  • Noted that only about 20% of patients respond to current immunotherapies, indicating a need for deeper insights into immune diseases.

Advancements in Data Utilization

  • Emphasis on generating large-scale operational datasets to serve as lookup tables, reducing the need for repetitive experiments. This approach mirrors the Human Genome Project's utility.
  • Anticipation that future research will involve fewer but more informative experiments, integrating observational cohorts to learn about diseases without altering management strategies.

Understanding Disease Heterogeneity

  • Acknowledgment of disease heterogeneity and its complexity, with expectations that advancements will lead to clearer understandings over time.
  • Introduction of machine learning (ML) intersecting with mechanistic modeling in biology, suggesting structured prior knowledge can enhance algorithm performance.

Challenges in Data Collection

  • Identification of outcome data as a critical limiting factor in advancing drug discovery. The importance of having comprehensive data from both healthy individuals and those with diseases is stressed.
  • Highlighting the necessity for clinical trial outcome data to understand treatment effects better; this type of data is rare yet essential.

Collaborative Efforts Needed

  • Discussion on the potential benefits of larger public-private consortia to share clinical data while addressing competitive concerns among pharmaceutical companies.
  • Mentioned existing cohorts often have limited measurements due to funding constraints; advocating for broader sample banking and analysis capabilities.

Future Directions in Research

  • Emphasized the need for long-term studies tracking immune system changes over time, which remains poorly understood currently.

Machine Learning Challenges and Opportunities

Overview of Machine Learning Challenges

  • The speaker discusses the organization of machine learning challenges in Europe, highlighting their commitment to fostering innovation in this field.
  • Mentions specific challenges such as the GeneDisco challenge aimed at operations, showcasing a focus on real-world applications of machine learning.
  • The initiatives are hosted on GSK.ai, indicating a platform for collaboration and competition among data scientists and researchers.
  • Prizes are offered for these challenges, emphasizing the potential for participants to earn significant rewards through their contributions.

Channel: a16z · Playlists: Raising Health

Video description

In this episode, Vijay Pande talks with Kim Branson, the SVP and Global Head of AI and Machine Learning for GSK, about his journey from a childhood fascination with computers to leading AI efforts in a major pharmaceutical company. They discuss Branson's career path, the intersection of computation and biology, and the challenges and opportunities in integrating AI into drug discovery and development.

Topics covered:

  • 00:00 - Introduction to Kim Branson and his background
  • 00:15 - Early interest in computers and transition to molecular biology
  • 01:20 - Experience with computational methods in drug design
  • 02:10 - Differences between working in startups and large pharmaceutical companies
  • 04:05 - Decision to join GSK and the company’s commitment to AI
  • 06:10 - Implementing AI in drug discovery processes
  • 08:35 - Use of genetic databases and machine learning in target identification
  • 10:55 - Application of AI in clinical trials and pathology
  • 12:40 - Importance of data in AI-driven drug discovery
  • 16:55 - Challenges of integrating AI into existing pharmaceutical frameworks
  • 18:15 - Advice for startups in the AI and drug discovery space
  • 24:20 - Future of AI in personalized medicine and computational biology

Resources:

  • Find Vijay on Twitter: https://x.com/vijaypande
  • Learn more about a16z Bio+Health: https://a16z.com/bio-health/
  • Learn more about Raising Health: https://a16z.com/podcasts/raising-health/

Stay Updated:

  • Find a16z Bio+Health on LinkedIn: https://www.linkedin.com/showcase/a16z-bio-health/
  • Find a16z Bio+Health on X: https://x.com/a16zBioHealth
  • Subscribe on your favorite podcast app: https://raisinghealth.simplecast.com/

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.