SANS Critical Advisory: BugBusters - AI Vulnerability Discovery Hype versus Reality

Introduction to SANS Critical Advisory

  • Ed Skoudis introduces the session on AI vulnerability discovery and its implications.
  • Co-presenters are Chris Elgee, a SANS instructor, and Joshua Wright, a SANS fellow.
  • Discussion focuses on separating myth from reality regarding recent AI model announcements.

Anthropic's Mythos Announcement

  • Anthropic announced its new model, "Mythos," alongside select partner companies.
  • The announcement has sparked significant industry attention and concern over AI capabilities.
  • Emergency meetings at US banks and discussions within the UK government underscore the urgency.

Industry Reactions and Concerns

  • Mixed reactions exist; some view it as a major threat while others see marketing hype.
  • OpenAI released an updated GPT model focused on cybersecurity in response to competition.
  • Ongoing rumors suggest other AI vendors may soon release similar capabilities.

Purpose of the Webcast

  • The webcast aims to clarify current models' capabilities versus the hype surrounding Mythos.
  • Hands-on demonstrations walk through the pipeline from vulnerability discovery to exploit creation.

Source-Assisted Penetration Testing Overview

  • Ed explains source-assisted penetration testing, which has been conducted for 15 years using client-provided code.
  • In the traditional workflow, testers find a vulnerability first and then consult the source code to work out how to exploit it.

AI Enabled Source-Assisted Pen Testing

  • The introduction of AI tools significantly changes the source-assisted pen testing workflow.
  • AI tools analyze the source code interactively and identify potential flaws more efficiently.

AI-Enabled Pen Testing Methodology

Overview of the Process

  • The process involves identifying real flaws by weeding out hallucinations and false positives, then exploiting these flaws on target systems.
  • This workflow has led to uncovering major findings in pen testing, revealing critical vulnerabilities not found in previous tests.
  • The methodology identifies subtle flaws in obscure workflows that can lead to exploitable vulnerabilities.

Types of Vulnerabilities Discovered

  • Common vulnerabilities include authentication bypass, authorization flaws, IDOR flaws, and race conditions.
  • The advancements in AI models are significantly changing the industry landscape for penetration testing.

Legal Considerations and Model Limitations

  • It's crucial to have permission before sharing source code with models to avoid legal issues.
  • If permission isn't granted, testing must rely on locally hosted models, which yield less effective results than externally hosted ones.

Managing Context and Exploitation

  • Models have a limited context window which affects processing capacity; managing this is essential during testing.
  • Most vulnerabilities are found in a small portion of the code; focus should be on critical areas like authorization flows.
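
The context-management point above can be sketched as a simple packing problem. This is a minimal illustration, not the presenters' tooling: the `SourceFile` type, the `priority` field, and the four-characters-per-token estimate are all assumptions made for the example (a real workflow would use the model's own tokenizer).

```python
from dataclasses import dataclass

CHARS_PER_TOKEN = 4  # rough rule of thumb (assumption), not a real tokenizer


@dataclass
class SourceFile:
    path: str
    text: str
    priority: int  # hypothetical score: higher means more security-relevant


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def pack_context(files: list[SourceFile], budget: int) -> list[SourceFile]:
    """Greedily pick the highest-priority files that fit in the token budget."""
    chosen, used = [], 0
    for f in sorted(files, key=lambda f: f.priority, reverse=True):
        cost = estimate_tokens(f.text)
        if used + cost <= budget:
            chosen.append(f)
            used += cost
    return chosen


# Placeholder files; the contents are filler so only the sizes matter.
files = [
    SourceFile("auth/login.py", "x" * 4000, priority=9),
    SourceFile("auth/permissions.py", "x" * 6000, priority=8),
    SourceFile("ui/theme.css", "x" * 20000, priority=1),
    SourceFile("api/export.py", "x" * 2000, priority=6),
]
selected = pack_context(files, budget=3500)
print([f.path for f in selected])
```

With this budget the large, low-priority stylesheet is left out while the authorization-related files all fit, which mirrors the advice to spend context on the critical areas first.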

Addressing Model Errors

  • Models can produce false positives or hallucinations; careful prompting is needed to validate results effectively.
  • Actual exploitation is necessary for validation as pentesting requires proving vulnerabilities rather than just reporting them.
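
One cheap pre-filter for hallucinations, offered here as an assumption and not something the session describes, is to check that the file and function a model cites actually exist before spending time on exploitation. The `Finding` type and the in-memory `repo` dictionary are hypothetical stand-ins for real model output and a real checkout.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    path: str    # file the model says contains the flaw
    symbol: str  # function or method the model names


def reference_exists(finding: Finding, repo: dict[str, str]) -> bool:
    """Return True only if the cited file exists and mentions a call or definition of the symbol."""
    source = repo.get(finding.path)
    if source is None:
        return False
    pattern = rf"\b{re.escape(finding.symbol)}\s*\("
    return re.search(pattern, source) is not None


# In-memory stand-in for a repository: path -> file contents.
repo = {"api/export.py": "def export_report(user, report_id):\n    return load(report_id)\n"}

findings = [
    Finding("Missing ownership check", "api/export.py", "export_report"),
    Finding("SQL injection", "api/search.py", "run_query"),               # file does not exist
    Finding("Unsafe deserialization", "api/export.py", "unpickle_blob"),  # symbol does not exist
]
plausible = [f for f in findings if reference_exists(f, repo)]
print([f.title for f in plausible])
```

A finding that survives this check is only plausible, not proven; as the bullets above note, it still has to be exploited to count.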

Utilizing AI for Code Analysis

  • AI models excel at creating tools and proof of concepts quickly during the pentesting process.
  • They are also effective at searching for patterns within large codebases, aiding in vulnerability discovery.

Understanding LLMs in Code Analysis

Key Insights

  • Humans can follow code trails but take longer; LLMs efficiently find dependencies and iterate without fatigue.
  • Audience encouraged to ask questions via chat for a Q&A session at the end of the presentation.
  • A new two-day SANS course covers methodologies being introduced, with examples from course materials.

Workflow Steps

  • The workflow applies to various models; size matters, but technique is crucial regardless of code volume.
  • First step: map the repository to create a summary or table of contents for reference during analysis.
  • Focus on the critical 20% of code that contains most vulnerabilities, narrowing context as needed.

Vulnerability Testing

  • Once significant areas are identified, work with them to find real vulnerabilities rather than hallucinated ones.
  • Create a test harness for local testing; run tests multiple times for large code bases.
  • AI assists in report writing but requires human input to refine results and ensure accuracy.

Case Study: DataEase

  • DataEase is a large CMS with about 1.7 million lines of code; language barriers may pose challenges for humans but not LLMs.
  • Team members credited during the session contributed significantly to developing the methods used in this analysis.
  • Bugs discussed were previously found and patched; current focus is on understanding existing vulnerabilities.

Mapping Process

  • Initial confirmation step ensures the system focuses on the correct version of DataEase before mapping begins.
  • The mapping process involves generating a comprehensive overview of the repository's structure and functions.
  • The LLM processes data to identify key output files necessary for further analysis.

Understanding Repository Maps and Attack Surface Notes

  • The mapping step produces two reference files: a repository map (a flat text file) and attack surface notes.
  • The model may try to identify vulnerabilities at this stage, before it has sufficient context.
  • The goal is to narrow down functions and workflows worth investigating further.
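
A flat-text repository map of the kind described above could look roughly like the following. In the session the map is produced by the LLM itself; this sketch only shows the shape of such a file using a plain directory walk, and the extension list and output format are assumptions.

```python
import os
import tempfile

SOURCE_EXTENSIONS = {".py", ".java", ".js", ".ts", ".go"}  # illustrative set


def build_repo_map(root: str) -> str:
    """Return one line per source file: relative path and line count."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            if os.path.splitext(name)[1] not in SOURCE_EXTENSIONS:
                continue
            full = os.path.join(dirpath, name)
            with open(full, encoding="utf-8", errors="replace") as handle:
                count = sum(1 for _ in handle)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            lines.append(f"{rel}\t{count} lines")
    return "\n".join(lines)


# Demo on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "auth"))
    with open(os.path.join(root, "auth", "login.py"), "w", encoding="utf-8") as f:
        f.write("def login():\n    pass\n")
    with open(os.path.join(root, "README.md"), "w", encoding="utf-8") as f:
        f.write("docs\n")
    repo_map = build_repo_map(root)

print(repo_map)
```

Keeping the map as plain text makes it cheap to paste back into later prompts when the focus narrows to specific functions and workflows.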

Vulnerability Discovery Process

  • The demo avoids disclosing zero-day vulnerabilities by using previously patched ones.
  • This approach ensures responsible demonstration while still showcasing the system's capabilities.
  • Additional vulnerabilities were found, emphasizing the importance of thorough testing.

Candidate Matrix and Focus Context

  • A markdown file called the candidate matrix lists the primary candidates to focus on during testing.
  • The process helps save time during penetration testing by narrowing down potential areas of interest.
  • Each iteration refines the focus context, improving efficiency in vulnerability analysis.
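
The candidate matrix is described only as a markdown file of primary candidates, so the columns below are assumptions chosen for illustration: a location, a flaw class, whether a low-privilege user can reach the code, and a confidence score.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    location: str    # file and function under suspicion
    flaw_class: str  # e.g. IDOR, auth bypass, race condition
    reachable: bool  # can a low-privilege user reach this code path?
    confidence: int  # hypothetical score, 1 (low) to 5 (high)


def render_matrix(candidates: list[Candidate]) -> str:
    """Render candidates as a markdown table, most promising first."""
    ranked = sorted(candidates, key=lambda c: (c.reachable, c.confidence), reverse=True)
    rows = ["| Rank | Location | Class | Reachable | Confidence |",
            "|---|---|---|---|---|"]
    for rank, c in enumerate(ranked, start=1):
        reach = "yes" if c.reachable else "no"
        rows.append(f"| {rank} | {c.location} | {c.flaw_class} | {reach} | {c.confidence} |")
    return "\n".join(rows)


# Placeholder candidates, not findings from the demo.
candidates = [
    Candidate("jobs/cleanup.py:purge", "race condition", reachable=False, confidence=5),
    Candidate("api/export.py:export_report", "IDOR", reachable=True, confidence=5),
    Candidate("auth/reset.py:reset_password", "auth bypass", reachable=True, confidence=3),
]
matrix = render_matrix(candidates)
print(matrix)
```

Sorting on reachability before confidence is one possible design choice: a highly confident finding that no attacker can reach is usually a worse use of testing time than a less certain one on an exposed path.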

Analyzing Code with Focus Context

  • The model's understanding improves as it processes specific paths within the codebase.
  • Dependencies are identified, aiding in targeted vulnerability searches within data processing flows.
  • The model analyzes paths to uncover potential issues that might be overlooked manually.

Verification Notes and Methodology Overview

  • Verification notes provide insights into findings, including possible patches for identified issues.
  • Dynamic artifact creation allows real-time inspection of results during analysis sessions.
  • Methodology from the SANS course SEC543 is integrated into the workflow for structured vulnerability assessment.

Vulnerability Analysis Process

  • The focus is on identifying code issues and analyzing technical flaws step by step.
  • A test harness is created to simulate the vulnerable process in isolation from the 1.7-million-line codebase.
  • The harness, built in Python, allows the vulnerability to be tested on localhost.
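
A localhost harness in the spirit of the one described above might look like this. It is a sketch under stated assumptions, not the harness from the demo: the `REPORTS` data, the `/reports/<id>` route, and the `X-User` header are invented, and the mock reproduces a generic missing-ownership-check flaw, not any specific DataEase bug.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Mock data standing in for the real application's records (hypothetical).
REPORTS = {"1": {"owner": "alice", "body": "alice's report"},
           "2": {"owner": "bob", "body": "bob's report"}}


class MockHandler(BaseHTTPRequestHandler):
    """Reproduces the suspected flaw: no check that the caller owns the report."""

    def do_GET(self):
        report = None
        if self.path.startswith("/reports/"):
            report = REPORTS.get(self.path.rsplit("/", 1)[-1])
        if report is None:
            self.send_error(404)
            return
        # Flaw under test: X-User is never compared with report["owner"].
        payload = json.dumps(report).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep harness output quiet
        pass


# Ignore any system proxy settings so requests always go straight to localhost.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(port: int, path: str, user: str) -> tuple[int, str]:
    """Issue a GET as `user` and return (status code, body)."""
    req = urllib.request.Request(f"http://127.0.0.1:{port}{path}", headers={"X-User": user})
    try:
        with OPENER.open(req, timeout=5) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        return err.code, ""


server = HTTPServer(("127.0.0.1", 0), MockHandler)  # port 0 lets the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

baseline = fetch(port, "/nope", "alice")      # sanity check: unknown path
crossed = fetch(port, "/reports/2", "alice")  # alice requests bob's report
server.shutdown()
server.server_close()
print(baseline[0], crossed[0])
```

The baseline request confirms the harness answers normally for a missing page, so a later 200 on another user's record can be attributed to the flaw and not to a misbehaving mock.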

Proving Vulnerabilities

  • Emphasis on proving vulnerabilities rather than making overclaims about their severity.
  • The AI provides JSON output as proof of an exploited vulnerability.
  • A mockup of the critical data portion is used to run tests against the exploit.
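
The session does not show the structure of the JSON proof, so the record below is a hypothetical shape. It illustrates the "prove, don't overclaim" rule: a finding is marked proven only when the observed response both differs from what a secured application should return and contains data that should not have leaked.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Evidence:
    finding: str
    request: str
    expected_status: int  # what a correctly secured app should return
    observed_status: int  # what the harness actually returned
    leaked_marker: str    # string that only another user's data contains
    response_body: str


def verdict(e: Evidence) -> str:
    """'proven' only if the status differs from the secure expectation AND data leaked."""
    leaked = e.leaked_marker in e.response_body
    if e.observed_status != e.expected_status and leaked:
        return "proven"
    return "unproven"


def to_json(e: Evidence) -> str:
    """Serialize the evidence together with its verdict."""
    record = asdict(e)
    record["verdict"] = verdict(e)
    return json.dumps(record, indent=2, sort_keys=True)


# Placeholder values, not output from the demo.
exploited = Evidence("IDOR on report export", "GET /reports/2 as alice",
                     expected_status=403, observed_status=200,
                     leaked_marker="bob", response_body='{"owner": "bob"}')
blocked = Evidence("IDOR on report export", "GET /reports/2 as alice",
                   expected_status=403, observed_status=403,
                   leaked_marker="bob", response_body="forbidden")
print(to_json(exploited))
```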

Testing Exploits

  • Initial test results show a 404 error when accessing a non-existent page.
  • Running the exploit allows unauthorized database access, demonstrating a security flaw.
  • The classic "Calc" pop-up indicates successful exploitation on a Windows system.

Reporting Findings

  • A structured report is generated in Asciidoctor (AsciiDoc) format from the findings and vulnerability details.
  • Report includes an executive summary and CVSS score, indicating potential severity of the issue.
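
The summary does not say which CVSS version the generated report uses. Assuming CVSS v3.1, the base score can be recomputed from the vector string to sanity-check a model-written number; the sketch below follows the published v3.1 base-score equations and metric weights.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score from a vector like 'AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * pr * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


# Illustrative vector: a low-privilege user reading data they should not see.
print(base_score("AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N"))
```

Recomputing the score this way only checks arithmetic; choosing the right metric values for a finding remains a human judgment.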

Zero-Day Vulnerabilities Discussion

  • Newly identified vulnerabilities count as zero days but belong to known classes such as IDOR and BOLA.
  • Asked how many of the new vulnerabilities are truly novel, the presenters note that they are new instances of recognized classes, not new types of flaw.

Model Utilization in Testing

  • Various models including Gemini, Codex, and Opus are utilized for analysis and validation tasks.

Dynamic Model Management

  • Importance of being able to switch underlying models dynamically due to potential issues with specific models.
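
One way to keep the underlying model swappable is to hide every backend behind the same small interface and fall through on failure. The sketch below is an assumption about how that could be structured; the backend names and stub functions are placeholders, and real backends would wrap each vendor's own client library.

```python
from typing import Callable

# A backend is any callable that takes a prompt and returns text.
Backend = Callable[[str], str]


class ModelRouter:
    """Tries backends in order and falls through to the next one on failure."""

    def __init__(self, backends: dict[str, Backend], order: list[str]):
        self.backends = backends
        self.order = list(order)

    def prefer(self, name: str) -> None:
        """Move one backend to the front without touching calling code."""
        self.order.remove(name)
        self.order.insert(0, name)

    def ask(self, prompt: str) -> tuple[str, str]:
        """Return (backend name, answer) from the first backend that succeeds."""
        errors = []
        for name in self.order:
            try:
                return name, self.backends[name](prompt)
            except Exception as exc:  # a real client would catch narrower errors
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all backends failed: " + "; ".join(errors))


# Stand-in backends that simulate one failing and one working model.
def flaky_backend(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def steady_backend(prompt: str) -> str:
    return f"analysis of: {prompt}"


router = ModelRouter({"model_a": flaky_backend, "model_b": steady_backend},
                     order=["model_a", "model_b"])
used, answer = router.ask("review auth/login.py")
print(used, answer)
```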

What Does the Future Hold for Cybersecurity?

Overview of Current Cybersecurity Landscape

  • Discussion on the shift from traditional pen testing to new methodologies in cybersecurity.
  • Introduction by Joshua Wright, sharing insights from a recent report on cybersecurity stability.
  • Reflection on the perceived stability in cybersecurity over the last 20 years and emerging threats.

Challenges Ahead

  • Acknowledgment of ongoing vulnerabilities like supply chain issues and zero-day exploits.
  • Recognition that despite challenges, there has been relative stability in cybersecurity practices historically.
  • Personal experiences shared about past cybersecurity environments compared to current challenges.

The Impact of AI on Cybersecurity

  • Prediction that AI will provide attackers with significant advantages in exploiting vulnerabilities.
  • Consideration that recent claims about AI's impact may be both PR stunts and genuine concerns.
  • Anticipation of an increase in vulnerability exploitation, potentially overwhelming defenders.

Recommendations for Improvement

  • Emphasis on improving vulnerability management processes within organizations.
  • Highlighting Counter Hack's use of AI models to identify vulnerabilities effectively.
  • Noting how attackers are increasingly using AI for more sophisticated adversarial attacks.

Enhancing Patching Processes

  • Call to action for better patching strategies across all stages: creation, distribution, testing, and validation.
  • Identification of areas where patching is effective versus those needing improvement (e.g., Mac systems).
  • Urging all vendors to enhance their patching mechanisms to keep pace with emerging threats.

Rethinking Risk Management

  • Suggestion to reassess business risk calculations regarding reported vulnerabilities.

Understanding Business Risks in a Vulnerability Discovery World

The Need for Better Risk Management

  • Businesses must improve understanding of risks and acceptable downtime.
  • Critical systems require new strategies to handle frequent vulnerabilities.

Opportunities in Crisis

  • CISO-level reconsideration is essential; crises can present opportunities.
  • Use current challenges to counsel management on proactive risk management.

Leveraging Insight for Organizational Benefit

  • Counsel executives using personal insights to address practical risks.
  • Establish yourself as an industry expert within your organization.

Importance of Traditional Controls

  • Conventional controls are vital despite increasing attacks.
  • Focus on privilege management to limit potential damage from breaches.

Enhancing Incident Response Programs

  • Threat hunting techniques should be employed to reduce detection time.
  • Modernize tabletop exercises for incident response teams to handle concurrent attacks.

Preparing for Multiple Incidents

  • Teams need strategies for managing multiple incidents simultaneously.
  • Identify resource needs early to effectively respond to new threats.

Embracing Vulnerability Management Programs

  • "VulnOps," or vulnerability management programs, are crucial moving forward.
  • Organizations must adapt their approach beyond high-profile software vulnerabilities.

Evolving Nature of Vulnerabilities

  • Vulnerability management has shifted; more organizations face diverse threats now.

AI's Role in Identifying Vulnerabilities

  • AI is changing the landscape, making it easier to identify vulnerabilities.

Managing Information Overload

  • It's challenging to keep up with rapid changes in technology and threats.

Speculating Future Trends

  • Anticipate a significant increase in reported vulnerabilities due to AI advancements.

Closed Source Software Challenges

  • Future models may identify vulnerabilities in closed source software, raising new concerns.

Open Source Software Vulnerabilities

  • Mythos and similar platforms currently focus on open source software, but closed source products are affected too; Photoshop, for example, uses open source elements for various operations.
  • Expect AI models to become markedly better at reverse engineering closed source software within 6 to 12 months, complicating vulnerability management.
  • Expect a tumultuous initial surge of vulnerability discoveries, followed by a plateau.

Future of Software Security

  • Over time, software vulnerabilities may decrease, potentially leading to a more secure ecosystem in 3 to 10 years.
  • AI advancements will likely reduce vulnerabilities in both open and closed source software.
  • The future appears bleak short-term but holds opportunities for improvement long-term.

Resources for Cybersecurity Professionals

  • Upcoming AI Cybersecurity Summit hosted by SANS offers free online access; valuable content expected next week.
  • The Cloud Security Alliance published "AI Vulnerability Storm," providing actionable advice for CISOs regarding upcoming risks.
  • SANS is hosting a Find Evil Hackathon focused on building AI tools for forensic analysis with cash prizes available.

Educational Opportunities and Tools

  • Introduction of the SANS SEC543 class on AI-assisted source code analysis, aimed at penetration testers.
  • New handout available summarizing techniques for using AI in vulnerability analysis; useful guidance provided.
  • Emphasis on the importance of resources shared during the session; community engagement encouraged.

Application vs. Web App Pen Testing

  • Discussion on whether the focus is on application or web app pen testing and when to switch methodologies.
  • LLMs are becoming capable of reverse engineering, with examples of their use alongside tools like Ghidra.
  • Hardware analysis is also being integrated with LLMs for vulnerability assessment.

Human Engagement in Pen Testing

  • Importance of human oversight during the pen testing process to ensure effective results.
  • Continuous human engagement is needed because LLMs may head down unproductive paths.
  • Opportunities exist for LLMs to assist in traditional tasks while a human supervises.

AI Pen Testing and Compliance Standards

  • Inquiry about AI pen testing becoming an industry standard for compliance purposes.
  • Current slow adoption of AI in governance but potential future integration into processes.
  • Emphasis on comprehensive assessments rather than mandatory AI usage.

Economic Considerations in AI Usage

  • Recent events may accelerate the expectation of AI usage in vulnerability assessments.
  • Economic arguments regarding the cost-effectiveness of using humans versus tokens in pen testing.
  • Concerns about token costs and practical implications if investment capital decreases.
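
The humans-versus-tokens argument comes down to simple arithmetic, sketched below. Every number in the example is a placeholder chosen only to make the calculation concrete; none of them come from the session or from any vendor's price list.

```python
def token_cost(input_tokens: int, output_tokens: int,
               usd_per_million_in: float, usd_per_million_out: float) -> float:
    """Cost of model usage given token counts and per-million-token prices."""
    return (input_tokens * usd_per_million_in
            + output_tokens * usd_per_million_out) / 1_000_000


def human_cost(hours: float, usd_per_hour: float) -> float:
    """Cost of analyst time."""
    return hours * usd_per_hour


# PLACEHOLDER inputs: 50 analysis passes over a focused slice of code.
passes = 50
model_usd = token_cost(input_tokens=passes * 200_000,
                       output_tokens=passes * 20_000,
                       usd_per_million_in=5.0,    # placeholder price
                       usd_per_million_out=20.0)  # placeholder price
analyst_usd = human_cost(hours=40, usd_per_hour=200.0)  # placeholder rate
print(model_usd, analyst_usd)
```

The comparison is sensitive to the price inputs, which is the concern raised above: if token prices rise, the same arithmetic can tip the other way. It also omits the analyst hours still needed to validate and exploit what the model reports.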

Staffing Recommendations for High Value Projects

  • Discussion on staffing strategies for high-value pen testing projects based on team dynamics.
  • Spectrum of team members from imaginative AI users to methodical vulnerability finders.

AI Integration in Team Dynamics

  • Teams are integrating AI into processes, contributing different expertise to assessments and engagements.
  • Importance of having a mix of technical skills and creative thinking within the team for effective penetration testing.
  • Continuous development is emphasized to enhance employee skills in using LLMs effectively.

Community Engagement and Learning

  • Creation of an internal CTF for pen testing teams to learn techniques, later offered as a class through SANS.
  • Discussion on balancing optimistic and pessimistic views regarding new capabilities in cybersecurity.
  • Concerns about prioritizing resources amidst emerging vulnerabilities while maintaining ongoing projects.

Resource Management and Burnout

  • Need for decision-makers to prioritize business needs without abandoning important ongoing work.
  • Managing burnout is crucial; overwhelmed employees may stop producing meaningful work.
  • Growing gap between organizations with resources for security versus those without, affecting various sectors.

Opportunities and Community Solutions

  • AI has potential to help bridge the resource gap but current focus leans more towards attack rather than defense.
  • Community initiatives like hackathons can foster collaboration and create open-source solutions to address security challenges.

Pen Testing Recommendations

  • Pen testers provide specific recommendations for code changes but emphasize the need for evaluation by skilled developers.
  • Clear communication that penetration testers are not software developers; thorough QA is essential before implementation.

Closing Remarks and Resources

  • The session concludes with thanks to Chris Elgee and Josh Wright for their insights.
  • A resource slide is mentioned, highlighting the importance of the Cloud Security Alliance paper by Gadi Evron, Rich Mogull, and Rob T. Lee.
  • Other resources are also recommended for further reading to enhance understanding of discussed topics.

Video description

According to Anthropic, their Claude Mythos model found thousands of zero-day exploits across every major operating system and web browser. The conversation since the announcement has been split between dismissing it as marketing and treating it as the end of the world. Neither captures what is actually happening. In this livestream, Ed Skoudis opens with what is real and what is hype based on 15 months of hands-on experience using AI models for vulnerability discovery. Chris Elgee demonstrates the AI-assisted discovery-to-exploit pipeline live on screen using a current model against actual vulnerable code. Joshua Wright closes with the industry implications: why the next 12 months may see accelerated attacks, and why the years beyond that could be the safest in software history.