How An Update Destroyed the Economy | The CrowdStrike Outage

The Global Computer Outage: What Happened?

Overview of the Incident

  • On July 19, 2024, a faulty software update crashed millions of Windows computers worldwide and left them unable to boot.
  • The affected machines were mostly those that handle sensitive information or fill critical roles in businesses and government facilities, which initially fueled speculation about a global cyberattack.
  • Major institutions such as airports, banks, and hospitals experienced outages lasting hours, and some outages were still ongoing at the time of reporting.
  • This incident has been compared to the "modern Y2K," potentially marking one of the greatest outages in history.

Cause of the Outage

  • The video clarifies that it is not meant as an attack on CrowdStrike; it is an explanation based on the information and theories available about the incident.
  • The root cause was likely oversight and negligence in an update to Falcon, CrowdStrike's security software.

Understanding CrowdStrike Falcon

  • Falcon operates at the kernel level of operating systems, providing advanced malware detection capabilities beyond traditional antivirus programs.
  • The kernel is crucial as it oversees all software-related tasks within a computer; errors at this level can lead to severe consequences such as system crashes.

Importance of Kernel-Level Operations

  • If an application running at user level hits a critical error, the kernel can step in and shut down the offending program to keep the rest of the system stable.
  • A failure inside the kernel itself has no such safety net. It results in a "kernel panic" (on Windows, a bug check, better known as the Blue Screen of Death), which forces a shutdown and loses any unsaved data.
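
The isolation described above can be loosely illustrated in user space. The following is only an analogy, assuming a parent process stands in for the kernel and a child process stands in for an application; the helper name `run_isolated` is hypothetical and nothing here touches real kernel code.

```python
import subprocess
import sys


def run_isolated(code: str) -> int:
    """Run `code` in a separate Python process and return its exit status."""
    result = subprocess.run([sys.executable, "-c", code], capture_output=True)
    return result.returncode


# The first child dies from an unhandled error; the second exits normally.
crash_status = run_isolated("raise RuntimeError('simulated application crash')")
ok_status = run_isolated("print('hello')")

# The supervising process keeps running in both cases, much as the kernel
# survives the failure of a single user-level program.
print("crashing child exit status:", crash_status)
print("healthy child exit status:", ok_status)
print("supervisor still running")
```

When the failing code runs inside the supervisor itself, as a kernel driver does, there is no outer layer left to contain the error, which is why a kernel-level fault takes the whole machine down.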

Certification and Security Implications

  • Microsoft requires third-party kernel drivers to undergo rigorous testing for WHQL certification to ensure stability within Windows environments.
  • While CrowdStrike Falcon is WHQL certified, any mistakes made during its operation could compromise entire systems due to its deep integration with core functions.

Update Mechanism Risks

  • Falcon uses "over-the-air" updates for security reasons; these updates occur without user knowledge to prevent malware developers from gaining insights into vulnerabilities.
  • However, this approach also means that a flaw in the update process or in the update's content reaches every machine at once, so a single faulty update can turn directly into a widespread operational failure.

CrowdStrike Update Bug: A Major Tech Outage

Overview of the CrowdStrike Driver Issue

  • The CrowdStrike driver itself was certified, but it still downloaded and acted on additional content that had not gone through that verification, which let bugs slip through undetected.
  • The critical bug involved a .sys file shipped in the update. The video describes it as essentially a file filled with zeros; CrowdStrike's own later analysis attributed the crash to a logic error triggered by the file's contents rather than to null bytes.
  • When an affected computer powered on, the Falcon driver tried to read this file and performed an invalid memory access, which crashed Windows and left the machine in a boot loop.
  • Approximately 8.5 million computers were affected by this bug, necessitating manual removal of the update by IT professionals for each device.
  • The situation is likened to the Y2K scare; while not as widespread, it highlights significant risks associated with software updates.
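
One way to picture the failure mode above is a loader that checks a content file before trusting it. The sketch below is purely illustrative: the file layout (a 4-byte signature followed by a payload length), the `CHNL` signature, and every function name are assumptions made for the example and do not describe CrowdStrike's actual file format or code.

```python
import struct

MAGIC = b"CHNL"                 # hypothetical 4-byte signature
HEADER = struct.Struct("<4sI")  # signature, then payload length (little-endian)


class InvalidContentFile(ValueError):
    """Raised when a content file fails basic sanity checks."""


def parse_content(blob: bytes) -> bytes:
    """Return the payload only if the header is present and consistent."""
    if len(blob) < HEADER.size:
        raise InvalidContentFile("file shorter than header")
    magic, length = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise InvalidContentFile("bad signature")
    payload = blob[HEADER.size:]
    if length != len(payload):
        raise InvalidContentFile("declared length does not match payload")
    return payload


def load_or_keep_previous(blob: bytes, previous: bytes) -> bytes:
    """Fall back to the last known-good content instead of crashing."""
    try:
        return parse_content(blob)
    except InvalidContentFile:
        return previous


# A file consisting only of zeros is rejected and the old content is kept.
zeros = bytes(64)
print(load_or_keep_previous(zeros, previous=b"last-known-good"))
```

The design point is that malformed input produces a handled error and a fallback rather than an unchecked memory read, which matters most for code that runs at the kernel level.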

Historical Context: Comparison with Y2K

  • The speaker draws parallels between this incident and the Y2K issue where computers misread dates due to programming limitations from earlier decades.
  • In the 90s, many systems required manual updates via physical media like floppy disks or CDs to prevent malfunctions at the turn of the millennium.
  • Although some systems were affected during Y2K, most issues were mitigated through timely updates made in advance. The CrowdStrike bug, by contrast, emerged without warning.

Fixing the CrowdStrike Outage

  • To resolve the issue, users can boot Windows into "Safe Mode" and use Command Prompt to delete the problematic .sys files manually.
  • This process involves creating backups first and may be difficult for average users who lack the technical expertise for this kind of repair.
  • For encrypted drives whose recovery keys were stored on systems that were also down, additional steps are needed to retrieve those keys before the fix can proceed.
  • Microsoft has released a recovery tool that users can download onto a USB or disk drive to make the repair easier.
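
The manual deletion step can be sketched as a small script. This is an illustration, not the procedure shown in the video: the directory and the `C-00000291*.sys` file pattern are assumptions taken from CrowdStrike's public remediation guidance, the function names are hypothetical, and in practice the fix was carried out from Command Prompt in Safe Mode, where Python is usually not available. The sketch defaults to a dry run that only lists matches.

```python
from pathlib import Path

# Assumed values from CrowdStrike's public guidance, not from the video.
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
PATTERN = "C-00000291*.sys"


def find_faulty_files(directory: Path, pattern: str = PATTERN) -> list[Path]:
    """Return files in `directory` whose names match the faulty-update pattern."""
    if not directory.is_dir():
        return []
    return sorted(directory.glob(pattern))


def remove_faulty_files(directory: Path, dry_run: bool = True) -> list[Path]:
    """List matching files, deleting them only when dry_run is False."""
    matches = find_faulty_files(directory)
    for path in matches:
        if dry_run:
            print("would delete", path)
        else:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Dry run by default: nothing is deleted unless dry_run=False is passed.
    remove_faulty_files(DEFAULT_DIR)
```

Keeping the dry run as the default mirrors the advice to back up before deleting anything: the list of matches can be reviewed before any file is removed.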

Community Reactions and Accountability

  • There is significant backlash within tech communities questioning how CrowdStrike allowed a certified kernel driver to load and act on unverified content.
  • The incident was confined to Windows systems. On macOS, Apple had already restricted kernel-level access for third-party software such as Falcon around 2020.

Understanding Cybersecurity Outages

The Impact of Operating Systems on Cybersecurity

  • The speaker explains that they learned of the outage late: they mainly use a Mac, and their Windows machine does not run CrowdStrike, so they were not affected.
  • They argue that the prevalence of Windows in the market makes it a primary target for cyberattacks, suggesting that its security vulnerabilities are exploited more frequently than those of other operating systems.
  • Malware developers tend to focus on popular software like Windows because it offers greater potential for widespread impact, while macOS is targeted less often due to its smaller user base and tighter security.

Strategies for Mitigating Cybersecurity Risks

  • The speaker advocates for diversifying digital environments by using multiple operating systems. This approach can help mitigate risks during outages or attacks.
  • They suggest that Linux can serve as an alternative OS during Windows outages, emphasizing its cost-effectiveness and suitability for basic tasks like web browsing.

Importance of Software Diversity

  • Emphasizing the need for software diversification, the speaker warns against relying solely on dominant companies like CrowdStrike. A single point of failure can have significant negative consequences.
  • They compare this situation to investing in stocks; spreading investments across various companies reduces risk if one company fails.

Lessons from Recent Outages

  • The discussion highlights how major outages reveal the risks of relying on specific technologies. If one service goes down (like YouTube), alternatives exist (like ABC News).
  • The speaker suggests that a diversified software ecosystem would lessen the impact of outages by ensuring not all services are affected simultaneously.
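
The diversification argument can be made concrete with simple arithmetic. The sketch below assumes a fleet of machines split across security vendors and asks what fraction goes down when one vendor ships a bad update; the vendor names and machine counts are invented for illustration.

```python
def affected_fraction(vendor_shares: dict[str, float], failed_vendor: str) -> float:
    """Fraction of the fleet that goes down when `failed_vendor` fails."""
    total = sum(vendor_shares.values())
    return vendor_shares.get(failed_vendor, 0.0) / total


# Hypothetical fleets: machine counts per vendor.
single_vendor = {"VendorA": 100}
diversified = {"VendorA": 40, "VendorB": 35, "VendorC": 25}

print(affected_fraction(single_vendor, "VendorA"))  # 1.0
print(affected_fraction(diversified, "VendorA"))    # 0.4
```

With a single vendor, one faulty update takes out the whole fleet; with three, the same failure affects only that vendor's share, which is the stock-portfolio comparison the speaker makes.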

Conclusion: Learning from Mistakes

  • The importance of having multiple competitors in technology is stressed; reducing dependency on one company could lead to better resilience against outages.
  • Ultimately, the speaker reflects on human error as a fundamental cause behind technological failures and emphasizes learning from these incidents to improve future cybersecurity measures.

Video description

On July 19, 2024, millions of people around the world discovered that they could no longer turn on their Windows machines. It was later discovered that this was a global outage caused by a faulty update pushed by CrowdStrike, one of the top cybersecurity companies in the world, and their software known as Falcon. The outage affected over 8.5 million computers and is still ongoing, with IT professionals calling it "the modern Y2K" and one of the greatest outages in history. So what exactly happened, and how can we prevent it from occurring again in the future?

Website: http://www.nationsquid.com/