CrowdStrike Incident: A Wake-Up Call for Businesses
Summary of the CrowdStrike Incident
In July 2024, a seemingly routine update from CrowdStrike, a leading cybersecurity firm, caused widespread “Blue Screen of Death” (BSOD) errors on Windows systems globally. This incident disrupted operations across various sectors, including airlines, hospitals, media companies, and transportation agencies. The fallout from this event has highlighted the critical importance of robust IT infrastructure and comprehensive cyber readiness. In this post, we explore the details of the CrowdStrike outage, its impact, and why it’s essential for businesses to re-evaluate their IT stance.
The Incident
On July 19, 2024, reports began surfacing that a CrowdStrike update had caused Windows computers to crash and enter a continuous BSOD loop. The issue was traced back to a faulty update to the CrowdStrike Falcon agent, which included a problematic channel file. This file caused affected Windows computers to repeatedly crash, rendering them inaccessible (Malwarebytes) (Triskele Labs).
The impact was immediate and widespread. Businesses in Australia were the first to report the issue, followed by similar problems in Europe, the United States, and other regions. The sectors affected included airlines, where flights were grounded, hospitals that had to cancel procedures, and media companies that faced operational disruptions (Malwarebytes) (CityAM).
The Business Impact of the CrowdStrike Outage
The BSOD issue had far-reaching consequences:
- Operational Disruptions: Businesses experienced significant downtime, affecting their ability to serve customers and perform critical operations.
- Financial Losses: The disruption led to financial losses due to halted operations and the costs associated with resolving the issue.
- Reputational Damage: Companies impacted by the outage faced potential reputational damage as customers and clients were affected by the downtime.
- Increased Security Risks: Although this was not a cyberattack, the incident underscored the vulnerabilities that can arise from software updates and the importance of having robust mitigation strategies in place (Malwarebytes) (Shacknews) (CityAM).
New Recovery Tool to help with CrowdStrike issue impacting Windows endpoints
As a follow-up to the CrowdStrike Falcon agent issue impacting Windows clients and servers, Microsoft has released an updated recovery tool with two repair options to help IT admins expedite the repair process. The signed Microsoft Recovery Tool can be found in the Microsoft Download Center: https://go.microsoft.com/fwlink/?linkid=2280386. Microsoft’s guidance includes detailed recovery steps for Windows clients, servers, and operating systems hosted on Hyper-V. The two repair options are as follows:
- Recover from WinPE – this option produces boot media that will help facilitate the device repair.
- Recover from safe mode – this option produces boot media so impacted devices can boot into safe mode. The user can then sign in using an account with local admin privileges and run the remediation steps.
Determining which option to use
Recover from WinPE (recommended option)
This option quickly and directly recovers systems and does not require local admin privileges. However, you may need to manually enter the BitLocker recovery key (if BitLocker is used on the device) and then repair impacted systems. If you use a third-party disk encryption solution, please refer to vendor guidance to determine options to recover the drive so that the remediation script can be run from WinPE.
Recover from safe mode
This option may enable recovery on BitLocker-enabled devices without requiring the entry of BitLocker recovery keys. For this option, you must have access to an account with local administrator rights on the device. Use this approach for devices using TPM-only protectors, devices that are not encrypted, or situations where the BitLocker recovery key is unknown. However, if utilizing TPM+PIN BitLocker protectors, the user will either need to enter the PIN if known, or the BitLocker recovery key must be used. If BitLocker is not enabled, then the user will only need to sign in with an account with local administrator rights. If third-party disk encryption solutions are utilized, please work with those vendors to determine options to recover the drive so the remediation script can be run.
Additional considerations
Although the USB option is preferred, some devices may not support USB connections. In such cases, Microsoft’s guidance provides detailed steps for using the Preboot Execution Environment (PXE) option. If the device cannot connect to a PXE network and USB is not an option, reimaging the device might be a solution.
As with any recovery option, test on multiple devices prior to using it broadly in your environment.
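The selection guidance above can be sketched as a small triage helper for sorting an inventory of affected devices. This is only an illustrative sketch: the `Device` fields, the `Encryption` categories, and the return labels (including the "escalate" fallback for cases the guidance does not cover) are hypothetical names chosen for this example, not part of Microsoft's tool, and the model covers only the TPM-only and TPM+PIN BitLocker protectors mentioned above.

```python
from dataclasses import dataclass
from enum import Enum


class Encryption(Enum):
    # Simplified, hypothetical categories for this sketch.
    NONE = "none"
    BITLOCKER_TPM_ONLY = "bitlocker_tpm_only"
    BITLOCKER_TPM_PIN = "bitlocker_tpm_pin"
    THIRD_PARTY = "third_party"


@dataclass
class Device:
    name: str
    encryption: Encryption
    recovery_key_known: bool = False
    pin_known: bool = False
    local_admin_available: bool = False


def choose_repair_option(device: Device) -> str:
    """Return 'winpe', 'safe_mode', 'vendor_guidance', or 'escalate'."""
    # Third-party disk encryption: defer to the encryption vendor.
    if device.encryption is Encryption.THIRD_PARTY:
        return "vendor_guidance"
    # WinPE is the recommended option and needs no local admin account,
    # but a BitLocker-protected drive may require the recovery key.
    # (Safe mode would also work on unencrypted devices; this sketch
    # simply prefers the recommended option.)
    if device.encryption is Encryption.NONE or device.recovery_key_known:
        return "winpe"
    # From here on: BitLocker is enabled and the recovery key is unknown.
    # Safe mode requires an account with local administrator rights.
    if not device.local_admin_available:
        return "escalate"  # assumption: no documented path, needs manual review
    if device.encryption is Encryption.BITLOCKER_TPM_ONLY:
        return "safe_mode"
    # TPM+PIN: the user must be able to enter the PIN.
    return "safe_mode" if device.pin_known else "escalate"


def choose_boot_media(usb_supported: bool, pxe_reachable: bool) -> str:
    """USB is preferred, PXE is the fallback, reimaging is the last resort."""
    if usb_supported:
        return "usb"
    if pxe_reachable:
        return "pxe"
    return "reimage"
```

Running `choose_repair_option` over an asset inventory export would give a first-pass grouping of devices by recovery path; the result should still be validated on a few devices per group before broad use.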
Learn more in the step-by-step fix on Microsoft’s website.
More workarounds and fixes are available on CrowdStrike’s official website (please be careful, as many attackers are exploiting this incident).
A Call to Re-Evaluate Your IT Stance
This incident is a wake-up call for businesses worldwide. It highlights the need to re-evaluate and strengthen IT infrastructure and cyber readiness. Here are some key steps your business should consider:
- Conduct a Comprehensive IT Audit: Evaluate your current IT infrastructure to identify vulnerabilities and areas for improvement. Ensure that your systems are robust and capable of handling unexpected disruptions.
- Implement Redundant Systems: Establish redundant systems and backup protocols to ensure business continuity in the event of a primary system failure.
- Strengthen Cybersecurity Measures: Regularly update and patch your security systems, and implement advanced threat detection and prevention tools to safeguard against potential cyber threats.
- Develop an Incident Response Plan: Create and regularly update an incident response plan that outlines the steps to take in the event of a cybersecurity incident or IT disruption. Ensure that all employees are aware of their roles and responsibilities in executing this plan.
- Engage with IT and Cybersecurity Experts: Consider partnering with IT and cybersecurity experts who can provide guidance and support in maintaining a secure and resilient IT infrastructure.
Businesses need to re-evaluate the following:
- IT Infrastructure Robustness:
  - Redundancy and Failover Mechanisms: Ensure that critical systems have redundancy and failover mechanisms in place to maintain operations during outages.
  - Backup Systems: Regularly update and test backup systems to ensure data integrity and availability during disruptions.
- Update and Patch Management:
  - Testing Procedures: Implement comprehensive testing procedures for updates and patches before deployment across the organization to avoid widespread issues.
  - Rollback Plans: Develop and maintain rollback plans to quickly revert to previous stable versions if an update causes issues.
- Incident Response Plans:
  - Response Protocols: Ensure that incident response plans are up-to-date and include clear protocols for addressing IT outages and cyber incidents.
  - Employee Training: Conduct regular training for employees on their roles and responsibilities during an incident to ensure swift and effective responses.
- Cybersecurity Measures:
  - Threat Detection and Prevention: Utilize advanced threat detection and prevention tools to identify and mitigate potential threats in real time.
  - Vulnerability Management: Regularly conduct vulnerability assessments and penetration testing to identify and address security gaps.
- Communication Strategies:
  - Internal Communication: Develop clear communication strategies for informing employees about ongoing issues and steps being taken to resolve them.
  - External Communication: Prepare templates and plans for communicating with customers, partners, and stakeholders during disruptions to maintain transparency and trust.
- Vendor and Third-Party Risk Management:
  - Vendor Assessments: Evaluate the security practices of third-party vendors and partners to ensure they meet your organization’s security standards.
  - Contracts and SLAs: Ensure that contracts and service level agreements (SLAs) with vendors include clauses for addressing security incidents and outages.
- Business Continuity Planning:
  - Continuity Plans: Update and test business continuity plans to ensure that essential functions can continue during IT disruptions.
  - Cross-Department Coordination: Foster coordination between IT and other departments to ensure a unified approach to continuity planning.
By addressing these areas, businesses can significantly enhance their resilience against IT outages and cyber incidents, ensuring continued operations and protection of critical assets.
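To make the update testing and rollback points above more concrete, here is a minimal sketch of a staged (ring-based) rollout: an update goes to a small test group first and only reaches the next group if every machine in the current one still reports healthy. The function name `staged_rollout` and the `deploy`, `is_healthy`, and `rollback` callbacks are hypothetical placeholders for whatever your endpoint management tooling provides; this is one possible approach, not a description of how any particular vendor ships updates.

```python
from typing import Callable, Sequence


def staged_rollout(
    rings: Sequence[Sequence[str]],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Deploy ring by ring; stop and roll back if a ring turns unhealthy.

    Returns True if every ring was updated and stayed healthy.
    """
    updated = []  # every host that has received the update so far
    for ring in rings:
        for host in ring:
            deploy(host)
            updated.append(host)
        # Gate: do not move on to the next ring unless all hosts are healthy.
        if not all(is_healthy(host) for host in ring):
            # Revert everything touched so far, most recent first.
            for host in reversed(updated):
                rollback(host)
            return False
    return True
```

A typical call would pass rings ordered from least to most critical, for example a test VM, then a pilot group, then production. Rolling back earlier rings as well as the failing one is a deliberately conservative choice in this sketch; some teams may prefer to revert only the ring that failed.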
Let’s Talk
At NVITS, we specialize in helping businesses navigate IT challenges and enhance their cyber readiness. Our team of experts is here to assist you in re-evaluating your IT stance, implementing robust cybersecurity measures, and ensuring your business is prepared for any future disruptions.