Security By Design - Part 12
On July 19, 2024, a Single Software Update Created IT Chaos Across the Entire World
A faulty CrowdStrike update, Channel File 291, delivered to the company’s Falcon endpoint detection and response sensor software, crashed more than 8.5 million computers. This created an immediate global IT outage affecting airlines, media outlets, banks, retailers, and other organizations that use Microsoft Windows operating systems.
“Largest IT Outage in History”
The World Economic Forum has described this as possibly the “largest IT outage in history”, one that reveals the vulnerability and interconnectedness of complex digital systems. It follows the continuing cleanup and losses from the Change Healthcare cyberattack, which occurred on February 19th, 2024, with $6.87 billion in total losses expected (including equity loss, legal fees, regulatory fines, class action claims, PR expenses, notification costs, and other expenses).
It is being reported that:
- 8.5+ million computers have been impacted.
- Fortune 500 companies are expected to incur $5.4 billion in direct cost impacts.
- Cyber insurers expect to pay out $1.5 billion in claims.
- Delta, America’s second-biggest airline, canceled 5,000 flights, disrupting travel for millions.
- CrowdStrike’s cybersecurity services are used by 298 of the Fortune 500 companies, 538 of the Fortune 1000 companies, and 8 out of 10 of the top financial firms.
- 163 companies hosted on Microsoft cloud services have been directly impacted by this outage (a full list can be found at “List Of Companies Affected By The Global Microsoft-Crowdstrike Outage”, Tech Business News). This does not include impacted companies that host their own servers, or that use other cloud services together with CrowdStrike Falcon endpoint detection and response services.
- At least one putative class action lawsuit has been filed by investors, who argue they were misled by the company when told its technology was “validated, tested and certified” before a faulty update triggered a global IT outage.
“We got a taste of what a global cyberattack would look like.”
Ian Thornton-Trump, CISO at Cyjax, has commented:
“We learned a valuable lesson about the fragility of our infrastructure,” Thornton-Trump said. “We also learned how dangerous – we got a taste of what a global cyberattack would look like … I feel like we are at one of those inflection moments.”
As reported by Chase DiFeliciantonio in the San Francisco Chronicle:
CrowdStrike’s widespread deployment across industry and government itself reflects a kind of critical security flaw, said Professor Ahmed Banafa at San Jose State University, who teaches courses on computer networking, operating systems, and cybersecurity. Friday’s incident appears to have been human error, “But what if this (had been) done intentionally?” he said.
“If you can bring millions of computers down, that is a single point of failure,” Banafa added. “I think Microsoft is really going to think about their relationship with CrowdStrike given what happened today.”
Global IT Supply Chain Implications and Impacts
Interos, a supply chain risk intelligence company, reports in its CrowdStrike Outage Analysis:
CrowdStrike was involved in a global IT outage that has highlighted the vulnerability of interconnected global supply chains. The outage impacted 674,620 direct customer relationships of CrowdStrike and Microsoft, and over 49 million indirectly, according to Interos data. While the U.S. was the most affected country, with 41% of impacted entities, the disruption was also felt at major ports and air freight hubs in Europe and Asia. Ports from New York to Los Angeles and Rotterdam reported temporary shutdowns, while air freight suffered the hardest blow, with thousands of flights grounded or delayed. The outage exacerbates existing supply chain challenges amid rising global demand and freight prices, highlighting the potential long-term implications for global trade and finance.
Interos analyzed the extended supply chains of both CrowdStrike and Microsoft, whose Microsoft 365 systems were disrupted as part of a CrowdStrike update, leading to outages for Microsoft users across the world. When examining the direct customer relationships (Tier 1) of both Microsoft and CrowdStrike, Interos was able to identify 674,620 customer relationships. When expanding the scope of impact to include the customers of Microsoft and CrowdStrike’s customers (Tier 2), the number of customer relationships identified by Interos data grows to over 28 million, and when going one step further (Tier 3), that figure increases to over 49 million customer relationships.
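The tier-by-tier expansion described above can be sketched as a breadth-first walk over a supplier-to-customer graph. This is only an illustration: the function name `reach_by_tier`, the toy graph, and the choice to count distinct downstream entities are all assumptions, and Interos may well count relationships differently.

```python
from typing import Dict, Iterable, List


def reach_by_tier(
    customers_of: Dict[str, List[str]],
    roots: Iterable[str],
    max_tier: int = 3,
) -> Dict[int, int]:
    """Cumulative count of distinct downstream entities within each tier.

    Tier 1 = direct customers of the roots, Tier 2 = their customers, etc.
    """
    seen = set(roots)          # entities already counted (roots excluded from totals)
    frontier = list(seen)      # entities whose customers we expand next
    cumulative: Dict[int, int] = {}
    total = 0
    for tier in range(1, max_tier + 1):
        next_frontier = []
        for supplier in frontier:
            for customer in customers_of.get(supplier, ()):
                if customer not in seen:
                    seen.add(customer)
                    next_frontier.append(customer)
        total += len(next_frontier)
        cumulative[tier] = total
        frontier = next_frontier
    return cumulative


# Hypothetical toy graph: supplier -> list of its customers.
toy_graph = {
    "VendorA": ["Airline", "Bank"],
    "VendorB": ["Bank", "Retailer"],
    "Airline": ["TravelAgency"],
    "Bank": ["Retailer", "Merchant"],
    "Merchant": ["Shopper"],
}
print(reach_by_tier(toy_graph, ["VendorA", "VendorB"]))  # {1: 3, 2: 5, 3: 6}
```

Even in this toy graph, two vendors with three direct customers reach twice as many entities by Tier 3, which is the same pattern of growth the Interos figures show at a far larger scale.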
Root Cause?
CrowdStrike and the media have reported extensively that the root cause of this cyber event was a failure in CrowdStrike’s automated software testing, which did not catch the bug in the Channel File 291 update to its Falcon endpoint detection and response sensor software.
REALLY?
A more discerning view would ask these questions of CrowdStrike:
- As a top-tier global supplier of cybersecurity services, why do you, and why did you, push a global patch out to all of your customers at the exact same instant?
- Why did you not test your patch in safe-zone target environments prior to pushing it out?
- Why did you not push this update/patch out to your customers incrementally, as a best practice, to minimize the potential impact of a failure such as this?
- What are your governance and controls for software quality testing and assurance, and where were they?
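The incremental rollout raised in these questions can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of CrowdStrike’s actual deployment pipeline: the names `staged_rollout`, `deploy`, and `is_healthy`, the ring sizes, and the failure threshold are all hypothetical.

```python
from typing import Callable, Dict, Sequence


def staged_rollout(
    hosts: Sequence[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    ring_fractions: Sequence[float] = (0.01, 0.10, 0.50, 1.0),  # assumed ring sizes
    max_failure_rate: float = 0.02,                             # assumed halt threshold
) -> Dict[str, object]:
    """Deploy to progressively larger rings; halt if a ring looks unhealthy."""
    done = 0  # number of hosts updated so far
    for fraction in ring_fractions:
        # Each ring extends coverage to `fraction` of the fleet (at least one new host).
        target = min(len(hosts), max(done + 1, int(len(hosts) * fraction)))
        ring = hosts[done:target]
        if not ring:
            continue
        for host in ring:
            deploy(host)
        failures = sum(1 for host in ring if not is_healthy(host))
        done = target
        if failures / len(ring) > max_failure_rate:
            # Stop here: the remaining hosts never receive the faulty update.
            return {"completed": False, "updated": done, "halted_at": fraction}
    return {"completed": True, "updated": done, "halted_at": None}


# Usage: a faulty update that makes every host fail its health check.
fleet = [f"host-{i}" for i in range(1000)]
print(staged_rollout(fleet, deploy=lambda h: None, is_healthy=lambda h: False))
# {'completed': False, 'updated': 10, 'halted_at': 0.01}
```

In this sketch a bad update stops after the first ring of 10 hosts instead of reaching all 1,000. The trade-off is that good updates take longer to reach the whole fleet.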
A more discerning view would also ask these questions of CrowdStrike’s customers:
- As a customer of CrowdStrike, why do you allow a third party to auto-push a patch directly onto your production operations servers?
- Why do you not test third-party software patches in a safe zone prior to widespread software update/patch deployment?
- What are your governance and controls for software quality testing and assurance, and where were they?
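On the customer side, one way to picture such a control is a simple promotion gate: a third-party update reaches production only after it has passed tests and run for a minimum period in a safe zone. The sketch below is hypothetical throughout; `VendorUpdate`, `may_promote_to_production`, and the four-hour soak period are illustrative assumptions, not features of any vendor’s product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorUpdate:
    """Record of a third-party update as observed in a staging (safe zone) environment."""
    vendor: str
    version: str
    passed_staging_tests: bool
    staging_soak_hours: float  # how long it has run in staging without incident


def may_promote_to_production(update: VendorUpdate, min_soak_hours: float = 4.0) -> bool:
    """Allow promotion only if staging tests passed and the soak period has elapsed."""
    return update.passed_staging_tests and update.staging_soak_hours >= min_soak_hours


# Usage: an update that passed its tests but has only soaked for one hour is held back.
candidate = VendorUpdate("ExampleVendor", "1.2.3", passed_staging_tests=True, staging_soak_hours=1.0)
print(may_promote_to_production(candidate))  # False
```

A gate like this involves a real trade-off for security content: every hour of delay is an hour in which endpoints lack the newest threat detections, so the soak period is a risk decision rather than a purely technical one.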
Security focusing only on Design aspects of Complex Systems will ensure Cyber Insecurity
In my last post, I commented that digital systems do not stand alone as technology systems. As soon as we deploy technology, it becomes part of a system, interconnecting and interacting with the other systems with which our business operates. These systems and interconnections extend throughout our organizations, as well as throughout our Tier 1, Tier 2, and Tier 3 extended customer and supplier interconnections, and so on.
These business systems comprise benevolent technology, people, and processes interconnecting our business with our customers and suppliers, as well as the shared open systems we use, such as the World Wide Web and the internet. Unfortunately, bad actors also use the open and shared internet and World Wide Web with the sole purpose of compromising our business operations. So focusing only on the secure design of technology products is not enough to secure our Businesses by Design.
Cyber events of minor or catastrophic consequence can also be triggered by non-malicious actors anywhere within our extended customer and supplier chains.
The very nature of the complex interconnections of digital systems means that Security by Design requires all aspects of security design and operations to work in concert, continually, across all tiers of our organization, customer, and supplier chain. Any gaps, mistakes, or lack of attentiveness by those who use, support, or operate our systems, including those responsible for oversight and assurance, will guarantee a cybersecurity compromise, whether by bad actors or by our own, our customers’, or our suppliers’ actors.
Security by Design
- People / Process
- People / Data
- People / Technology
- Process / People
- Process / Data
- Process / Technology
- Technology / People
- Technology / Data
- Technology / Process
- Data / Process
- Data / People
- Data / Technology
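Read as a whole, the list above appears to enumerate every ordered pairing of four dimensions: People, Process, Technology, and Data. That reading is an interpretation rather than something the list states, and the short sketch below (with the hypothetical helper `ordered_pairings`) simply checks the arithmetic behind it: four dimensions, each paired with the three others, give twelve interactions.

```python
from itertools import permutations
from typing import List, Sequence

DIMENSIONS = ("People", "Process", "Technology", "Data")


def ordered_pairings(dimensions: Sequence[str] = DIMENSIONS) -> List[str]:
    """Return every ordered pairing 'A / B' of two distinct dimensions."""
    return [f"{a} / {b}" for a, b in permutations(dimensions, 2)]


pairings = ordered_pairings()
print(len(pairings))  # 12
```

The pairings come out in a slightly different order than in the list, but the set of twelve is the same.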
Joseph F. Norton is a Risk, Security, and Crisis Management professional.
He is a founding member and Qualified Technology Executive of the Digital Directors Network, Chair of the Advisory Board with Next Era Transformation Group, and Chief Security Officer with APF Technologies.
He has served as Chief Security Officer, SVP at Atos; Chief Technology Officer and Head of Operations, SVP at Philips; Chief Technology Officer, SVP at Novartis; Executive-in-Residence with McKinsey & Company; and Chief Technology Officer at McDonald’s. During his career he has also held professional roles with JPMorgan Bank, Oracle, Sybase, Grumman Aerospace Corporation, and the United States Navy.
Copyright ©2024 by DivIHN Integration Inc.