Following
The File 291 incident — so named for
CrowdStrike recently answered some of these questions in
CrowdStrike has a regular track and fast track for updating cybersecurity threat sensors installed by customers on their Windows, Mac and Linux systems. These updates allow the sensors to detect new cybersecurity threats as CrowdStrike discovers them.
Updates issued via the fast track (CrowdStrike calls these updates Rapid Response Content) differ in design from updates issued via the regular track. Fast-tracked updates are built from templates that CrowdStrike can quickly fill out, and because the updates are template-based, CrowdStrike treats them as requiring far less testing than regular updates.
CrowdStrike calls the suite of tests it runs on fast-tracked updates a Content Validator. Last month, CrowdStrike learned the hard way that the Content Validator had a flaw. This flaw caused the test suite to overlook a problem in an update, which then went out to millions of Windows computers (Microsoft estimates 8.5 million of them) and crashed them.
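To make the idea of validating a templated update concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Template` class, the field names and the `validate_update` function are hypothetical and do not reflect CrowdStrike's actual Content Validator or update format. It only shows the general shape of a check that compares a filled-in update against the template it claims to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical template: the fields an update is expected to fill in."""
    name: str
    required_fields: tuple


def validate_update(template, update):
    """Return a list of problems; an empty list means the update passes."""
    problems = []
    # Every field the template defines must be present and non-empty text.
    for field in template.required_fields:
        value = update.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
        elif not isinstance(value, str) or not value.strip():
            problems.append(f"empty or non-text field: {field}")
    # Fields the template does not define are flagged, not silently accepted.
    for field in update:
        if field not in template.required_fields:
            problems.append(f"unexpected field: {field}")
    return problems


# Hypothetical example data.
TEMPLATE = Template("example-detection", ("pattern", "action"))
good = {"pattern": "suspicious.exe", "action": "alert"}
bad = {"pattern": "", "extra": "x"}
```

A validator like this is only as good as the cases it checks. If a check is missing or wrong, a faulty update passes, which is the general kind of gap the article describes.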
CrowdStrike had trusted its Content Validator and its templated design for fast-tracked updates to provide sufficient protection against a faulty update like the one that ultimately went out. The company said it had trusted the process in part because it had issued other templated updates without issue.
CrowdStrike will no longer trust this process alone to catch errors with fast-tracked updates, the company said in its post-incident review. The company promised additional testing processes to catch problems like the one that caused the File 291 incident last month.
Among the new testing CrowdStrike has promised is local developer testing. This type of testing involves deploying an update to developers’ computers before it goes out to the broader public. This allows developers to catch any glaring issues (like a “blue screen of death”) before an update goes out into the wild. It’s a basic measure and standard practice in the software engineering industry.
CrowdStrike also promised better error handling in the software that crashed when running the problematic update. Ideally, this would ensure that even if an error in CrowdStrike’s code causes a piece of its threat detection sensor to fail, the rest of the computer can continue to boot up and run as normal.
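The principle behind this kind of error handling can be sketched in a few lines of Python. This is an analogy, not the sensor's actual code: the rule format (`name=pattern`) and the functions `parse_rule` and `load_rules` are hypothetical. The point is that one malformed item is logged and skipped instead of bringing down everything around it.

```python
import logging

logger = logging.getLogger("sensor")


def parse_rule(raw):
    """Hypothetical parser: a rule is 'name=pattern'; anything else is an error."""
    name, sep, pattern = raw.partition("=")
    if not sep or not name or not pattern:
        raise ValueError(f"malformed rule: {raw!r}")
    return name, pattern


def load_rules(raw_rules):
    """Load every rule that parses; skip and log the ones that do not."""
    loaded, skipped = {}, []
    for raw in raw_rules:
        try:
            name, pattern = parse_rule(raw)
        except ValueError as err:
            # Contain the failure to this one rule and keep going.
            logger.warning("skipping rule: %s", err)
            skipped.append(raw)
            continue
        loaded[name] = pattern
    return loaded, skipped


loaded, skipped = load_rules(["a=foo", "broken", "b=bar"])
# loaded is {"a": "foo", "b": "bar"}; skipped is ["broken"]
```

The trade-off in a design like this is that a skipped rule means a threat may go undetected, so the failure should be reported rather than silently ignored.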
The cybersecurity company also said it would start using more advanced testing techniques, such as rollback testing, stress testing, fuzzing and fault injection. These techniques add a layer of redundancy on top of the more basic tests CrowdStrike has promised.
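Of these techniques, fuzzing is the easiest to illustrate: feed a piece of code large amounts of random or malformed input and watch for failures it was not designed to handle. The Python sketch below is a toy version under stated assumptions. The two parsers and the `fuzz` helper are hypothetical, and real fuzzing tools are far more sophisticated about how they generate input.

```python
import random


def parse_rule(raw):
    """Hypothetical careful parser: rejects bad input with ValueError."""
    name, sep, pattern = raw.partition("=")
    if not sep or not name or not pattern:
        raise ValueError(f"malformed rule: {raw!r}")
    return name, pattern


def fragile_parse(raw):
    """Hypothetical buggy parser: assumes an '=' is always present."""
    parts = raw.split("=")
    return parts[0], parts[1]  # IndexError when there is no '='


def fuzz(parser, runs=1000, seed=0):
    """Throw random short strings at a parser; collect unexpected failures."""
    rng = random.Random(seed)  # fixed seed so a failing run can be repeated
    alphabet = "ab=\x00 \n"
    crashes = []
    for _ in range(runs):
        length = rng.randint(0, 8)
        raw = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            parser(raw)
        except ValueError:
            pass  # a clean rejection is the expected outcome for bad input
        except Exception as err:
            crashes.append((raw, err))  # anything else is a bug worth fixing
    return crashes


careful_crashes = fuzz(parse_rule)
fragile_crashes = fuzz(fragile_parse)
```

Here the careful parser survives every input, while the fragile one fails on any string without an `=`, a case a hand-written test suite could easily miss.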
Additionally, CrowdStrike is still developing a root cause analysis, which is likely to reveal more about the high-level thinking at the company that allowed, for example, fast-tracked updates to face such scant testing before being issued to the public. The company has not provided a timeline for when it will release this analysis.