
Manufacturing: NIST Wants to Upgrade the Incident Response Playbook

NIST releases its first concrete OT recovery playbook, and it looks nothing like an IT runbook. The document is formally aimed at manufacturing, but the problem it addresses is structural across every operational technology environment where stopping production has physical consequences.

Most incident response plans assume that containment is an acceptable fallback. In a corporate IT environment, that assumption typically holds. If the organization can isolate the infected segment and preserve its logs, it can then begin recovery. The physical world keeps on keeping on regardless.

On a manufacturing plant floor, that assumption breaks down fast, especially because too many manufacturers lack the capabilities needed to recover, including workable backups. According to backup and recovery firm Macrium's Current State of Backup and Recovery in Manufacturing report, only 54% of OT, ICS, and SCADA systems are backed up at all.

"Imagine today somebody deploys ransomware and takes over your OT, and you have no backups and no tested procedures to come back online. How long is it going to take to come back, versus having something in place that's going to help you recover?" asked Hector Perez, global head of strategy and revenue, industrial cybersecurity at Black & Veatch, in his presentation Quantifying Cyber Risk in Dollars: A Better Way To Fund And Prioritize OT Security.

Perez also stressed the impact quality response plans can have not only on availability but also on the bottom line. "Let's say today you get attacked, and you have no protection, and your total impact is 81 to 150 million. Now, say you improve your backups and have tested response plans: your impact goes down to 38-62 million. For every $1 you put in, you get 4.3 to 8.8 dollars back in risk mitigated, focusing just on the impact."
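The arithmetic behind Perez's figures can be checked with a short sketch. The impact ranges below come from his quote. The investment amount does not appear in the quote: the $10 million used here is an assumption, chosen only because it is the value that reproduces the quoted 4.3-to-8.8 return when the low and high ends of each range are paired.

```python
# Impact ranges in millions of dollars, taken from the quote.
impact_unprotected = (81.0, 150.0)
impact_with_plans = (38.0, 62.0)

# ASSUMPTION: the spend is not stated in the quote. $10M is the value
# implied by the quoted 4.3x-8.8x return, not a sourced figure.
assumed_investment = 10.0


def avoided_loss(before, after):
    """Pair low with low and high with high, and return the loss avoided."""
    return (before[0] - after[0], before[1] - after[1])


def return_per_dollar(avoided, investment):
    """Risk mitigated per dollar invested, rounded to one decimal place."""
    return tuple(round(a / investment, 1) for a in avoided)


avoided = avoided_loss(impact_unprotected, impact_with_plans)
ratio = return_per_dollar(avoided, assumed_investment)

print(f"Avoided loss: ${avoided[0]:.0f}M to ${avoided[1]:.0f}M")
print(f"Return per $1 invested: {ratio[0]} to {ratio[1]}")
```

Under that assumption, the avoided loss is $43 million to $88 million, which matches the quoted return per dollar. A different investment figure would scale the ratio accordingly.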

To address the dearth of such plans at manufacturers, the National Institute of Standards and Technology (NIST) and the National Cybersecurity Center of Excellence (NCCoE) released the initial public draft of SP 1800-41, "Responding to and Recovering from a Cyber Attack: Cybersecurity for the Manufacturing Sector," on May 21. Although the document targets manufacturing, the same problem applies to other operational technology environments where stopping production has physical consequences, such as water treatment, energy distribution, and chemical processing. The comment period runs through July 8.

The framework itself is a five-phase reference architecture covering detection, containment, eradication, recovery, and post-incident analysis. These categories are familiar to anyone who has read an IR plan. What distinguishes SP 1800-41 is where it diverges. NIST explicitly addresses log preservation practices tuned to OT environments, containment strategies that must not break physical safety interlocks, and deterministic clean restoration processes for industrial control systems. Those caveats do not appear in standard IT playbooks because standard IT environments do not require them.

The underlying tension is not new, but it remains underaddressed. For decades, IT security frameworks have been adapted, sometimes carelessly, to OT environments that operate on fundamentally different principles. In IT, availability is important. In OT, availability is often the safety mechanism. Shutting down a compromised server to contain malware is a reasonable response. Shutting down a compromised programmable logic controller mid-process can be catastrophic and, depending on the process, more dangerous than leaving it running while a measured response is developed.
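A minimal sketch can illustrate the difference between the two containment decisions. It is not drawn from SP 1800-41. The asset fields, the action names, and the decision rule are all hypothetical, and a real plan would rely on process safety engineering rather than two boolean flags.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Hypothetical action names, for illustration only.
    SHUT_DOWN = "shut down and isolate"
    HOLD_AND_ESCALATE = "keep running; escalate for a measured response"


@dataclass
class Asset:
    name: str
    is_controller: bool    # True for a PLC or similar device driving a process
    process_running: bool  # True if a physical process is mid-cycle


def choose_containment(asset: Asset) -> Action:
    """Pick a containment action without assuming that 'off' means 'safe'."""
    # An IT server: shutting it down to contain malware is reasonable.
    if not asset.is_controller:
        return Action.SHUT_DOWN
    # A controller mid-process: stopping it could itself cause a physical
    # incident, so the decision goes to people who understand the process.
    if asset.process_running:
        return Action.HOLD_AND_ESCALATE
    # A controller with no active process can be taken offline.
    return Action.SHUT_DOWN


file_server = Asset("file-server-01", is_controller=False, process_running=False)
mixing_plc = Asset("mixing-line-plc", is_controller=True, process_running=True)

print(file_server.name, "->", choose_containment(file_server).value)
print(mixing_plc.name, "->", choose_containment(mixing_plc).value)
```

The point of the sketch is the second branch: the IT default of powering off a compromised device is only one option for a controller, and not necessarily the safe one.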

Security practitioners who have spent time in manufacturing or industrial environments are familiar with the gap. The problem is that many organizations have not staffed or structured their security programs to close it. IR plans written by security teams without OT expertise, or adapted wholesale from IT frameworks, frequently omit process safety considerations entirely. The assumption is that "containment" means the same thing everywhere.

It doesn't.

SP 1800-41 is a rare instance of a concrete, post-breach guidance document for OT environments rather than another prevention-focused framework. Most ICS security guidance concentrates on hardening and detection. What happens after a compromise, and specifically how to recover without triggering a physical incident, has received far less formal treatment. That gap has consequences: a 2024 Dragos report noted that ransomware operators increasingly target industrial organizations precisely because operational disruption creates pressure to pay, and because OT recovery is poorly understood even by the organizations that operate those systems.

Chris Wolski, founder at Applied Security Convergence, stressed the importance and the benefits of resilience in his talk, Enhancing OT Cybersecurity in Maritime Environments.

"What made Port of Houston's attack stopped successfully was the fact that we were resilient and we were able to get in and respond [Volt Typhoon zero-day attack]. We slowed the attacker down and in two hours, we had that under control, and by about 12 hours, we had fully remediated the situation. That attack was a nation-state attack and a zero-day on top of it," he said.

The July 8 comment deadline gives security leaders, OT engineers, and IR professionals the opportunity to shape a document that is still in draft. Organizations that have navigated ICS incidents and developed recovery procedures the hard way carry knowledge that NIST's reference architecture would benefit from. Submitting that expertise is more useful than waiting for the final version and later having to deal with what it missed.
