
OT/ICS Cybersecurity

Final Thoughts on ICS Continuous Hardening: Shifting the Focus to Risk-Based Strategies

This is my final blog post of the year on ICS Continuous Hardening, and I wanted to make it impactful. I wondered, what should I discuss? What can I say that hasn’t been said in my October blog or my September blog on the same topic? In these situations, I think Dale Carnegie's advice holds true: “An hour of planning can save you 10 hours of doing.” With that in mind, I turned to the trusty World Wide Web for inspiration. I spent over an hour searching and reading various websites, including those of industry analysts, other vendors and media, and even revisited our own content. Below are the areas of ICS continuous hardening that I found people talking about.

 

Summary of Continuous Hardening of ICS 

| Work Process | Main Task | Action |
| --- | --- | --- |
| Asset Inventory and Management | Identify and Document Assets | Create a detailed inventory of all ICS components, including hardware, software and network devices. |
| | Update Regularly | Keep the inventory updated with any changes or additions to the system. |
| Network Segmentation | Segregate Networks | Divide the OT network into segments based on functionality and criticality. Implement demilitarized zones (DMZs) between IT and OT networks. |
| | Control Access | Use firewalls and access control lists (ACLs) to restrict traffic between segments. |
| Access Control | Role-Based Access Control (RBAC) | Implement RBAC to ensure users have access only to the resources necessary for their roles. |
| | Regular Audits | Periodically review and update access controls to ensure they are current and effective. |
| Patch Management | Regular Updates | Apply patches and updates to ICS components and software as they become available. |
| | Testing | Test patches in a controlled environment before deployment to ensure they do not disrupt operations. |
| Incident Response Planning | Develop an Incident Response Plan | Create and regularly update an incident response plan tailored to ICS and OT environments. |
| | Training and Drills | Conduct regular training and simulated drills for incident response teams. |
| Monitoring and Logging | Continuous Monitoring | Implement continuous monitoring of ICS networks and systems for unusual activity. |
| | Log Management | Collect and analyze logs from all critical systems and network devices to detect potential security incidents. |
| Security Policies and Procedures | Develop and Enforce Policies | Establish comprehensive security policies and procedures specific to OT environments. |
| | Regular Reviews | Periodically review and update security policies to reflect changes in the threat landscape and operational requirements. |
| Physical Security | Restrict Physical Access | Ensure that physical access to ICS components is restricted to authorized personnel only. |
| | Environmental Controls | Implement controls to protect against environmental threats such as temperature, humidity and power fluctuations. |
| Backup and Recovery | Regular Backups | Perform regular backups of critical ICS data and configurations. |
| | Test Recovery Procedures | Regularly test backup and recovery procedures to ensure data integrity and availability in case of an incident. |
| Threat Intelligence and Vulnerability Management | Stay Informed | Subscribe to threat intelligence feeds relevant to OT environments. |
| | Vulnerability Assessments | Conduct regular vulnerability assessments of ICS endpoints and their associated components. |
| Employee Training and Awareness | Regular Training | Provide ongoing training for employees on security best practices and emerging threats. |
| | Awareness Programs | Implement security awareness programs to reinforce the importance of security in daily operations. |
| Third-Party Risk Management | Evaluate Vendors | Assess the security practices of third-party vendors and service providers. |
| | Contractual Obligations | Include security requirements in contracts with third-party vendors. |
| Compliance and Audits | Adhere to Standards | Ensure compliance with relevant industry standards and regulations (e.g., NIST, IEC 62443, NERC CIP). |
| | Regular Audits | Conduct regular security audits to verify compliance and identify areas for improvement. |

  

It is clear that there is a lot of discussion about executing various tasks and work processes, with the assumption that everyone should be doing these things on every asset, all the time. However, there was little mention of risk. Given that resources and funds are limited and unexpected outages are unacceptable, we must balance outages, efforts and expenditures to ensure profitable, sustainable, safe and secure operations. This balance requires shifting the conversation from task execution to identifying, evaluating and prioritizing risks. Tasks should exist solely to address prioritized risks, not the other way around.  

For some assets, we might need to implement all the suggested work processes and more, while for others, we may only need to do a small portion. Without understanding our risks, it is impossible to determine which actions are necessary, which are nice to have and which are not needed at all. 

The primary function of any security program is to mitigate risk to an acceptable level. With a prioritized list of risks, we can address each concern methodically by choosing the appropriate work processes. It's important to recognize that some level of risk will always exist; it can never be completely eliminated.

Consider this scenario: we have a vulnerability with an Attack Vector of "Adjacent Network," an Attack Complexity of "Low," and a CVSS Base Score of 9.0 (which puts it in the "Critical" severity range). It has known exploits and an associated ICS vendor-approved patch. Should we stop everything and start patching immediately? Should we patch every device with that weakness? The correct answer is: it depends.

If the vulnerability is found on a device in a connected network that serves as the configuration server for a Safety Instrumented System (SIS) linked to a critical part of the plant, we would likely apply the patch in the next patching cycle, if not sooner. However, if the same vulnerability is found on an isolated network or device, we might wait until the next outage or turnaround to apply the patch.
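The contrast above can be sketched in a few lines of Python. This is only an illustration of the idea that the same vulnerability leads to different timing depending on where it sits; the `AssetContext` fields, the asset names and the fallback branch are assumptions made for this example, not part of any product or standard.

```python
from dataclasses import dataclass


@dataclass
class AssetContext:
    """Hypothetical description of where a vulnerable device sits."""
    name: str
    network_connected: bool      # reachable from other networks, i.e. not isolated
    supports_critical_sis: bool  # e.g. configuration server for an SIS


def patch_timing(asset: AssetContext) -> str:
    """Return an illustrative patch window for one and the same vulnerability."""
    if asset.network_connected and asset.supports_critical_sis:
        return "next patching cycle, if not sooner"
    if not asset.network_connected:
        return "next outage or turnaround"
    # Connected but not tied to a critical SIS: the scenario gives no rule,
    # so this sketch assumes the decision goes to a site-specific risk review.
    return "decide by risk review"


sis_server = AssetContext("sis-config-server", True, True)
isolated_hmi = AssetContext("isolated-hmi", False, False)

print(sis_server.name, "->", patch_timing(sis_server))
print(isolated_hmi.name, "->", patch_timing(isolated_hmi))
```

Note that the CVSS score never appears as an input: in this sketch it is identical for both assets, so it cannot be what separates the two answers.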

When dealing with vulnerabilities, the first step is to justify the need for the component that has the security weakness; if it isn’t needed, removing it removes the exposure. Next, it's crucial to confirm that the risk justifies any action, as the CVSS score alone doesn’t represent risk. If the risk justifies action and the component can’t be removed, the next step is to evaluate whether to upgrade or patch. An available patch isn’t automatically the best remediation.

Finally, if the risk is worth addressing and upgrading or patching isn’t an ideal path, other remediation methods like firewall rules, access controls and whitelisting should be considered. 
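The sequence of questions described above can be written down as a small decision function. The sketch below is one possible reading, not a definitive procedure: the `Finding` fields and the returned strings are hypothetical, and each boolean stands in for a judgment that a person would make from a real risk evaluation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """Hypothetical summary of the answers for one vulnerable component."""
    component_needed: bool            # can we justify keeping the component?
    risk_justifies_action: bool       # result of a risk evaluation, not the CVSS score
    viable_fix: Optional[str] = None  # "upgrade", "patch", or None if neither is ideal


def choose_remediation(finding: Finding) -> str:
    """Walk the questions in order and return an illustrative outcome."""
    if not finding.component_needed:
        return "remove component"
    if not finding.risk_justifies_action:
        return "no action"
    if finding.viable_fix in ("upgrade", "patch"):
        return finding.viable_fix
    # Risk is worth addressing, but upgrading or patching is not an ideal path.
    return "other remediation (firewall rules, access controls, whitelisting)"


print(choose_remediation(Finding(component_needed=False, risk_justifies_action=True)))
print(choose_remediation(Finding(True, True, viable_fix="patch")))
print(choose_remediation(Finding(True, True, viable_fix=None)))
```

The order of the checks is the point: whether a patch exists is only considered after the component and the risk have both been justified.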

By applying the basic risk equation Risk = Likelihood × Consequence to your environment, you will likely discover that you are expending too much energy on tasks that do not significantly reduce risk, and too little on work processes that would have a meaningful impact.
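As a rough illustration of how that equation produces a prioritized list, here is a minimal Python sketch. The 1-to-5 scales, the register entries and their ratings are all invented for the example; a real program would use its own scales and assessed values.

```python
from typing import List, NamedTuple


class Risk(NamedTuple):
    description: str
    likelihood: int   # assumed scale: 1 (rare) to 5 (almost certain)
    consequence: int  # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        # Risk = Likelihood x Consequence
        return self.likelihood * self.consequence


def prioritize(risks: List[Risk]) -> List[Risk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical register entries with made-up ratings.
register = [
    Risk("Vulnerable workstation on an isolated network", 1, 4),
    Risk("Vulnerable SIS configuration server on a connected network", 4, 5),
    Risk("Unpatched historian in the DMZ", 3, 2),
]

for risk in prioritize(register):
    print(f"{risk.score:>2}  {risk.description}")
```

With these made-up ratings the SIS configuration server scores 20, the historian 6 and the isolated workstation 4, so effort goes to the first item before the others.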

Maybe an hour spent identifying, evaluating and prioritizing risks can save you dozens of hours in implementing ICS continuous hardening work processes. 

About the Author

Nick Cappi is Vice President, Portfolio Strategy and Enablement for OT Cybersecurity in Hexagon's Asset Lifecycle Intelligence division. Nick joined PAS in 1995; PAS was acquired by Hexagon in 2020. In his role, Nick oversees the commercial success of the business, formulates and prioritizes the strategic themes, and works with product owners to set strategic product direction. During his tenure at PAS, Nick held a variety of positions including Vice President of Product Management and Technical Support, Director of Technical Consulting, Director of Technology, Managing Director for Asia Pacific Region, and Director of Product Management. Nick brings over 26 years of industrial control system and cybersecurity experience within the processing industries.
