Navigating the Complexities of Patch Deployment in ICS Environments: Balancing Risk and Operational Uptime

 

Determining the frequency of patch deployments in ICS environments is complex. This is because it varies widely based on factors such as geographical location, industry, regulations, corporate oversight, market conditions and asset criticality. Typically, patches are deployed according to ICS vendors' patch bulletins, which approve sets of patches for specific technologies. However, the time between a vendor's approval and the actual application of a patch can range from a few days to several months and, in extreme cases, even years. In this blog, I’ll explore why patch management should be part of a broader strategy to effectively mitigate risk.

  

Is patching the best method? It depends. 

Automation technologies such as Microsoft Windows Server Update Services (WSUS) are commonly used in heavy processing industries to streamline the patch deployment process. While automation reduces the manual effort involved, it can also create a false sense of security. During vulnerability assessments, it is not uncommon to find devices at Level 2 or Level 3 of the Purdue reference model that have large attack surfaces but were overlooked and never added to the automated patching group. Because devices in ICS environments are frequently added, removed or modified, gaps in patch coverage open up over time. Without proper checks and balances, these gaps can remain unnoticed for years.
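One simple check is to compare the asset inventory against the membership of the automated patching group on a regular schedule. The sketch below is illustrative only: the function name, the asset names and the idea of exporting both lists as plain hostnames are assumptions for this example, not features of WSUS or any other product.

```python
def find_patch_coverage_gaps(inventory, patch_group):
    """Return inventory assets that are missing from the patching group.

    Both arguments are lists of hostnames (hypothetical exports from an
    asset inventory and from the patching tool). Names are compared
    case-insensitively and with surrounding whitespace ignored.
    """
    covered = {name.strip().lower() for name in patch_group}
    return sorted(name for name in inventory if name.strip().lower() not in covered)


# Hypothetical Level 2/3 assets and patch group membership.
inventory = ["HMI-01", "HIST-02", "ENG-WS-03", "OPC-04"]
patch_group = ["hmi-01", "OPC-04"]

gaps = find_patch_coverage_gaps(inventory, patch_group)
print(gaps)  # ['ENG-WS-03', 'HIST-02']
```

Running a comparison like this after every change to the environment is one way to catch devices that were added but never enrolled.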

The primary goal of any security program is to mitigate risk to an acceptable level. It is important to recognize that patching is just one of many work processes to reduce risk in OT environments, and not necessarily the best one. When addressing a vulnerability, the first step should be to confirm that the component with the security weakness is actually needed. Next, assess whether the risk justifies any action, because the CVSS score alone does not represent the risk. If action is warranted and the component cannot be removed, evaluate whether to upgrade or patch. An available patch is not always the best remediation.
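The triage sequence in this paragraph, together with the fallback to other remediation methods discussed next, can be written as a small decision function. This is a minimal sketch: the `Finding` fields and the returned labels are hypothetical names chosen for illustration, and each boolean stands in for a judgment that people at the site would make.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One vulnerability finding (all field names are illustrative)."""
    component_needed: bool           # is the vulnerable component justified?
    risk_warrants_action: bool       # assessed risk, not just the CVSS score
    patch_or_upgrade_feasible: bool  # can the site actually patch or upgrade?


def choose_work_process(finding: Finding) -> str:
    """Walk the triage steps in order and return the suggested work process."""
    if not finding.component_needed:
        return "remove component"
    if not finding.risk_warrants_action:
        return "no action"
    if finding.patch_or_upgrade_feasible:
        return "evaluate upgrade vs. patch"
    # Firewall rules, access controls, whitelisting and similar measures.
    return "apply other remediation"


print(choose_work_process(Finding(True, True, False)))  # apply other remediation
```

The order of the checks matters: removal and risk assessment come before any discussion of patches, so a patch is only considered when it is the right tool.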

Other remediation methods, such as firewall rules, access controls and whitelisting, should be considered if patching or upgrading is not feasible. In critical environments, patches often require extensive testing, which can make patch deployment less than optimal. This testing might involve ring deployments, a progressive method of patch deployment that minimizes risk and protects system availability by rolling out updates to asset groups in sequence, starting with the least critical and moving to the most critical.
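A ring deployment as described above can be sketched in a few lines. Everything here is an assumption made for illustration: the numeric criticality scale (lower means less critical), the asset names, and the `apply_patch` and `is_healthy` callbacks, which stand in for whatever deployment and verification steps a site actually uses.

```python
from itertools import groupby


def plan_rings(assets):
    """Group (name, criticality) pairs into rings, least critical first."""
    ordered = sorted(assets, key=lambda a: a[1])
    return [[name for name, _ in group]
            for _, group in groupby(ordered, key=lambda a: a[1])]


def deploy_in_rings(rings, apply_patch, is_healthy):
    """Patch one ring at a time and stop at the first unhealthy ring.

    Returns (completed_rings, halted_ring); halted_ring is None when
    every ring was patched and verified.
    """
    completed = []
    for ring in rings:
        for asset in ring:
            apply_patch(asset)
        if not all(is_healthy(asset) for asset in ring):
            return completed, ring  # more critical rings stay untouched
        completed.append(ring)
    return completed, None


assets = [("HIST-02", 2), ("TEST-WS", 1), ("HMI-01", 3), ("ENG-WS-03", 2)]
rings = plan_rings(assets)
print(rings)  # [['TEST-WS'], ['HIST-02', 'ENG-WS-03'], ['HMI-01']]
```

If a health check fails in an early ring, the most critical assets have not yet been changed, which is the point of the method.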

Despite efforts, the patch management issue in ICS environments remains unresolved. Discrepancies between corporate policy and site-level execution are common, as are inconsistencies between operating units within the same facility. What is consistent is adherence to vendor-approved patch lists, which relieves Owner Operators of the need to determine patch priorities and lets them focus on implementing vendor recommendations. While this approach has its benefits, it does not effectively manage risk and often leads to significant effort with minimal risk reduction.

To truly mitigate risk, patch management should be part of a broader strategy that includes asset visibility, vulnerability management, obsolescence management, backups and configuration management. Shifting the conversation to a recurring process of identifying, evaluating and prioritizing risks in OT environments is essential. With a prioritized list of risks, appropriate work processes can be chosen to address each concern.  

 

Risk will always exist 

It is important to accept that some level of risk will always exist and cannot be completely eliminated. If patching is the appropriate work process, consider the timing of deployment in the context of the environment. For example, applying all vendor-approved patches might be ideal during a facility outage or turnaround. However, the decision to patch should depend on the specific context of the vulnerability and its potential impact on operations. 

Operational uptime is critical for Owner Operators, but the key is balancing the need for patching with the demands of profitable, sustainable, safe and secure operations. This balance requires shifting from task execution to identifying, evaluating and prioritizing risks, so that the appropriate tasks are performed in the correct timeframe to address the most pressing issues.

By applying the basic risk equation, Risk = Likelihood x Consequence, organizations can better allocate resources to activities that significantly reduce risk and avoid expending energy on tasks that do not. I did a webinar with SANS a while back discussing cyber risk if you would like to learn more. You can access the webinar here. 
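As a small worked example of the risk equation, the sketch below ranks a handful of items by Likelihood x Consequence. The 1 to 5 scales, the item descriptions and the scores are invented for illustration; a real assessment would use the organization's own scales and data.

```python
def risk_score(likelihood, consequence):
    """Basic risk equation: Risk = Likelihood x Consequence."""
    return likelihood * consequence


# (description, likelihood 1-5, consequence 1-5); all values are hypothetical.
risks = [
    ("Unpatched historian on an isolated network", 1, 3),
    ("Exposed remote access on an engineering workstation", 4, 5),
    ("Legacy HMI with a vendor-approved patch available", 2, 4),
]

# Highest risk first, so effort goes where it reduces risk the most.
ranked = sorted(risks, key=lambda r: risk_score(r[1], r[2]), reverse=True)

for description, likelihood, consequence in ranked:
    print(risk_score(likelihood, consequence), description)
# 20 Exposed remote access on an engineering workstation
# 8 Legacy HMI with a vendor-approved patch available
# 3 Unpatched historian on an isolated network
```

In this made-up ranking, the item with an available patch is not the top priority, which illustrates why following a patch list alone does not necessarily address the most pressing risk.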

About the Author

Nick Cappi is Vice President, Portfolio Strategy and Enablement for OT Cybersecurity in Hexagon's Asset Lifecycle Intelligence division. Nick joined PAS in 1995; Hexagon acquired PAS in 2020. In his role, Nick oversees the commercial success of the business, formulates and prioritizes the strategic themes, and works with product owners to set strategic product direction. During his tenure at PAS, Nick has held a variety of positions including Vice President of Product Management and Technical Support, Director of Technical Consulting, Director of Technology, Managing Director for Asia Pacific Region, and Director of Product Management. Nick brings over 26 years of industrial control system and cybersecurity experience within the processing industries.