Content Author

William Knowles

Senior ICS Security Consultant

Effective OT security monitoring: less network, more endpoint?

For those seeking to improve their security monitoring capability in Operational Technology (OT) environments, much of the marketing collateral aimed at the industry will tell you one thing: focus on the network and ensure you have OT-specific protocol awareness. However, is this really the most effective way to detect and then respond to threats? The answer might in fact be “no”, and there is a strong argument that efforts should instead initially be focused on endpoint monitoring.

OT network security monitoring and its limitations

There has been a recent surge in industry offerings that focus on the analysis of OT network traffic, with a key differentiator being their ability to parse and analyse OT-specific protocols. Such offerings are a positive addition to a wider detection and response programme, but they are often marketed as “essential”. For those looking to improve their OT security monitoring capability, these offerings have three central limitations that are important to consider when deciding on first steps.

The first relates to their intended visibility, which is insufficient as a primary means of detecting targeted threats. Before an attacker ever gets to the stage of influencing a physical process, there is an “attack positioning” phase in which they need to move laterally towards their target. However, OT-specific network monitoring offerings are often intended to be deployed in an area of the environment that represents the final phases of an attack (e.g., on field sites), so they will never see these earlier attacker actions. From a “cyber kill chain” perspective (a series of conceptual phases representing the actions an attacker takes as part of an attack), these solutions are heavily focused on the final stages. To respond to threats effectively, an organisation should look to improve its detection capability throughout these phases, but arguably with the heaviest emphasis on the first stages, so that an attack can be stopped before it ever gains traction.

The second relates to the ability of network security monitoring solutions to perform meaningful anomaly detection at scale. These solutions are often effective at detecting some primitive indicators of attacker activity (e.g., port and vulnerability scanning), but even a moderately skilled attacker would typically avoid such actions because they are so easy to detect. For the actions such an attacker would take instead, the solutions perform poorly, especially at scale (e.g., detecting lateral movement over protocols such as RPC and SMB, and distinguishing it from an environment's legitimate use of these protocols with an acceptable rate of false positives). This issue is compounded by the solutions' focus on OT-specific protocols rather than on these targeted attacker actions. It is also important to consider that a range of attack scenarios do not involve OT-specific protocols at any stage, and these solutions may not have the capability to detect them adequately. Examples include obtaining potentially sensitive information (e.g., schematics, controller logic, and architecture diagrams from workstations or file shares) and affecting the integrity of resources used for decision making (e.g., the databases and associated systems used when determining the raw resources to be purchased from upstream partners).

The third relates to the practical limits on coverage once these solutions are deployed. Network monitoring is typically performed by passively capturing data from switches using a port that receives a mirrored copy of all traffic on that switch. Due to the extensive network infrastructure in most real-world deployments, it is often highly challenging and impractical to obtain full visibility of network traffic. Where visibility is obtained, it often comes with the requirement of either deploying many decentralized analysis appliances (at a significant financial cost) or aggregating the mirrored traffic for analysis at a central location (at a significant computational cost from the increased network traffic, and potentially a financial one where geographical factors are considered).

Endpoint monitoring as the alternative

So what is the alternative? The IT industry has been wrestling with the challenges of effective network security monitoring for a number of years. As a broad generalization, there has been a strong shift away from performing network security monitoring on internal networks and towards detection on the endpoint, in order to enable faster response activities. This has led to the rise of a large number of Endpoint Detection and Response (EDR) vendors.

These EDR solutions provide greater insight into attacker actions across all stages of the cyber kill chain. For detection, EDR capabilities include real-time process tracing (e.g., process creation, command line arguments, process relationships), memory analysis (e.g., scanning memory for indicators of shared library injection, such as DLLs), log analysis (e.g., detecting malicious scripts, such as those written in PowerShell), and persistence location analysis (e.g., identifying when new entries are made in commonly abused locations). Coverage across an estate also allows techniques such as least-frequency analysis to be performed (e.g., spotting a service that is configured differently on one specific host), which further strengthens detection capability. For response, EDR capabilities include the ability to isolate hosts as well as collect forensic evidence. Such EDR solutions can be deployed in a basic anomaly detection manner to support reactive security monitoring, but their extensive capability also makes them vital tools for more proactive threat hunting.
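
As a rough illustration of the least-frequency analysis mentioned above, the following Python sketch groups a configuration value by the hosts that report it and surfaces values seen on very few hosts. The telemetry format, host names, service name and the `max_hosts` threshold are all hypothetical; a real EDR product would expose this kind of data through its own query interface.

```python
from collections import defaultdict


def least_frequent(observations, max_hosts=1):
    """Return {(item, value): [hosts]} for values seen on at most max_hosts hosts.

    observations: iterable of (host, item, value) tuples, for example a
    service name and the binary path it is configured to run.
    """
    hosts_by_value = defaultdict(set)
    for host, item, value in observations:
        hosts_by_value[(item, value)].add(host)

    # Keep only the (item, value) pairs that are rare across the estate.
    return {
        key: sorted(hosts)
        for key, hosts in hosts_by_value.items()
        if len(hosts) <= max_hosts
    }


# Hypothetical telemetry: the same service configured on four hosts.
observations = [
    ("hmi-01", "HistorianSvc", r"C:\Program Files\Historian\svc.exe"),
    ("hmi-02", "HistorianSvc", r"C:\Program Files\Historian\svc.exe"),
    ("hmi-03", "HistorianSvc", r"C:\Program Files\Historian\svc.exe"),
    ("eng-ws-07", "HistorianSvc", r"C:\Users\Public\svc.exe"),
]

outliers = least_frequent(observations)
for (item, value), hosts in outliers.items():
    print(f"{item} -> {value} seen only on {hosts}")
```

Here the only outlier is the service running from an unusual path on a single workstation. In practice the threshold would need tuning to the size of the estate, since a value can be rare for entirely legitimate reasons.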

So why does EDR not get much coverage for OT security monitoring? One potential reason is that marketing collateral is helped by pitching OT as being “different”, which is an easier argument to make where network security monitoring is concerned. The reality, however, is that “attack positioning” is largely similar for both IT and OT environments, especially when it comes to the activities of initial compromise, privilege escalation and lateral movement. Being effective here allows organisations to detect and respond to attacks before they ever get to a stage where OT-specific protocol awareness is a requirement. A second reason is that in some environments the infrastructure is managed by a third party, and there are restrictions on what software can be run on endpoints. Such challenges are not insurmountable, however. They can potentially be resolved through existing contract clauses where there is an obligation to ensure the security of these systems, and where an infrastructure refresh is planned, the requirement for endpoint monitoring can be included within service agreements.

OT network monitoring as a piece of the puzzle

Although network security monitoring has its limitations, that does not mean it has no value; rather, how and where it is used should be considered more carefully. In Applied Risk’s experience of assessing the OT security monitoring capabilities of organisations, there are two areas where network security monitoring delivers the most capability for its cost.

The first is at environment gateways, such as those between IT or OT and the internet, and potentially the bridge between IT and OT. Monitoring here is relatively simple to deploy, as network traffic is already aggregated and transmitted through a consolidated number of devices. This does have its own limitations, but it still provides an effective way to detect attacker actions through their external command and control channels (e.g., through behavioural anomalies in request patterns and the sites they interact with). The primary focus of such monitoring activities should be web traffic (e.g., HTTP/HTTPS); however, it is also important to consider other points of ingress and egress, such as mail servers, which help with detecting threats early in the kill chain (e.g., initial compromise).
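
One behavioural anomaly in request patterns is "beaconing", where an implant calls out to its command and control server at near-regular intervals. The Python sketch below is one simple way to look for it in gateway or proxy logs: it measures how regular the gaps between requests are for each source and destination pair. The log format, host names and both thresholds are assumptions for illustration, not taken from any particular product.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_beacons(requests, min_requests=5, max_cv=0.1):
    """Return {(source, destination): average_gap_seconds} for regular callers.

    requests: iterable of (timestamp_seconds, source_host, destination).
    A pair is flagged when the coefficient of variation (standard deviation
    divided by mean) of the gaps between its requests is at most max_cv.
    """
    times = defaultdict(list)
    for ts, src, dst in requests:
        times[(src, dst)].append(ts)

    suspects = {}
    for pair, stamps in times.items():
        if len(stamps) < min_requests:
            continue  # too few requests to judge regularity
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        if avg == 0:
            continue  # all requests share one timestamp
        if pstdev(gaps) / avg <= max_cv:
            suspects[pair] = round(avg, 1)
    return suspects


# Hypothetical proxy log entries: one regular caller, one irregular one.
requests = [(t, "eng-ws-07", "updates.example.net") for t in (0, 60, 121, 180, 241, 300)]
requests += [(t, "hmi-01", "vendor.example.com") for t in (5, 9, 400, 410, 2000)]

beacons = find_beacons(requests)
for (src, dst), gap in beacons.items():
    print(f"{src} -> {dst} roughly every {gap} seconds")
```

Only the workstation contacting the same destination about once a minute is flagged. Attackers can add jitter to defeat a check this simple, and legitimate software such as update agents also polls on a schedule, so a result like this is a lead for investigation rather than a verdict.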

The second is to perform the sort of OT network security monitoring often espoused by the marketing collateral that targets the industry, but only in key segments of the process control network (e.g., critical substations and field sites). This comes with the caveats discussed earlier in this post; however, if other controls fail, it can provide a detection mechanism of last resort for some, but not all, attacker objectives. Despite this, the aim should still be to detect and respond to an attack well before the attacker obtains this level of access.

For those who have covered the initial bases of endpoint and network security monitoring, improving detection capability around Active Directory, which is commonly abused by attackers, is also strongly encouraged. This will provide the capability to detect attacker actions such as domain enumeration, abuses of vulnerabilities within specific protocols (e.g., Kerberos attacks), and lateral movement. There are multiple industry vendors providing this capability, and it is also part of the Microsoft product suite in the form of Advanced Threat Analytics (ATA).
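
To make the Kerberos example more concrete, the sketch below looks for one well-known pattern, often called Kerberoasting, in which a single account requests service tickets for many different services using the weaker RC4 encryption type. Windows event ID 4769 (a Kerberos service ticket was requested) and encryption type 0x17 (RC4-HMAC) are standard values; the dictionary field names, account names and the `min_services` threshold are assumptions for illustration, and this is not how ATA or any specific vendor implements its detection.

```python
from collections import defaultdict

SERVICE_TICKET_REQUESTED = 4769  # Windows Security event ID
RC4_HMAC = 0x17                  # Kerberos encryption type 23


def flag_ticket_sprays(events, min_services=3):
    """Return {account: [services]} for accounts requesting RC4 tickets
    for at least min_services distinct services.

    events: iterable of dicts with the hypothetical keys
    "event_id", "account", "service" and "enc_type".
    """
    services = defaultdict(set)
    for ev in events:
        if ev["event_id"] == SERVICE_TICKET_REQUESTED and ev["enc_type"] == RC4_HMAC:
            services[ev["account"]].add(ev["service"])

    return {
        account: sorted(requested)
        for account, requested in services.items()
        if len(requested) >= min_services
    }


# Hypothetical, already-parsed domain controller events.
events = [
    {"event_id": 4769, "account": "jsmith", "service": "MSSQLSvc/db01", "enc_type": 0x17},
    {"event_id": 4769, "account": "jsmith", "service": "HTTP/historian", "enc_type": 0x17},
    {"event_id": 4769, "account": "jsmith", "service": "MSSQLSvc/db02", "enc_type": 0x17},
    {"event_id": 4769, "account": "operator1", "service": "HTTP/historian", "enc_type": 0x12},
    {"event_id": 4624, "account": "operator1", "service": "", "enc_type": 0},
]

flagged = flag_ticket_sprays(events)
for account, requested in flagged.items():
    print(f"{account} requested RC4 tickets for {requested}")
```

A production rule would also bound the time window and exclude legacy systems that legitimately still use RC4, both of which are omitted here for brevity.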

In conclusion, focus more on the endpoint, less on the network

Although they have their limitations, it can be stated with certainty that network-focused security monitoring appliances with OT-specific protocol awareness are a good thing. Having this capability can only improve detection and response. However, it is arguably important to consider them as a supplement rather than a primary mechanism. If you want to improve your detection and response capability across all stages of the cyber kill chain, there are strong arguments that you should initially focus your efforts elsewhere: namely, on the endpoint.
