Spire: Intrusion-Tolerant SCADA for the Power Grid
Amy Babay
University of Pittsburgh, School of Computing and Information
Electric Power Industry Conference (EPIC), October 2019

Importance of SCADA for the Power Grid
• Supervisory Control and Data Acquisition (SCADA) systems form the backbone of critical infrastructure services
• To preserve control and monitoring capabilities, SCADA systems must be constantly available and run at their expected level of performance (able to react within 100-200 ms)
• SCADA system failures and downtime can cause catastrophic consequences, such as equipment damage, blackouts, and human casualties
Emerging Power Grid Threats
• Perimeter defenses are not sufficient against determined attackers
  – Stuxnet, Dragonfly/Energetic Bear, BlackEnergy (Ukraine 2015), CrashOverride (Ukraine 2016)
  – The power grid is becoming a target for nation-state attackers
SCADA Vulnerabilities
SCADA systems are vulnerable on several fronts:
• SCADA system compromises
  – SCADA Master: system-wide damage
  – RTUs, PLCs: limited local effects
  – HMIs
• Network level attacks
  – Routing attacks that disrupt or delay communication
  – Isolating critical components from the rest of the network
[Figure: conventional SCADA architecture with an HMI, Primary SCADA Master, Backup SCADA Master, PLC, RTU, and Physical Equipment]
The Spire System
• Spire is an intrusion-tolerant SCADA system for the power grid: it continues to work correctly even if some critical components have been compromised
• Intrusion tolerance as the core design principle:
  – Intrusion-tolerant network
  – Intrusion-tolerant consistent state
  – Intrusion-tolerant SCADA Master
• Open Source - http://dsn.jhu.edu/spire
Roadmap
• Demonstrating the problem: Red Team Experiment at Pacific Northwest National Labs (PNNL)
• Spire System
  – How it works
• Deployment Scenarios
  – Power Plant Deployment at Hawaiian Electric Company
  – Wide-Area Transmission Architecture
  – Immediate Next Steps
Red Team Experiment March 27 – April 7, 2017
DoD ESTCP Red Team Experiment
• DoD ESTCP experiment hosted at Pacific Northwest National Labs
• Evaluated a NIST-compliant commercial SCADA architecture and Spire
  – Each attacked by Sandia National Labs red team
DoD ESTCP Red Team Results
• NIST-compliant system completely taken over
  – MITM attack from enterprise network
  – Direct access to PLC from operational network
• Spire completely unaffected
  – Attacks in enterprise and operational network
  – Given complete access to a replica and code
  – Red team gave up after several days
Spire: How it works
Spire: Addressing System Compromises
• Byzantine Fault Tolerant Replication (BFT)
  – Correctly maintains state in the presence of compromises
  – 3f+1 replicas needed to tolerate up to f intrusions
  – 2f+1 connected correct replicas required to make progress
  – Prime protocol: latency guarantees under attack [ACKL11]
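The BFT replica counts above are simple functions of f. A minimal Python sketch of that arithmetic follows; the function names are illustrative and are not part of Spire or Prime.

```python
def bft_total_replicas(f: int) -> int:
    """Replicas needed to tolerate up to f intrusions (3f+1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def bft_progress_quorum(f: int) -> int:
    """Connected correct replicas required to make progress (2f+1)."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1


# Print the requirements for a few intrusion thresholds.
for f in range(1, 4):
    total = bft_total_replicas(f)
    quorum = bft_progress_quorum(f)
    print(f"f={f}: {total} replicas, {quorum} connected correct replicas to make progress")
```

With exactly 3f+1 replicas, of which f are compromised, only 2f+1 correct replicas remain, so every correct replica has to stay connected for the system to make progress.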
• What prevents an attacker from reusing the same exploit to compromise more than f replicas?
• Diversity
  – Present a different attack surface so that an adversary cannot exploit a single vulnerability to compromise all replicas
  – Multicompiler from UC Irvine [HNLBF13]
• What prevents an attacker from compromising more than f replicas over time?
• Proactive Recovery
  – Periodically rejuvenate replicas to a known good state to cleanse any potentially undetected intrusions
  – 3f+2k+1 replicas needed to simultaneously tolerate up to f intrusions and k recovering replicas [SBCNV10]
  – 2f+k+1 connected correct replicas required to make progress
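Proactive recovery changes the replica arithmetic to depend on both f and k. The sketch below uses a hypothetical `ReplicaConfig` helper (not a Spire type) to compute the two counts and to confirm that the worst case still leaves a progress quorum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaConfig:
    f: int  # intrusions tolerated simultaneously
    k: int  # replicas that may be recovering at the same time

    @property
    def total(self) -> int:
        """Replicas needed: 3f + 2k + 1."""
        return 3 * self.f + 2 * self.k + 1

    @property
    def quorum(self) -> int:
        """Connected correct replicas required to make progress: 2f + k + 1."""
        return 2 * self.f + self.k + 1

    def worst_case_available(self) -> int:
        """Replicas left when f are compromised and k are recovering."""
        return self.total - self.f - self.k


config = ReplicaConfig(f=1, k=1)
print(config.total, config.quorum, config.worst_case_available())
```

For f = 1 and k = 1 this gives 6 replicas with a quorum of 4. The worst case leaves exactly 2f+k+1 replicas, which is why no smaller total would do.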
The Spire System: Single Control Center
Figure 3.5: Intrusion-Tolerant SCADA system architecture for a single control center deployment with 6 replicas (f = 1, k = 1).
Six Spire replicas, monitoring and controlling three power grid scenarios (two distribution, one generation)
Hawaiian Electric Company Power Plant Deployment January 22 – February 2, 2018
DoD ESTCP Hawaiian Electric Company Deployment Setup
• Spire test deployment at HECO
  – “Mothballed” Honolulu plant
  – Managed small power topology, controlling 3 physical breakers via a Modbus PLC
• Deployment goals
  – Operate correctly in real environment without adverse effects
  – Meet performance requirements
DoD ESTCP Hawaiian Electric Company Deployment Results
• Ran continuously for 6 days without adverse effects on other plant systems
• Timing experiment using sensor to measure HMI reaction time showed that Spire met latency requirements
Wide-Area Transmission: Beyond a Single Control Center
Wide-Area Transmission Systems
[Figure: a control center (HMIs, primary SCADA Master and hot-standby SCADA Master on a LAN) connected over a wide-area network to PLCs and RTUs that control physical equipment in three substations]
• SCADA systems support large power grids with PLCs in many substations spanning hundreds of miles
• What happens if the control center is disconnected?
Wide-Area Transmission Systems
[Figure: a primary control center and a cold-backup control center, each with HMIs and primary and hot-standby SCADA Masters on a LAN, connected over a wide-area network to PLCs and RTUs that control physical equipment in the substations]
• Primary-Backup architecture is not sufficient to address malicious attacks
  – Cold-backup takes time to come online; hot-backup is subject to the “split-brain” problem
Resilient Wide-Area Architecture
• To protect against sophisticated network attacks, Spire supports multiple control sites
• But control sites are expensive…
  – Spire can operate with two control centers + additional sites that can be served by commodity data centers (that lack the ability to communicate with PLCs and RTUs in the field)
Resilient Wide-Area Architecture
• Successfully withstands: 1 intrusion, 1 disconnected site, 1 replica undergoing proactive recovery
[Figure: 12 SCADA Master (SM) replicas spread across Data Center 1, Data Center 2, Control Center 1, and Control Center 2; each control center also has an HMI; Spines links among the sites and to the RTUs and physical equipment in the substations]
• f = 1, k = 4
  – 3f + 2k + 1 = 12 total replicas
  – 2f + k + 1 = 7 correct replicas needed
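The numbers above can be checked against the failure scenario the slide names (1 intrusion, 1 disconnected site, 1 recovering replica). The sketch below assumes three replicas at each of the four sites, and reads k = 4 as one disconnected site of 3 replicas plus 1 recovering replica. That reading is an assumption consistent with the slide's numbers, not something the slide states, and the site keys are shorthand for the figure labels.

```python
from itertools import product

# Assumed layout: 3 replicas in each control center (CC) and data center (DC).
SITES = {"CC1": 3, "CC2": 3, "DC1": 3, "DC2": 3}
F = 1  # intrusions tolerated
K = 4  # assumption: 3 replicas in a disconnected site + 1 recovering replica

TOTAL = 3 * F + 2 * K + 1   # total replicas
QUORUM = 2 * F + K + 1      # connected correct replicas needed


def correct_connected(down_site: str, intruded_site: str, recovering_site: str) -> int:
    """Replicas that are connected, correct, and not recovering.

    Assumes the intruded replica and the recovering replica are distinct,
    which is the worst case.
    """
    lost = SITES[down_site]
    if intruded_site != down_site:
        lost += 1  # the intrusion hits a replica that is still connected
    if recovering_site != down_site:
        lost += 1  # the recovery removes a replica that is still connected
    return TOTAL - lost


# Try every placement of the disconnected site, intrusion, and recovery.
worst = min(correct_connected(d, i, r) for d, i, r in product(SITES, repeat=3))
print(f"total={TOTAL}, quorum={QUORUM}, worst case remaining={worst}")
```

In the worst case the intrusion and the recovery both fall in sites that are still connected, which leaves exactly the 7 replicas the system needs.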
Wide Area Update Latency Plot
• 30-hour wide-area deployment of configuration 3+3+3+3
• Control centers at JHU and SVG, data centers at WAS and NYC
• 10 emulated substations sending periodic updates
• 1.08 million updates (108K from each substation)
• Nearly 99.999% of updates delivered within 100 ms (56.5 ms average)
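Statistics like the ones above can be computed from per-update latency samples. The sketch below uses a hypothetical `latency_summary` helper, and the sample latencies are made up for illustration; they are not measurements from the deployment. Only the update totals come from the slide.

```python
from statistics import mean


def latency_summary(latencies_ms, deadline_ms=100.0):
    """Return (fraction delivered within the deadline, average latency in ms)."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    within = sum(1 for latency in latencies_ms if latency <= deadline_ms)
    return within / len(latencies_ms), mean(latencies_ms)


# Totals stated on the slide: 10 substations, 108K updates each.
substations = 10
updates_per_substation = 108_000
total_updates = substations * updates_per_substation

# Made-up samples, for illustration only.
samples = [48.0, 55.5, 61.2, 57.3, 104.9]
fraction, average = latency_summary(samples)
print(f"{total_updates} updates in total")
print(f"{fraction:.1%} of sample updates within 100 ms, {average:.2f} ms average")
```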
Wide Area: Latency Under Attack
• Targeted attacks designed to disrupt the system
• All combinations of site disconnection (due to network attack) + proactive recovery
• All combinations of intrusion + site disconnection (due to network attack) + proactive recovery
Deployment Plan: Next Steps
First Step: Network-Level Resilience
• Protects communication infrastructure
  – Standard, insecure protocols (e.g., Modbus, DNP3) limited to direct (cable) connection between proxy and device
• Accommodates existing system architecture and components
[Figure: Control Center (HMI, SCADA Master) and Substations (RTU, PLC, Physical Equipment), each attached through a Proxy to the Intrusion-Tolerant Communication Infrastructure]
Second Step: Full Intrusion-Tolerance
• Employs full power of Spire system, protecting against system intrusions/compromises as well as network attacks
  – Protects communication infrastructure (intrusion-tolerant network)
  – Protects SCADA Master (intrusion-tolerant system state)
• Requires substantial architectural changes
[Figure: 12 SCADA Master (SM) replicas spread across Data Center 1, Data Center 2, Control Center 1, and Control Center 2; each control center also has an HMI; Spines links among the sites and to the RTUs and physical equipment in the substations]
Resources
• Amy Babay: www.cs.pitt.edu/~babay/
• Yair Amir: www.cs.jhu.edu/~yairamir/
• JHU DSN Lab: www.dsn.jhu.edu
• Spread Concepts LLC: www.spreadconcepts.com
• Spire: www.dsn.jhu.edu/spire/
• Multicompiler: www.github.com/securesystemslab/multicompiler
• Papers:
  – Deploying Intrusion-Tolerant SCADA for the Power Grid: www.dsn.jhu.edu/papers/DSN_2019_SCADA_Experience.pdf
  – Toward an Intrusion-Tolerant Power Grid: Challenges and Opportunities: www.dsn.jhu.edu/papers/scada_ICDCS_2018.pdf
  – Network-Attack-Resilient Intrusion-Tolerant SCADA for the Power Grid: www.dsn.jhu.edu/papers/scada_DSN_2018.pdf
Backup Slides
DoD ESTCP Red Team Takeaways
• Today’s power grid is vulnerable
• There is a difference between current best practices and state-of-the-art research-based solutions
• Secure network setup using cloud expertise (protected the system for two days)
• Customized intrusion-tolerant protocols (defended the system in the presence of an intrusion on the third day)
Spire Operation
Figure 3.5: Intrusion-Tolerant SCADA system architecture for a single control center deployment with 6 replicas (f = 1, k = 1).
• Command flow
  – Operator button-click
  – Send to f+k+1 replicas
  – Agreement protocol execution
  – Threshold signature generation
  – Signed command delivery
  – Signed command validation, delivery and execution
• Status update flow
  – Status update generation
  – Send to f+k+1 replicas
  – Agreement protocol execution
  – Threshold signature generation
  – Signed command delivery
  – Signed command validation and display update
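The slides give "Send to f+k+1 replicas" without a rationale. One natural reading is a counting argument: at most f replicas are compromised and at most k are unavailable, so any set of f+k+1 replicas contains at least one that is correct and available. The sketch below checks that reading by brute force for the 6-replica configuration; the function name is illustrative and the code is not taken from Spire.

```python
from itertools import combinations


def always_reaches_correct_replica(n: int, f: int, k: int, targets_size: int) -> bool:
    """Check every scenario with f compromised and k unavailable replicas.

    Returns True only if every target set of the given size contains at
    least one replica that is neither compromised nor unavailable.
    """
    replicas = range(n)
    for targets in combinations(replicas, targets_size):
        for compromised in combinations(replicas, f):
            remaining = [r for r in replicas if r not in compromised]
            for unavailable in combinations(remaining, k):
                bad = set(compromised) | set(unavailable)
                if all(t in bad for t in targets):
                    return False  # this message could reach no correct replica
    return True


n, f, k = 6, 1, 1
print("f+k+1 targets:", always_reaches_correct_replica(n, f, k, f + k + 1))
print("f+k targets:  ", always_reaches_correct_replica(n, f, k, f + k))
```

Sending to only f+k replicas fails the check, because all of the chosen targets could be compromised or unavailable at once.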