Björn Rodén (roden@ae.ibm.com) http://www.ibm.com/systems/services/labservices/ http://www.ibm.com/systems/power/support/powercare/
Architecting HA and DR solutions, including PowerHA 7.1.3 (SE) Migration
© Copyright IBM Corporation 2014 Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM. 9.0
Session Objectives
• This session discusses considerations for designing and deploying highly and continuously available IT systems.
– The session starts by identifying business impact, risks and threats, and discusses KPIs and metrics such as RTO/RPO/MTTR, but focuses on designing and planning the deployment of an IT continuity design with verification and validation, and on maintaining the solution until decommission.
– It also covers how to leverage PowerHA SystemMirror 7.1 Standard Edition.
Objective: You will learn how to approach high availability solution design, planning and implementation.
Business challenges & needs
• Information management for business processes needs to:
– Ensure appropriate levels of service
– Manage risks (mitigate, ignore, transfer)
– Reduce cost (CAPEX/OPEX)
40% of companies that suffer a massive data loss will never reopen.¹
93% of companies that lost their data center for 10 days or more due to a disaster filed for bankruptcy within one year of the disaster.²
References: (1) "Disaster Recovery Plans and Systems Are Essential", Gartner Group, 2001; (2) US National Archives and Records Administration
Information management – availability challenge
The widening gap between demand and the ability to meet expectations through restore and restart:
– Demand side: more business on-line, higher end-user expectations, greater dependency on applications, increasing impact of information unavailability.
– Supply side: fewer independent application systems, smaller windows to restore or recover, less tolerance for downtime.
Business Continuity in IT perspective
Business Continuity – the ability to adapt and respond to risks as well as opportunities in order to maintain continuous business operations.
High Availability – the attribute of a system to provide service during defined periods, at acceptable or agreed-upon levels, and to mask unplanned outages.
Disaster Recovery – the capability to recover a data center at a different site if the primary site becomes inoperable.
Continuous Operations – the attribute of a system to operate continuously and mask planned outages.
IT Availability Life cycle
Architecture, solution design, deployment, governance, system maintenance and change management, skill building, migration and decommissioning …
A lot to analyze, plan, do and check…
DESIGN > BUILD > OPERATE > REPLACE
What protection is the solution expected to provide? (Protect against data loss or corruption at every tier.)
– Single system failure → High Availability: human error, software error, component failures, single system failures.
– Local disaster → Metro distance recovery: human error, electric grid failure, HVAC or power failures, burst water pipe, building fire, architectural failures, gas explosion, terrorist attack.
– Regional disaster → Global distance recovery and compliance: electric grid failure, floods, hurricanes, earthquakes, tornados, tsunamis, war.
Balance business impact vs. solution costs
– Consider the whole solution lifecycle.
– Weigh downtime costs (business impact) against solution costs (CAPEX/OPEX) over the business recovery time; needs, requirements and acceptable risk determine where the total cost balance¹ lies.
(1) Quick Total Cost Balance (TCB) = TCO (or TCA) + business downtime costs.
Expectations vs Requirements vs Interpretations
– Proprietary: RTO, RPO, MTTR, Degree of Availability, …
– Open Source (+ a Chef…): 6-7 (500 gram) beetroots, 2-3 onions, 1-2 carrots, 250 grams of cabbage, 2-3 tbsp butter, 1 1/2 liters of bouillon, 1 pinch black pepper, 1-2 laurel leaves, 1-2 tbsp vinegar, some salt, sour cream or equivalent for serving. How to: chop, cut, crush, boil, stir, serve.
IT service implementation process example
Failure to verify non-functional AVAILABILITY requirements can create exposures – you will only find out whether it works when you need it.
Brief systematic approach to IT services continuity with an Availability governance focus:
1. Identify critical business processes (from BIA/BCP)
2. Identify risks & threats (from BIA/BCP)
3. Identify business impacts & costs (from BIA/BCP)
4. Identify/decide acceptable levels of service, risk and cost (from BIA/BCP)
---------------------------------------------------------------------------------------------
5. Define availability categories and classify business applications according to the business impact of unavailability
6. Architect the Availability infrastructure
7. Design the solution from the Availability architecture
8. Plan the Availability solution implementation
9. Build the Availability solution
10. Verify the Availability solution
11. Operate and maintain the deployed Availability solution
---------------------------------------------------------------------------------------------
12. Validate the Availability solution SLO, implementation, design and architecture
13. Decommission/Migrate/Replace
BIA – Business Impact Analysis; BCP – Business Continuity Plan; SLO – Service Level Objectives
Get the ducks in a row
• Know why
– Business and regulatory requirements
– Services, risks, costs
– Key Performance Indicators (KPIs)
• Understand how
– Architect, design, plan
• Can implement
– Build, verify, deploy, skill-up
• Will govern
– Service and availability management
– Change, incident and problem management
– Security and performance management
– Capacity planning
– Migrate, replace and decommission
Key IT Availability Metrics
What are your key Availability Requirements?
Recovery Time Objective (RTO) – How long can you afford to be without your systems?
Recovery Point Objective (RPO) – How much data can you afford to recreate or lose?
Maximum Time To Restart/Recover (MTTR) – How long until services are restored for the users?
Degree of Availability (coverage requirement) – For what percentage of a given time period should the business service be available?
Notes on Degree of Availability
• IT service availability can be measured as the percentage of a given time period during which the business service is available for its intended purpose.
– Usually expressed as a number of nines (9) over a year (rounded):
• 99% => 88 hours/year
• 99.9% => 9 hours/year
• 99.95% => 4 1/2 hours/year
• 99.99% => 52 min/year
• 99.999% => 5 min/year
• 99.9999% => 1/2 min/year
• IT system vs. IT service (ripple effect)
– e.g. an IT service dependent on five IT systems, where every individual target level is met but not at the same time:
• (99.9 * 99.9 * 99.5 * 99.5 * 99.0) / 100^5 => 97.82%, i.e. 191-192 h/period of total allowed downtime
• MIN(99.9, 99.9, 99.5, 99.5, 99.0) => 99.0%, or 88 h/period, is the best the composite service can achieve
• Determine the time period for the degree of availability
– Is time for planned maintenance excluded during the year?
• Such as planned service windows and/or a fixed number of days per month/quarter
– How many hours are used per year?
• Calendar year hours: 8760 h for 365-day non-leap years, 8784 h for 366-day leap years
• Decided amount of time per year (for global coverage across 24 time zones, add one day):
– 365 days (non-leap) + 24 h => 366 days or 8784 h
– 366 days (leap) + 24 h => 367 days or 8808 h
Common Availability and Disaster requirements
High Availability
• RPO – zero (or near zero) data loss
• RTO – measured in minutes at the most
• NRO – zero
• PRO – zero (from UPS & generator)
• Coverage requirement (e.g. 24x7 / 24x365)
• Degree of availability (e.g. 99.9% or ~9 h/year)
• No single point of failure (SPOF) – system level
• Geographic affinity (metro distance)
• Automatic failover/continuance/recovery to redundant components, including application components – up to in-flight transaction integrity
Disaster Tolerance
• RPO – near zero data loss (may require manual recovery of orphaned data)
• RTO/NRO – measured in hours, days or weeks
• PRO – depends on generator fuel storage
• Maximum Tolerable Period of Degraded Operations
• Maximum Time To Restart/Recover (MTTR)
• Business Process Recovery Objective (BPRO)
• No single point of failure (SPOF) – data center level
• Geographic dispersion (global distance)
• Declaring disaster is a management decision
• Rotating site swap or periodic site swap
• Full or partial swap
Recovery objectives timeline (example): checkpoint in time → RPO → outage → minimum service delivery → system repair → service delivery at 100% → new business (RTO).
PRO – Power Recovery Objective; NRO – Network Recovery Objective; DOT – Degraded Operations Tolerance
Identify Points of Failure
Review your Availability Architecture
• Is the Availability Architecture still in place?
– Or might it have been altered when performing changes to: servers, storage, networks, data centres, software upgrades, IT service management, staffing, external suppliers and vendors?
– Assumption: the longer an IT environment is exposed to opportunities for human error, the greater the risk of deviation between reality (facts on the ground) and the Availability Architecture (the map).
– Key areas: redundancy and single points of failure (SPOF); communication flow and server service dependencies; local area network and storage area network cabling; application, system software and firmware currency; staff attrition, mobility and cross-skill focus.
Identify critical IT resources – information flow perspective
Business process information flow: information-providing systems (which the core systems depend on) → core systems → information-receiving systems (which need the core systems).
DON'T FORGET: the buffer time and degree of availability of each providing and receiving system, as well as the degree of availability of the core systems themselves.
Identify critical IT resources – deployment connectivity perspective
• Protocols (colour-coded in the original diagram): RMI/IIOP, HTTP/HTTPS, CIFS, NFS, LPD/IPP, MQ, DB2, JDBC, Java serializing
Redundancy and Single Points of Failure (SPOF) – find the SPOF at every layer:
– Enterprise environment: MAN/WAN interconnects between sites
– Site environment: UPS and generators, local area network, storage area network
– Data centre environment: servers, storage, SAN
– Server: application, middleware, operating system & system software, logical/virtual machine, kernel stack, hypervisor, physical machine, hardware (cores, cache, nest), network and storage connectivity
Redundancy and Single Points of Failure (SPOF) – find the SPOF along the end-to-end network path:
ISP (external) → firewall/IPS → routers → network switches → servers → SAN switches → storage.
Redundancy and Single Points of Failure (SPOF) – find the SPOF.
Your major goal throughout the planning process is to eliminate single points of failure and verify redundancy.
A single point of failure exists when a critical service function is provided by a single component. If that component fails, the service has no other way of providing that function, and the application or service dependent on that component becomes unavailable.
http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.powerha.plangd/ha_plan_over_ppg.htm
Application and data resiliency examples
• Application
– Application restart after node failure (physical/virtual): active/standby (automatic/manual)
– Application concurrency (cluster horizontal scaling): active/active (separate or shared transaction tracking)
• Data
– Single site, single or dual storage: storage based, controlled by the host (HyperSwap); host based (LVM mirroring/GPFS); database based (transaction replication)
– Dual site, dual storage: storage based (Metro/Global Mirror); host based (GLVM/GPFS); database based (transaction replication)
Calculation examples for redundancy
– Vital business function/process depending on three systems (Availability = uptime / (uptime + downtime)):
• Uptime/year: 24*365 = 8760 h
• System #1: downtime 12+(4*4) = 28 h → 8760/(8760+28) = 0.9968 (incident + planned service window, 1/quarter)
• System #2: downtime 12*2 = 24 h → 8760/(8760+24) = 0.9973 (planned service window, 1/month)
• System #3: downtime 2*3 = 6 h → 8760/(8760+6) = 0.9993 (planned service window, 2/year)
• End-to-end availability: 0.9968 * 0.9973 * 0.9993 = 0.9934, i.e. 99.34%
• Can be used as a baseline for improvement
– Estimated failure rate for continuously used disks (AFR = hours per year * number of disks / MTBF):
• Uptime/year: 24*365 = 8760 h
• Storage system #1: 14*7 = 98 disks; storage system #2: 14*7*4 = 392 disks; 490 disks in total
• MTBF: 300,000 h
• Estimated disk failures per year: 8760/300,000 * 490 ≈ 14 disks might fail during one (1) year
• Can be used to increase awareness and motivate a RAID configuration
– Reliability indicator example (Reliability = MTBF / (MTBF + MTTR + ALDT)):
• MTBF 300,000 h, MTTR 20 h (here including total service downtime with Administrative or Logistic Downtime, ALDT): 300,000/(300,000+20) * 100 = 99.99%
• MTBF 100,000 h, MTTR 1 h: 100,000/(100,000+1) * 100 = 99.999%
• Can be used to illustrate how difficult it is to obtain 100%
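The same arithmetic can be checked quickly with bc on any AIX node; the figures below are the slide's example values (490 disks, 300,000 h MTBF, 20 h MTTR):
echo "scale=1; (8760 * 490) / 300000" | bc          # ~14.3 expected disk failures/year
echo "scale=2; 300000 * 100 / (300000 + 20)" | bc   # 99.99 % reliability indicator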
PowerHA SystemMirror
PowerHA SystemMirror Edition basics
• PowerHA SystemMirror for AIX Standard Edition – cluster management for the data center
– Monitors, detects and reacts to events
– Multiple heartbeat channels between the systems: network, SAN, central repository
– Enables automatic switch-over
– SAN shared-storage clustering
– Smart Assists (HA agent support – discover, configure and manage)
– Resource group management with advanced relationships
– Support for custom resource management
– Out-of-the-box support for DB2, WebSphere, Oracle, SAP, TSM, LDAP, IBM HTTP Server, etc.
• PowerHA SystemMirror for AIX Enterprise Edition – cluster management for the enterprise (disaster tolerance)
– Multi-site cluster management
– Automated or manual confirmation of swap-over
– Third-site tie-breaker support
– Separate storage synchronization: Metro Mirror, Global Mirror, GLVM, HyperSwap with DS8800 (<100 km)
PowerHA SystemMirror support
• PowerHA 6.1 End of Support (EOS): 30-Apr-2015 (extended from 30-Sep-2014)
– End of Support (EOS) is the last date on which IBM will deliver standard support services for a given version/release of a product.
– Any further service support extensions will be published at: http://www-01.ibm.com/software/support/aix/lifecycle/index.html
Eliminating SPOFs by using redundant components (cluster component – how to eliminate it as a single point of failure – what PowerHA SystemMirror supports):
• Nodes – use multiple nodes – up to 16.
• Power sources – use multiple circuits or uninterruptible power supplies – as many as needed.
• Networks – use multiple networks to connect nodes – up to 48.
• Network interfaces, devices, and labels – use redundant network adapters – up to 256.
• TCP/IP subsystems – use networks to connect adjoining nodes and clients – as many as needed.
• Disk adapters – use redundant disk adapters – as many as needed.
• Controllers – use redundant disk controllers – as many as needed.
• Disks – use redundant hardware and disk mirroring, striping, or both – as many as needed.
• Applications – assign a node for application takeover, configure an application monitor, and configure clusters with nodes at more than one site – flexible configuration policies for high availability within a site and between sites.
• Sites – use more than one site for disaster recovery – up to two sites.
• Resource groups – use resource groups to specify how a set of entities should perform – up to 64 per cluster.
• Cluster resources – use multiple cluster resources – up to 128 for the clinfo daemon (more can exist).
• Virtual I/O Server (VIOS) – use redundant VIOS – as many as needed.
• HMC – use redundant HMCs – up to 2.
• Managed system hosting a cluster node – use separate managed systems for each cluster node – up to 16.
• Cluster repository disk – use RAID protection – one active repository disk per site, with the ability to replace the disk after a failure; you must have a spare disk available to replace a failed repository disk in the live cluster.
http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.powerha.plangd/ha_plan_eliminate_spf.htm
Some key changes in PowerHA 7.1 vs. 6.1
• Architectural changes from PowerHA 6.1 (CAA/RSCT, heartbeating, resource groups)
– PowerHA 7.1 is built on Cluster Aware AIX (CAA) functionality, which provides fundamental clustering capabilities in the base operating system; PowerHA 6.1 uses Reliable Scalable Clustering Technology (RSCT) as its clustering framework.
• PowerHA 7.1.3 requires AIX 6.1 TL9 SP1 or AIX 7.1 TL3 SP1
– http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101347
• Cluster Aware AIX (CAA) manages the heartbeats, not RSCT
– CAA uses a repository disk to store configuration information persistently; the disk must be shared by all cluster nodes.
• Event management is handled through the AIX pseudo-file-system architecture Autonomic Health Advisor File System (AHAFS), not the cluster manager and RSCT.
• IP multicast with a gossip protocol in 7.1 replaced unicast UDP IP in 6.1
– With 7.1.3, unicast (TCP) is the default option in addition to multicast.
• Non-IP networks – diskhb, mndhb, rs232 etc. – removed.
• No IPAT via replacement (hardware address takeover, HWAT).
• Restrictions on changing the hostname
– The Communication Path to a node can be set from 7.1.2 (IP address mapping to the hostname).
– Eased further in 7.1.3 (capability to dynamically modify the hostname of a clustered node).
• Smart Assist technology improved and extended.
• Graphical Cluster Simulator with 7.1.3
– Based on the PowerHA ISD plug-in; a saved XML configuration can be deployed.
• WebSMIT and the 2-node cluster assistant removed; ISD plug-in introduced (not PowerVC).
• Priority override location (POL) is not used, and persistence after reboot is not retained.
http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.powerha.insgd/ha_install_priority_override.htm
Basic implementation flow, PowerHA 7.1 vs. 6.1
PowerHA 6.1
1. Plan for network, storage, and application – eliminate single points of failure.
2. Define and configure the infrastructure – application planning and start/stop scripts; networks (IP interfaces, /etc/hosts, non-IP devices); storage (adapters, LVM volume group, filesystem).
3. Install the PowerHA filesets.
4. Configure the PowerHA environment:
– Topology: cluster, node names, PowerHA IP and non-IP networks.
– Resources, resource group, attributes: resources (application server, service label, volume group); resource group (identify name, nodes, policies); add attributes (application server, service label, VG, filesystem).
5. Synchronize, save configuration (snapshot).
6. Start/stop cluster services.
7. Verify, test configuration.
PowerHA 7.1
1. Plan for network, storage, and application – eliminate single points of failure.
2. Define and configure the infrastructure – application planning and start/stop scripts; networks (IP interfaces, /etc/hosts, non-IP devices); storage (adapters, LVM volume group, filesystem).
3. Install the PowerHA filesets.
4. Configure the PowerHA environment:
– Topology: cluster, node names, PowerHA IP networks, repository disk and SFWcomm; multicast or unicast network for heartbeat; Cluster Aware AIX (CAA) cluster.
– Resources, resource group, attributes: resources (application server, service label, volume group); resource group (identify name, nodes, policies); add attributes (application server, service label, VG, filesystem).
5. Synchronize, save configuration (snapshot).
6. Start/stop cluster services.
7. Verify, test configuration.
Configure PowerHA 7.1 vs. 6.1
PowerHA SE 6.1
1. Clear previous cluster configuration
2. Configure cluster netmon.cf and (optional) rhosts files
3. (optional) Customize cluster communication
4. Create cluster definition
5. Create node definitions
6. Create LAN network definitions
7. Create SAN disk heartbeat network definitions
8. Add boot IP address definitions
9. Add heartbeat disk definitions
10. Add cluster service IP address definitions
11. Create cluster resource group definitions
12. Create cluster application server definitions
13. Customize cluster application server monitoring definitions
14. Customize cluster resource group definitions
15. Verify and synchronize cluster configuration
PowerHA SE 7.1
1. Clear previous cluster configuration
2. Verify the shared repository disk between cluster nodes
3. Configure cluster netmon.cf and rhosts files (AIX 6.1.6)
4. (optional) Customize cluster communication
5. Create cluster definition
6. Create node definitions (will create node and LAN definitions automatically)
7. Configure repository disk and IP multicast ID (unicast)
8. (optional) Configure FC adapters for target mode and zone FC adapter WWPNs for SFWcomm, and virtual Ethernet if VIOS
9. Verify and synchronize cluster configuration
10. Add cluster service IP address definitions
11. Create cluster resource group definitions
12. Create cluster application server definitions
13. Customize cluster application server monitoring definitions
14. Customize cluster resource group definitions
15. Verify and synchronize cluster configuration
PowerHA 7.1.3 has additional Smart Assists and a new Smart Assist framework. A clmgr-based sketch of the 7.1 flow follows below.
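For reference, most of the 7.1 steps above can also be scripted with the clmgr command line instead of SMIT. The sketch below is hypothetical: node, disk, script, network and resource-group names are examples, and exact clmgr object classes and attributes vary by release (check clmgr's built-in help before use).
clmgr add cluster demo_cl NODES=ha1,ha2 REPOSITORY=hdisk2                        # cluster + CAA repository
clmgr add service_ip app_svc NETWORK=net_ether_01                                # service IP label
clmgr add application_controller app1 STARTSCRIPT=/ha/start.sh STOPSCRIPT=/ha/stop.sh
clmgr add resource_group rg1 NODES=ha1,ha2 SERVICE_LABEL=app_svc VOLUME_GROUP=datavg APPLICATIONS=app1
clmgr verify cluster && clmgr sync cluster                                       # verify and synchronize
clmgr online cluster WHEN=now                                                    # start cluster services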
PowerHA 7.1 with dual node, single/dual site – baseline
– Ordinary run-of-the-mill dual node cluster (two LPARs, HA1 and HA2, LVM mirroring across single or dual enterprise storage)
– Use mirror pools for the LVM mirroring
– Single virtual Ethernet adapter per node, backed by the same VIOS SEA LAGG
– Set "Communication Path to Node" to the network interface carrying the cluster node's hostname
• If the cluster node (partition) has multiple virtual Ethernet adapters, set the "Communication Path to Node" to the IP address and virtual Ethernet network interface device that maps to the hostname
– netmon.cf configured to ping outside the box from the partition (cluster file): /usr/es/sbin/cluster/netmon.cf
– rhosts configured with the cluster nodes (cluster file): /etc/cluster/rhosts
– netsvc.conf configured with DNS (system file): /etc/netsvc.conf
– Single or dual SAN fabric
– Single LAN with ISL; if dual sites, use VLAN spanning
– If dual sites, keep them within a few km distance for minimal latency and throughput degradation
See the sketch of the three configuration files after this list.
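A hypothetical sketch of those three files for a two-node cluster (hostnames and addresses are examples only):
# /etc/cluster/rhosts – one cluster node hostname or IP address per line
ha1.example.com
ha2.example.com
# /usr/es/sbin/cluster/netmon.cf – ping targets outside the box, e.g. the default gateway
10.1.1.1
# /etc/netsvc.conf – resolve locally first, then DNS
hosts = local, bind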
PowerHA 7.1 with dual node, single/dual site – multicast between nodes
• Multicast is optional from 7.1.3; the default with 7.1.3 is TCP unicast.
• The multicast IP can be set manually, or CAA will assign one based on the node's lower 24-bit IP address behind an upper 8-bit multicast prefix of 228, e.g. 192.1.2.3 => 228.1.2.3.
• If desired, verify that multicast is working between the nodes before creating the 7.1 cluster:
– Check the assigned multicast IP: lscluster -i | grep -i multi
– Test with the mping command: start the receiver first (mping -r -c 100), then the sender (mping -s -c 100); use the -a <multicastip> flag to set the multicast address used by mping.
http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.powerha.trgd/ha_trgd_test_multicast.htm
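A minimal sketch of that verification on the two example nodes (the multicast address 228.1.2.3 is only an illustration; use the one shown by lscluster):
lscluster -i | grep -i multi        # show the multicast address CAA assigned
mping -r -c 100 -a 228.1.2.3        # on node HA1: receive 100 test packets
mping -s -c 100 -a 228.1.2.3        # on node HA2: send 100 test packets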
PowerHA 7.1 with dual node, single/dual site – repository disk
• Must be accessible from all nodes and all paths
• Raw disk device driver I/O; direct access by CAA only
• Minimum 512 MB and no larger than 460 GB; around 10 GB is a reasonable size
• Define a spare for the repository disk
• Do not manually write to the repository disk!
• Check repository disk status:
– clras lsrepos
– clras dumprepos
– clras dumprepos -r <reposdisk>
– clras dpcomm_status
http://publib.boulder.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.clusteraware/claware_repository.htm
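A hedged sketch of checking the repository disk from one node; the /usr/lib/cluster path for clras is an assumption that may differ by AIX level:
lspv | grep caavg_private              # the repository disk belongs to the CAA private VG
lscluster -d                           # CAA view of cluster disks, including the repository
/usr/lib/cluster/clras lsrepos         # list repository disk(s) known to CAA
/usr/lib/cluster/clras dumprepos       # read-only dump of the repository contents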
PowerHA 7.1 with dual node, single/dual site – storage framework (SFWcomm)
• Fibre Channel adapters with target mode support only
– Attribute on fcsX: tme=yes
– Attributes on fscsiX: dyntrk=yes and fc_err_recov=fast_fail
– Enable the new settings, for example through a reboot
• All physical FC adapter WWPNs zoned (TM zone)
– One fabric is supported with SFWcomm; a dual fabric is supposed to work, and if it does not work with your implementation and system software levels, please open a PMR with IBM Support
• LPM does not migrate the SFWcomm configuration
– It is recommended that SAN communication be reconfigured after LPM is performed
• Datalink-layer communication over a VLAN between the AIX cluster node and the VIOS holding the physical FC adapters
• Check SFWcomm status:
– lscluster -i
– sfwinfo -a
– clras sancomm_status
http://publib.boulder.ibm.com/infocenter/aix/v7r1/index.jsp?topic=/com.ibm.aix.clusteraware/claware_comm_setup.htm
http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.powerha.concepts/ha_concepts_ex_san.htm
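A minimal sketch of enabling target mode on one node, assuming the adapter is fcs0/fscsi0 (names are examples; -P defers the attribute change until the devices are reconfigured or the node is rebooted):
chdev -l fcs0   -a tme=yes -P                                # enable target mode on the FC adapter
chdev -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail -P   # recommended fscsi attributes
shutdown -Fr                                                 # activate the deferred changes
lscluster -i | grep -p sfwcom                                # afterwards: check the sfwcom interface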
PowerHA IP heartbeating over VIOS SEA
• Network heartbeating is used as a reliable means of monitoring an adapter's state over a long period of time.
• When heartbeating is broken, a decision has to be made as to whether the local adapter has gone bad or the neighbor (or something between them) has a problem.
– The local node only needs to take action if the local adapter is the problem; if its own adapter is good, it is assumed to still be reachable by other clients regardless of the neighbor's state (the neighbor is responsible for acting on its own local adapter failures).
– This decision (local vs. remote bad) is made based on whether any network traffic can be seen on the local adapter, using the inbound byte count of the interface.
– Where virtual Ethernet is involved, this test becomes unreliable, since there is no way to distinguish whether inbound traffic came in from the VIO server's connection to the outside world or just from a neighbouring VIO client (it is a design point of VIO that its virtual adapters be indistinguishable from a real adapter to the LPAR).
• For PowerHA 6.1, use the netmon.cf facility that is part of the cluster.* filesets (RSCT Topology Services).
• For PowerHA 7.1, use netmon.cf with RSCT Group Services; for releases before 7.1.3 it is part of rsct.basic.* PTF U852423 or U851850.
– Install the PTF; refer to APAR: http://www-01.ibm.com/support/docview.wss?uid=isg1IV14422
– Configure netmon.cf according to APAR: http://www-01.ibm.com/support/docview.wss?uid=isg1IZ01331
Cluster Topology Configuration – the netmon.cf facility
Without this facility, network link and network switch failures will not be properly detected by the cluster node.
• For single-adapter PowerHA network adapters, use the netmon.cf configuration file: /usr/es/sbin/cluster/netmon.cf
• When netmon needs to stimulate the network to ensure adapter function, it sends ICMP ECHO requests to each IP address. After sending the request to every address, netmon checks the inbound packet count before determining whether an adapter has failed or not.
• Specify remote hosts that are not in the cluster configuration, that can be accessed from PowerHA interfaces, and that reply consistently to ICMP ECHO without delay, such as default gateways and equivalent.
• Up to 32 different targets can be provided for each interface; if *any* given target is pingable, the adapter will be considered up (ICMP ECHO).
Format: !REQD <owner> <target>
!REQD : an explicit string; it *must* be at the beginning of the line (no leading spaces).
<owner> : the interface this line is intended to be used by; that is, the code monitoring the adapter specified here will determine its own up/down status by whether it can ping any of the targets specified in these lines. The owner can be specified as a hostname, IP address, or interface name. In the case of a hostname or IP address, it *must* refer to the boot name/IP (no service aliases). In the case of a hostname, it must be resolvable to an IP address or the line will be ignored. The string "!ALL" specifies all adapters.
<target> : the IP address or hostname you want the owner to try to ping. As with normal netmon.cf entries, a hostname target must be resolvable to an IP address in order to be usable.
http://www-01.ibm.com/support/docview.wss?uid=isg1IZ01331
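A hypothetical netmon.cf for a node whose only PowerHA interface is en0; the gateway addresses are examples, chosen as stable targets outside the cluster that answer ICMP ECHO promptly:
# /usr/es/sbin/cluster/netmon.cf
!REQD en0 10.1.1.1
!REQD en0 10.1.1.2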
Cluster partitioning, aka node isolation or "split brain" 1/2
• Cluster partitioning, also known as node isolation or "split brain", is a failure situation where more than one server acts as a primary.
– Partitioning occurs when a cluster node stops receiving all interconnecting heartbeat traffic from its peer node and assumes that the peer node has failed.
– Due to the lack of synchronization, a split-brain situation is problematic and can cause undesirable behaviour, such as data corruption.
– Once the peer node is determined to be down due to lack of heartbeats, the nodes on each side of the cluster attempt to take over resources (if so configured) from a node that is actually still active and running.
– When the interconnection is restored and heartbeats resume, the cluster will merge; at this point the cluster manager identifies that a partitioning has occurred, and the cluster node with the highest node number stops itself immediately.
– During partitioning, if both nodes have acquired their respective peer node's resource groups and have had applications running, with users connected and updating data for the same application on both nodes separately, data integrity is lost.
Cluster partitioning, aka node isolation or "split brain" 2/2
Common approaches regarding cluster partitioning:
– Maximize independent interconnects between sites
• Use multiple IP and non-IP interconnects for cluster node heartbeats, with all physical links provided separately and well isolated from failing at the same time, such as:
– Dual IP networks (LAN), each over separate physical adapters and network switches, and interconnection between the cluster node sites.
– Dual non-IP networks (SAN), each over separate physical adapters and network switches, and interconnection between the cluster node sites.
– Consider a third network interconnect for heartbeat only between the nodes; for example, if the primary interconnections between nodes/sites use DWDM, use a non-land-based link or a VPN over an ISP connection.
– Use a third site as tie breaker
• Use the "tie-breaker" disk/node/service concept, where a third-site disk/node/service is used to choose the surviving partition. For PowerHA, please refer to:
– http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=%2Fcom.ibm.aix.powerha.admngd%2Fha_admin_mergesplit_policy_713.htm
• Optimally, also use a separate physical interconnect from each cluster node site to the third site.
– Classify a node failure as a site-down event and/or have the secondary site started by an operator
• The active site declares itself down and expects the secondary site to take over the failed services; the secondary site takes over services if communication to the active site is lost.
• The active site declares itself down, and the secondary site is started by an operator.
– Accept as-is
• Decide that the risk of partitioning occurring is unlikely and the cost of redundancy too high, and accept longer downtime, relying on backup restore in case of data inconsistency.
NOTE: External access to nodes can still be available at the primary site even if the site interconnects fail.
PowerHA/EE Merge/Split Policy Options
Policy settings (apply to both split and merge processing):
• Majority Rule – the side with more than N/2 nodes wins (N = total nodes in the cluster); in case of a tie, the side with the smallest node ID wins.
• Tie Breaker – the side holding the tie breaker wins.
• Manual – operator intervention is enabled for split/merge processing.
• Tie breaker policy – a means of determining the winner when a split-site condition occurs; the losing side is quiesced.
• Tie breaker policy options: majority rules (the site with the largest number of nodes wins), SCSI-2 or SCSI-3 tie-breaker reservation disk (the first one to reserve it wins), or operator intervention (the operator decides).
• Default merge policy: majority rule.
Manual (operator-controlled failover)
• Split/merge policies
– Administrator prompts: the cluster will wait for administrator input
– Optional policy: after N prompts, allow auto-recovery
– Custom action scripts can be invoked at the time of a split or merge as well
• Defaults
– Number of prompts (N) = infinite
– Interval between notifications: once every 30 seconds, then increasing in frequency
– Auto-recovery after N prompts
(Diagram: site down vs. cluster split.)
© Copyright IBM Corporation 2014
44
Migrating to PowerHA 7.1.3
Migration process to PowerHA 7.1.3 from 6.1
1. Verify the current PowerHA 6.1 availability functionality – run cluster verification and make sure no errors are reported.
2. Verify the PowerHA 7.1 preconditions, heartbeat networks and SPOFs.
3. AIX upgrade – upgrade all nodes in the cluster to AIX 6.1 TL9 SP1 or AIX 7.1 TL3 SP1 or higher.
• Leverage alt-disk install and rotate one node at a time: http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.install/doc/insgdrf/alt_disk_migration.htm
4. Migrate the PowerHA 6.1 cluster using one of:
– Rolling migration: you can upgrade a PowerHA cluster while keeping your applications running and available; during the upgrade process, the new version of the software is installed on each cluster node while the remaining nodes continue to run the earlier version.
– Snapshot upgrade: this type of migration involves bringing down the entire PowerHA cluster, reconfiguring the snapshot configuration, installing the new PowerHA and restarting cluster services one node at a time.
– Offline upgrade: this type of migration involves bringing down the entire PowerHA cluster, reconfiguring the active cluster to fit, installing the new PowerHA and restarting cluster services one node at a time.
– New install and configure: design and install the PowerHA cluster from scratch.
5. Verify cluster and high availability functionality – cluster system functionality tests, component failure tests, failure scenario tests.
Todo before migration
• Software levels for currency
– Upgrade AIX and RSCT to supported levels, and ensure that the same level of cluster software (including PTFs) is on all nodes before beginning a migration. For PowerHA 7.1.3 that means AIX 6.1 TL9 SP1 or AIX 7.1 TL3 SP1 with RSCT 3.1.2 or later; the minimum levels per release are:
• PowerHA 7.1.0: AIX 6.1 TL6 (or later) or AIX 7.1, RSCT 3.1.0.0 or higher
• PowerHA 7.1.1: AIX 6.1 TL7 SP2 or AIX 7.1 TL1 SP2, RSCT 3.1.2.0 or higher
• PowerHA 7.1.2: AIX 6.1 TL8 SP1 or AIX 7.1 TL2 SP1, RSCT 3.1.2.0 or higher
• PowerHA 7.1.3: AIX 6.1 TL9 SP1 or AIX 7.1 TL3 SP1, RSCT 3.1.2.0 or higher
– Ensure that the PowerHA cluster software is committed (not just applied)
– When performing a rolling migration, all nodes in the cluster must be upgraded to the new base release before applying any updates for that release
• Run cluster verification and make sure no errors are reported
• Take a snapshot of the cluster configuration
• Backup and mksysb
• Use the /usr/sbin/clmigcheck tool
• The "Communication Path to Node" on the PowerHA cluster nodes must be set to an IP address mapping to the hostname. All cluster node hostnames must be resolved locally using the /etc/hosts file (IP address and label); use netsvc.conf, irs.conf or NSORDER in /etc/environment to set the order. Pre-7.1.3: after you have synchronized the initial cluster configuration, it is not supported to change the hostname or the IP resolution of the hostname.
http://www-01.ibm.com/support/knowledgecenter/SSPHQG_7.1.0/com.ibm.powerha.insgd/ha_install_required_aix.htm
Todo before migration (continued)
• Verify cluster conditions and settings
– Use clstat to review the cluster state and make certain that the cluster is in a stable state
– Review the /etc/hosts file on each node to make certain it is correct
– Review the /etc/netsvc.conf (or equivalent) file on each node to make certain it is correct
– After AIX 6.1.6 or later is installed, enter the fully qualified host name of every node in the cluster in the /etc/cluster/rhosts file
• Take a snapshot of the cluster configuration and save off customized scripts, such as start, stop, monitor and event script files
• Remove configurations which cannot be migrated
– Configurations with IPAT via replacement or hardware address takeover (MAC address)
– Configurations with heartbeat via IP aliasing
– Configurations with non-IP networking, such as RS232, TMSCSI/SSA, DISKHB or MNDHB
– Configurations which use anything other than Ethernet for network communication, such as FDDI, ATM, X.25 or Token Ring
– Note that clmigcheck does not flag an error if a DISKHB network is found; the PowerHA migration utility automatically takes care of removing that network
• SAN storage for the repository disk and target mode
– The repository is stored on a disk that must be SAN attached and zoned to be shared by every node in the cluster, and only the nodes in the cluster – and it must not be part of a volume group
– SAN zoning of FC adapter WWPNs for target mode communication
• Multicast IP address for the monitoring technology (optional)
– You can explicitly specify multicast addresses, or one will be assigned by CAA
– Ensure that multicast communication is functional in your network topology before migration
– Note that from PowerHA 7.1.3, unicast is the default
clmigcheck tool (1/2)
The clmigcheck tool is part of base AIX from 6.1 TL6 or 7.1 (/usr/sbin/clmigcheck).
• An interactive tool that verifies the current cluster configuration, checks for unsupported elements, and collects additional information required for migration
• Saves the migration check output to the file /tmp/clmigcheck/clmigcheck.log
• You must run this command on all cluster nodes, one node at a time, before installing PowerHA 7.1.3
• When the clmigcheck command is run on the last node of the cluster before installing PowerHA 7.1.3, the CAA infrastructure will be started (check with the lscluster -m command)
----------[PowerHA System Mirror Migration Check]-------------
Please select one of the following options:
1 = Check ODM configuration.
2 = Check snapshot configuration.
3 = Enter repository disk and multicast IP addresses.
Select one of the above, "x" to exit or "h" for help:
clmigcheck tool (2/2)
• Option 1
– Checks the configuration data (/etc/es/objrepos) and provides errors and warnings if there are any elements in the configuration that must be removed manually.
– In that case, the flagged elements must be removed, the cluster configuration verified and synchronized, and clmigcheck rerun until the configuration data check completes without errors.
• Option 2
– Checks a snapshot (present in /usr/es/sbin/cluster/snapshots) and provides error information if there are any elements in the configuration that will not migrate.
– Errors when checking the snapshot indicate that the snapshot cannot be used as-is for migration, and PowerHA does not provide tools to edit a snapshot.
• Option 3
– Queries for the additional configuration needed and saves it in a file in /var on every node in the cluster.
– When option 3 is selected from the main screen, you are prompted for the repository disk and the multicast dotted-decimal IP address.
– Newer versions of AIX have an updated /usr/sbin/clmigcheck command that asks you to select "Unicast" or "Multicast".
Use either option 1 or option 2 successfully before running option 3, which collects and stores configuration data in the node file /var/clmigcheck/clmigcheck.txt; this file is used when PowerHA 7.1.3 is installed.
Rolling Migration Overview Steps
1. Stop cluster services on one node (move resource groups as needed)
2. Upgrade AIX (if needed) and reboot
• Also install the additional CAA filesets, bos.cluster and bos.ahafs
3. Verify /etc/hosts and /etc/netsvc.conf (and /usr/es/sbin/cluster/netmon.cf)
4. Update /etc/cluster/rhosts
• Enter the cluster node hostname IP addresses, one IP address per line
5. refresh -s clcomd
6. Execute clmigcheck (option 1, then option 3)
7. Upgrade PowerHA
• Install the base-level install images and complete the upgrade procedures
• Then come back and apply the latest SPs on top of it; this can be done non-disruptively
8. Review the /tmp/clconvert.log file
9. Restart cluster services (move resource groups back if needed)
10. Repeat the steps above for each node (minus the additional options on clmigcheck)
A sketch of steps 4-7 on one node follows below.
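A hedged sketch of steps 4-7 on a single node; the IP addresses, mount point and log check are examples only, and the installp fileset selection should be adapted to your install images:
cat >> /etc/cluster/rhosts <<EOF
10.1.1.11
10.1.1.12
EOF
refresh -s clcomd                            # make clcomd pick up the new rhosts entries
/usr/sbin/clmigcheck                         # option 1, then option 3 (first node only)
installp -agXYd /mnt/powerha713 cluster.*    # install the PowerHA 7.1.3 base images
grep -i error /tmp/clconvert.log             # review the conversion log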
Basic PowerHA cluster functionality verification
Verify PowerHA cluster functionality:
– After system functionality verification (file systems, users, network, backup, etc.)
– Before or after cluster application server verification (start/stop/monitor integration hardening)
– Before end-to-end application resiliency verification (environment/enterprise-wide failure scenarios)
Example test procedures (record the expected and actual outcome for each):
• Reboot both NODE1 and NODE2 and restart PowerHA on both
• RG stop and start on NODE1, and on NODE2, with the RG on that node (clRGmove -d / clRGmove -u)
• RG move from NODE2 to NODE1 and from NODE1 to NODE2 (clRGmove -m)
• IP failure test on NODE1, on NODE2, and on both, followed by reintegration (ifconfig en# down / ifconfig en# up)
• Stop PowerHA on NODE1 with migration to NODE2, then restart PowerHA on NODE1 to reintegrate; repeat for NODE2 (cl_clstop)
• SAN availability test on NODE1 and NODE2, followed by SAN reintegration (SAN admin actions)
• HMC power-off of NODE1 with the RG on NODE1, then HMC activate and restart of PowerHA; repeat for NODE2 (chsysstate)
• Reboot both NODE1 and NODE2 and restart PowerHA on both
An example of the clRGmove invocations is sketched below.
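A hedged sketch of the resource-group operations referenced above; the resource group name RG1 and the node names are examples:
clRGmove -g RG1 -n NODE1 -d      # take RG1 offline on NODE1
clRGmove -g RG1 -n NODE1 -u      # bring RG1 back online on NODE1
clRGmove -g RG1 -n NODE2 -m      # move RG1 from NODE1 to NODE2
clRGmove -g RG1 -n NODE1 -m      # move RG1 back to NODE1
/usr/es/sbin/cluster/clstat -o   # check cluster and resource-group state between tests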
Further reading
• PowerHA for AIX – http://www-03.ibm.com/systems/power/software/availability/aix/index.html
• PowerHA for AIX Version Compatibility Matrix – http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD101347
• PowerHA 7.1 Infocenter – http://publib.boulder.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.powerha.navigation/powerha_main.htm
• What's new in PowerHA 7.1 – http://publib.boulder.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.powerha.navigation/powerha_whatsnew.htm
• PowerHA 7.1.3 Release Notes – http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS5241
• PowerHA 7.1.3 Announcement letter – http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=AN&subtype=CA&htmlfid=897/ENUS213-416
• IBM PowerHA SystemMirror for AIX 7.1.3 Enhancements – http://www.redbooks.ibm.com/abstracts/tips1097.html
• IBM PowerHA cluster migration – http://www.ibm.com/developerworks/aix/library/au-aix-powerha-cluster-migration/
• Offline migration from PowerHA 6.1 to 7.1 – http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.powerha.insgd/ha_install_offline_61to710.htm
• Snapshot migration from PowerHA 6.1 to 7.1.x – http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.powerha.insgd/ha_install_upgrade_snapshot_61to71x.htm
• Rolling migration from PowerHA 6.1 to 7.1 – http://publib.boulder.ibm.com/infocenter/aix/v6r1/topic/com.ibm.aix.powerha.insgd/ha_install_rolling_migration_61to710.htm
Thank you – Tack !
Björn Rodén – roden@ae.ibm.com – http://www.linkedin.com/in/roden
A few extra slides
Architecting for Business Continuity ¹
Can use the BCI Good Practice guidelines or similar, or just start with:
1. Develop a contingency planning policy
2. Perform a Business Impact Analysis
3. Identify preventive controls
4. Develop recovery strategies
5. Develop an IT contingency plan
Focus on the business purpose.
⌦ Note that Business Continuity Management (BCM) encompasses much more than IT continuity.
⌦ Some national and international standards and organizational recommendations:
(1) BCI, Good Practice, http://www.thebci.org/
(2) DRII, Professional Practices, http://www.drii.org/
(3) ITIL IT Service Continuity: "Continuity management is the process by which plans are put in place and managed to ensure that IT services can recover and continue should a serious incident occur."
(4) ISO Information Security and Continuity, ISO 17799/27001
(5) US NIST Contingency Planning Guide for Information Technology Systems, NIST 800-34
(6) British Standard for Business Continuity Management: BS 25999-1:2006
(7) British Standard for Information and Communications Technology Continuity Management: BS 25777:2008 (Paperback)
(8) BITS – Basnivå för informationssäkerhet (baseline level for information security), https://www.msb.se/RibData/Filer/pdf/24855.pdf
Note:
– ITIL: "Availability Management – To optimize the capability of the IT infrastructure, services and supporting organization to deliver a cost-effective and sustained level of availability enabling the business to meet their objectives."
– COBIT: "DS4 Ensure Continuous Service objectives are control over the IT process to ensure continuous service that satisfies the business requirement for IT of ensuring minimal business impact in the event of an IT service interruption."
Architecting for IT Service Continuity ¹
The TOGAF ADM can be used to bring clarity and understanding, from an enterprise perspective, of the availability/continuity requirements for different IT services.
Focus on IT design & governance.
(1) The Open Group Architecture Framework (TOGAF) Architecture Development Method (ADM) is a step-by-step approach to developing an enterprise architecture. The term "enterprise" in the context of "enterprise architecture" can denote either an entire enterprise or just a specific domain within the enterprise. http://www.opengroup.org/
Controlling IT Service Continuity ¹
COBIT DS4 can be used to bring clarity and understanding, from an enterprise perspective, of the availability/continuity requirements for different IT services.
Focus on control of IT processes (IT governance and resource management). http://www.itgi.org/
(1) The IT Governance Institute (ITGI) Control Objectives for Information and related Technology (COBIT) is an international unifying framework that integrates the main global IT standards, including ITIL, CMMI and ISO 17799. It provides good practices, representing the consensus of experts, across a domain and process framework, and presents activities in a manageable and logical structure, focused on control.
IBM Systems Lab Services and Training
Björn Rodén