MySQL High Availability Baron Schwartz February 2011
Agenda ● Defining High Availability ● High Availability Techniques ● High Availability Technologies ● Recommendations (?)
What is High Availability? ● What is Availability? ● How high is enough? ○ "I need six nines" ● MTTR and MTBF ● Service availability versus data availability
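As a worked illustration of how "nines", MTBF, and MTTR relate, here is a minimal sketch. It assumes the usual steady-state definition, availability = MTBF / (MTBF + MTTR), and a 365-day year; the function names are made up for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: ignore leap years


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the service is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def allowed_downtime_seconds(nines: int) -> float:
    """Yearly downtime budget for an availability of N nines (3 -> 99.9%)."""
    return SECONDS_PER_YEAR * 10 ** (-nines)


# Print the yearly downtime budget for two through six nines.
for n in range(2, 7):
    minutes = allowed_downtime_seconds(n) / 60
    print(f"{n} nines: {minutes:,.2f} minutes of downtime per year")

# One failure every 999 hours, each taking 1 hour to repair.
print(f"availability = {availability(999, 1):.4f}")
```

Six nines leaves a budget of roughly half a minute per year, which is one way to answer "how high is enough?".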
Achieving HA ● Increase MTBF ○ "Best practices" and proactive measures ○ Monitoring and alerting ○ Build systems that can soft-fail or run degraded ● Reduce MTTR ○ This is reactive, not proactive ○ Add redundancy to remove SPOFs ○ Add failover/takeover capabilities
Increasing MTBF ● Test regularly; find failures before they matter ● Least privilege ● Keep things neat, clean, systematic ● Manage changes carefully ○ More about this in "causes of downtime" talk ● Good system architecture & design ○ Loose coupling, degraded functionality, load shedding, etc
Decreasing MTTR ● Two important components: 1. Notice problems quickly 2. Resolve them quickly ● Technical measures help with part 2 ○ Systems that have no SPOF, etc ○ Redundancy and failover capability
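The "notice problems quickly" half can be sketched as a detector that alerts after several consecutive failed health checks. This is an illustrative sketch, not a recommendation of any particular tool: `FailureDetector`, the `check` callable, and the threshold are hypothetical names and values chosen for the example.

```python
from typing import Callable


class FailureDetector:
    """Signal an alert after `threshold` consecutive failed health checks."""

    def __init__(self, check: Callable[[], bool], threshold: int = 3):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.check = check          # hypothetical probe: returns True when healthy
        self.threshold = threshold
        self.consecutive_failures = 0

    def poll(self) -> bool:
        """Run one check; return True when the alert condition is met."""
        try:
            healthy = self.check()
        except Exception:
            healthy = False         # a probe that raises counts as a failure
        if healthy:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold


# Simulated probe results: one recovery, then a sustained outage.
results = iter([True, False, False, True, False, False, False])
detector = FailureDetector(lambda: next(results), threshold=3)
print([detector.poll() for _ in range(7)])
```

With a fixed polling interval, the time to notice a sustained failure is about the threshold times the interval. That delay counts toward MTTR, so a higher threshold trades fewer false alarms for slower detection.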
Data Availability ● This is a related topic, but distinct from HA ● Typically want a strong D guarantee in ACID ● Usually implemented with synchronous replication ● In practice, usually inseparable from HA ● Makes HA much harder ○ It's easy to be HA if it's OK to lose your data!
Technologies ● Replication ● SAN ● DRBD ● MySQL Cluster (NDB) ● Percona XtraDB Cluster (Galera) ● Clustrix and similar ● Load Balancing and Proxies
Replication ● The classic approach, used for years by many ● Can have very short MTTR ● Essential problem: asynchronous ● Has a "glass ceiling" ● Some failover managers available, but most aren't great
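Because replication is asynchronous, a failover script usually has to ask whether a replica is healthy and caught up before promoting it. The sketch below assumes the caller has already fetched one row of `SHOW SLAVE STATUS` into a dict; the function name and the lag threshold are hypothetical.

```python
from typing import Any, Mapping


def is_failover_candidate(status: Mapping[str, Any], max_lag_seconds: int = 0) -> bool:
    """Decide whether a replica looks safe to promote.

    `status` is one row of SHOW SLAVE STATUS as a dict (assumed to be
    fetched elsewhere). This is a pre-failover check, run while the
    master is still reachable.
    """
    # Both replication threads must be running.
    if status.get("Slave_IO_Running") != "Yes":
        return False
    if status.get("Slave_SQL_Running") != "Yes":
        return False
    # Seconds_Behind_Master is NULL (None) when the lag is unknown.
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        return False
    return int(lag) <= max_lag_seconds


caught_up = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
             "Seconds_Behind_Master": 0}
lagging = dict(caught_up, Seconds_Behind_Master=120)
print(is_failover_candidate(caught_up), is_failover_candidate(lagging))
```

`Seconds_Behind_Master` is only an estimate. Even at zero, transactions that committed on the master but were never shipped can still be lost, which is the essential problem with asynchronous replication.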
SAN ● "Enterprise" approach ● Really a zero-data-loss technique, not HA ● MTTR can be (very) high ● SAN is still a SPOF (can be mitigated)
DRBD ● Replicated storage ● "Enterprise" zero-data-loss approach ● Relatively high MTTR ● DRBD + replication historically not a great approach ● There is Percona-PRM; worth knowing about
Percona XtraDB Cluster ● Built on modified InnoDB + Galera sync replication library ● Multi-master, synchronous, write-anywhere ● Beta; see percona.com/software/
Percona XtraDB Cluster ● Strengths? ○ Transparent, familiar technology* ○ Real HA and protection from data loss ○ No lagging replicas ○ Data stored redundantly; all nodes equal ● Weaknesses? ○ As slow as the slowest node ○ Data stored redundantly; probably limits total size ○ Deadlocks and rollbacks can increase * Uses optimistic conflict resolution, not pessimistic
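The footnote on optimistic conflict resolution can be illustrated with a toy "first committer wins" certifier. This is a simplified model written for these notes, not Galera's actual implementation; `WriteSet`, `Certifier`, and the row keys are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class WriteSet:
    txn_id: str
    snapshot_seqno: int      # last committed seqno the transaction saw at start
    keys: FrozenSet[str]     # rows the transaction modified


class Certifier:
    """Certify write sets in one global order; conflicting ones roll back."""

    def __init__(self) -> None:
        self.seqno = 0
        self.last_writer: Dict[str, int] = {}   # row key -> seqno of last commit

    def certify(self, ws: WriteSet) -> bool:
        # Conflict: a row was committed after this transaction's snapshot.
        for key in ws.keys:
            if self.last_writer.get(key, 0) > ws.snapshot_seqno:
                return False                     # caller must roll back and retry
        self.seqno += 1
        for key in ws.keys:
            self.last_writer[key] = self.seqno
        return True


certifier = Certifier()
a = WriteSet("A", 0, frozenset({"row:1"}))   # executed on one node
b = WriteSet("B", 0, frozenset({"row:1"}))   # same row, executed on another node
c = WriteSet("C", 0, frozenset({"row:2"}))   # unrelated row
print([certifier.certify(ws) for ws in (a, b, c)])
```

No locks are taken across nodes while the transactions run, so the conflict between A and B surfaces only at commit time. This is why rollbacks can increase when several nodes write to the same rows.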
MySQL Cluster (NDB) ● Shared-nothing approach: true HA ● Not fully general-purpose, but good for lots of things ● Not "vanilla MySQL" - NDB is a separate database ● Improving rapidly; MySQL Cluster 7.2 GA today! ● Unbeatable for specific purposes
Clustrix et al. ● Clustrix ○ NDB-ish, but queries execute fully on the nodes ○ Validated extensively by Percona ● Xeround ○ Similar to Clustrix, but for the cloud; not evaluated by Percona yet ● Continuent Tungsten ○ Replacement for MySQL replication ○ Kind of an open-source GoldenGate
Load Balancing and Proxies ● Usually used in combination with replication ● Usually require some scripting/integration ● $YOUR_LOAD_BALANCER_HERE ● HAProxy, pen, etc ● MySQL Proxy ● ScaleBase, ScaleArc
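The "scripting/integration" a load balancer needs on top of replication often amounts to read/write splitting plus health tracking. The sketch below is deliberately naive: the class, the host names, and the rule that only statements starting with `SELECT` count as reads are all assumptions made for illustration.

```python
import itertools
from typing import List, Set


class ReadWriteRouter:
    """Send writes to the primary; spread reads over healthy replicas."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        self.replicas = list(replicas)
        self.down: Set[str] = set()
        self._counter = itertools.count()    # round-robin position for reads

    def mark_down(self, host: str) -> None:
        self.down.add(host)

    def mark_up(self, host: str) -> None:
        self.down.discard(host)

    def route(self, sql: str) -> str:
        # Naive classification: anything that is not a plain SELECT is a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        healthy = [r for r in self.replicas if r not in self.down]
        if not is_read or not healthy:
            return self.primary              # writes, or no replica available
        return healthy[next(self._counter) % len(healthy)]


router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("UPDATE t SET x = 1"))
print(router.route("SELECT * FROM t"))
router.mark_down("db-replica-1")
router.mark_down("db-replica-2")
print(router.route("SELECT * FROM t"))       # falls back to the primary
```

A real deployment also has to handle cases this sketch ignores, such as `SELECT ... FOR UPDATE`, reads inside transactions, and replicas that are up but lagging.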
Recommendations ● Sorry, too complex for these slides :) ● Usual approach: ○ Cross off unsuitable solutions ○ Examine what's left ● Anti-recommendations: ○ Be careful with replication-based HA ○ MMM can be troublesome in some configurations ○ MySQL Proxy doesn't really excel in most cases
Comparison of Clustering Methods: Percona XtraDB Cluster ● Writes update all nodes ● Reads execute locally
Comparison of Clustering Methods: MySQL Cluster (NDB) ● Writes update all copies of data ● Reads execute locally + distributed
Comparison of Clustering Methods: Clustrix ● Writes update all copies of data ● Reads execute fully distributed
What's Percona Been Up To? http://tools.percona.com/
Resources ● Free webinars: www.percona.com/webinars ● White papers on preventing downtime: ○ www.percona.com/about-us/mysql-white-papers ○ "Causes of downtime" paper ○ "Preventing downtime" paper ● Slides from Percona Live DC 2012 ○ Yves's talk on Percona-PRM and MySQL Cluster ○ percona.com/live/
Santa Clara, CA April 10-13 Be There!