EngineSmith's Blog

Engineering Craftsman

Posts Tagged ‘mysql mmm failover scalability’

MMM fail – MySQL Master Master

Posted by EngineSmith on September 11, 2010

We have been using MMM for over a year now in Production to manage master-master MySQL cluster. It worked out pretty well for a while, until recently we had couple outages related to it and realized its limitations.

MMM basically manages VIPs (virtual IP) and associates roles with MySQL nodes in the cluster. In case of two nodes, you configure two VIPs: one for reader, one for writer (if you do read-write split, which we did before). In case of failure, or replication delay is too long, MMM is going to switch the VIP to point the right host. So what’s the catch?

  • First of all, the VIP is configured on each MySQL node itself. In order to make the switch, MMM needs to SSH to the node, and change its network config to release or take the VIP.
  • In order for the switch to happen, MMM actually expects to issue some MySQL commands to the node (like clean up, or set read-only etc)

What is wrong with those? MySQL node can fail, when things fail, they can fail in ANY POSSIBLE WAYS. We had two outages recently, once due to a hardware failure, the MySQL node can’t accept any SSH connections; another time, the MySQL instance was hosed and it held the MMM request and didn’t respond at all (it would have worked if it actually rejected the request). In both scenarios, MMM basically got stuck (internal states were all messed up) and didn’t do its job. Of course outages happened, 3-5 AM in the morning. 😦

What is the lesson? The fail-over mechanism should NEVER rely on the operations on the fail-able nodes, and states should be kept outside as well. The simpler the fail-over mechanism, the better off you are (KISS rule). Here are some thoughts we are still experimenting:

  • Don’t use VIP. Instead, let application side handle fail-over (i.e. specifying two physical IP addresses, if one fail, use the other). Monitor script activate/de-activate network switch port to implement fail-over (hardware network path is guaranteed to be a boolean). Most MySQL server node has two network ports, use one for application access, another one for replication/management.
  • Using LVS to manage VIP. Seems still too complicated, and I am concerned about the split-brain scenario, and what kind of availability it provides. The worst case is both MySQL nodes are taking traffic and you don’t even know about it.
  • Using hardware to manage VIP, the simplest, efficient and guaranteed way. We are considering using F5 BigIP. The issue is cost and bandwidth limit.

We are still investigating, will keep you posted. By the way, forgot to mention, we currently have 12 nodes in 6 clusters, all of them running on SSD drives, with peak 3-4 QPS per cluster.

Posted in Engineering, Operations | Tagged: | 2 Comments »