EngineSmith's Blog

Engineering Craftsman

Archive for the ‘Operations’ Category

MySQL read write split myth, and why I wouldn’t use it

Posted by EngineSmith on September 11, 2010

In MySQL scale-out practice, many people choose to split their reads and writes across different nodes. This may work well in a read-mostly environment, but it is painful and simply wrong in a write-mostly setting like ours. Our original setup was:

  • Two MySQL nodes, master-master replication
  • Only the active node is taking writes (writer)
  • Two readers, round-robin between the two nodes
  • MMM is used to monitor replication delay and handle role switching

This seems like the common-sense approach: keep a single writer to guarantee consistency, and use both nodes as readers to share the load. We ran it that way for the last 6 months and found that in practice it is wrong:

  • Plan your capacity for the worst-case scenario. The point of having two nodes is to handle failover. When one node dies, all read and write traffic goes to the other node, so a single node must be able to handle all of the traffic. The read/write split gives you the illusion of having enough capacity right up until the failover, which is too dangerous.
  • Splitting reads and writes makes your application logic much more complicated because of replication latency. Regardless of your tolerance level, at some critical point you have to write ugly code like this:  a = reader.get(); if (a == null) a = writer.get();
  • MMM is not designed very well for role switching in failure cases (I will write another post about that later), and having two roles makes the situation fairly messy.
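
To make the second point concrete, here is a minimal Python sketch of that fallback read. The `Node` class is a hypothetical in-memory stand-in for a MySQL connection, not a real client API, and replication lag is simulated simply by not copying the row to the reader.

```python
from typing import Dict, Optional


class Node:
    """Hypothetical in-memory stand-in for one MySQL node."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self.rows.get(key)


def read_with_fallback(reader: Node, writer: Node, key: str) -> Optional[str]:
    """Try the reader first; on a miss, re-read from the writer."""
    value = reader.get(key)
    if value is None:
        # The row may exist on the writer but not have replicated yet.
        value = writer.get(key)
    return value


writer, reader = Node(), Node()
writer.rows["user:1"] = "alice"  # the write lands on the active node only
print(reader.get("user:1"))      # the lagging reader does not see the row
print(read_with_fallback(reader, writer, "user:1"))
```

Note that the fallback only catches a missing row. A row that was updated on the writer but is still stale on the reader comes back from the reader without any warning, so every such read path needs its own decision about how much staleness it can tolerate.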

So here is what we did recently: each cluster always has only one active node, which takes both reads and writes. MMM only handles failover (not replication delay). We then over-shard on the cluster, so that we are ready to split a shard off if the cluster can’t handle the load. Life is much, much simpler now.
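
One common way to over-shard is to hash keys onto many more logical shards than there are clusters and keep a small map from logical shard to cluster. The sketch below assumes that scheme; the shard count, the cluster names and every function name are made up for illustration and are not taken from our actual setup.

```python
import zlib
from typing import Dict

NUM_LOGICAL_SHARDS = 64  # assumed value: far more shards than clusters


def logical_shard(key: str) -> int:
    """Stable hash of a key onto a fixed number of logical shards."""
    return zlib.crc32(key.encode("utf-8")) % NUM_LOGICAL_SHARDS


def build_shard_map(cluster: str) -> Dict[int, str]:
    """Start with every logical shard living on a single cluster."""
    return {shard: cluster for shard in range(NUM_LOGICAL_SHARDS)}


def split_cluster(shard_map: Dict[int, str], source: str, target: str) -> Dict[int, str]:
    """Return a new map with every other shard of `source` moved to `target`."""
    owned = sorted(s for s, c in shard_map.items() if c == source)
    new_map = dict(shard_map)
    for shard in owned[1::2]:
        new_map[shard] = target
    return new_map


def cluster_for(key: str, shard_map: Dict[int, str]) -> str:
    """Look up which cluster currently owns the key's logical shard."""
    return shard_map[logical_shard(key)]


shard_map = build_shard_map("cluster-a")
shard_map = split_cluster(shard_map, "cluster-a", "cluster-b")
print(cluster_for("user:1", shard_map))
```

Because a key's logical shard never changes, a split only edits the map. The rows of the moved shards still have to be copied to the new cluster before the map is switched over, and that step is not shown here.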

Posted in Engineering, Operations

SSD faults finally resolved

Posted by EngineSmith on August 28, 2009

We finally got our production environment stabilized on MySQL, OpenSolaris, ZFS and SSD/JBOD. The performance is just awesome: with sysbench, we reach a stable 21K QPS! It took us quite some time to get there, especially because we ran into a nasty SSD fault issue.

We were inspired by this SmugBlog post, “Success with OpenSolaris + ZFS + MySQL in production!”, and ordered a JBOD with 2 Intel X25-E 32GB SSDs and 10 15K SAS drives for our MySQL server. The SSDs are configured as ZFS cache (write cache). The SAS controller is an LSI 3801E.

After some load testing, we found many “scsi bus reset” errors in the log, and “iostat -E” showed “Hard Errors” for the SSD drives. Whenever a “scsi bus reset” error happens, everything comes to a halt on the MySQL side, so we got very inconsistent load testing results. The resets happened every 30 minutes to 1 hour. And this was just 3 days before our launch date!

We upgraded OpenSolaris to 2009.06 and updated the LSI and SSD firmware, but that didn’t help. Finally our mighty consultant found that turning off OpenSolaris Fault Management eases the errors. That got us through the first couple of weeks of the production launch. Eventually our vendor, Silicon Mechanics, helped out and we updated the LSI driver. Voilà, not a single SSD error since then, and the performance is simply amazing!

Fault Management

  • fmadm unload disk-transport   # turn it off
  • fmdump -eV      # show the fault management error log in detail

Sysbench Load Testing

  • sysbench --test=oltp --oltp-table-size=100000 --mysql-db=test --mysql-user=xxxx --mysql-host=host --mysql-password=xxxx prepare
  • sysbench --test=oltp --oltp-table-size=100000 --mysql-db=test --mysql-user=xxxx --mysql-host=host --mysql-password=xxxx --max-time=60 --oltp-read-only=on --max-requests=0 --num-threads=8 run
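
The prepare and run commands share the same connection options, so a small wrapper helps keep them in sync between runs. This is only a sketch: the function names are made up, the option names are copied from the two commands above, and other sysbench versions may name their options differently.

```python
import shlex
from typing import List


def sysbench_base(host: str, user: str, password: str,
                  db: str = "test", table_size: int = 100000) -> List[str]:
    """Options shared by the prepare and run steps."""
    return [
        "sysbench",
        "--test=oltp",
        f"--oltp-table-size={table_size}",
        f"--mysql-db={db}",
        f"--mysql-user={user}",
        f"--mysql-host={host}",
        f"--mysql-password={password}",
    ]


def prepare_cmd(base: List[str]) -> List[str]:
    """Command that creates and fills the test table."""
    return base + ["prepare"]


def run_cmd(base: List[str], seconds: int = 60, threads: int = 8,
            read_only: bool = True) -> List[str]:
    """Command that runs a time-limited load test."""
    return base + [
        f"--max-time={seconds}",
        f"--oltp-read-only={'on' if read_only else 'off'}",
        "--max-requests=0",
        f"--num-threads={threads}",
        "run",
    ]


base = sysbench_base(host="host", user="xxxx", password="xxxx")
print(shlex.join(prepare_cmd(base)))
print(shlex.join(run_cmd(base)))
```

The lists can be passed straight to `subprocess.run` on a machine that has sysbench installed; printing them first makes it easy to check the double hyphens before anything is executed.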

Update LSI Driver

root@opensolaris:~# wget http://www.lsi.com/DistributionSystem/AssetDocument/itmpt_x86_5.07.04.zip
root@opensolaris:~# unzip itmpt_x86_5.07.04.zip
root@opensolaris:~# uncompress itmpt-x86-XXX.tar.Z
root@opensolaris:~# tar -xvf itmpt-x86-XXX.tar
root@opensolaris:~# cd install
root@opensolaris:~# pkgadd -d .
root@opensolaris:~# vim /etc/driver_aliases   # or the editor of your choice
Here you need to comment out the mpt driver aliases and uncomment the
itmpt driver aliases at the end of the file.
Example:
#mpt "pci1000,30"
#mpt "pci1000,50"
#mpt "pci1000,54"
#mpt "pci1000,56"
#mpt "pci1000,58"
#mpt "pci1000,62"
#mpt "pciex1000,56"
#mpt "pciex1000,58"
#mpt "pciex1000,62"

itmpt "pci1000,30"
itmpt "pci1000,50"
itmpt "pci1000,54"
itmpt "pci1000,56"
itmpt "pci1000,58"
itmpt "pci1000,62"
itmpt "pci1000,621"
itmpt "pci1000,622"
itmpt "pci1000,624"
itmpt "pci1000,626"
itmpt "pci1000,628"
itmpt "pci1000,640"
itmpt "pci1000,642"
itmpt "pci1000,646"

Reboot the OS after this.
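
If you have to do this on several machines, the alias edit can be scripted instead of done by hand. Here is a minimal Python sketch under the assumption that the itmpt lines are shipped commented out with a leading "#". The function name is made up, and it only transforms text; it does not touch /etc/driver_aliases itself, so back up the real file before writing anything to it.

```python
def switch_to_itmpt(text: str) -> str:
    """Comment out `mpt` aliases and uncomment `itmpt` aliases."""
    out = []
    for line in text.splitlines():
        stripped = line.lstrip()
        fields = stripped.split()
        if fields and fields[0] == "mpt":
            # Active mpt alias: disable it.
            out.append("#" + stripped)
        elif stripped.startswith("#") and stripped[1:].split()[:1] == ["itmpt"]:
            # Commented-out itmpt alias: enable it.
            out.append(stripped[1:].lstrip())
        else:
            # Everything else, including other drivers, stays as it is.
            out.append(line)
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")


sample = 'mpt "pci1000,30"\n#itmpt "pci1000,30"\n'
print(switch_to_itmpt(sample), end="")
```

Running the function a second time on its own output changes nothing, so it is safe to re-run if you are not sure whether a machine was already converted.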


Posted in Engineering, Hardware, Operations