SSD faults finally resolved
Posted by EngineSmith on August 28, 2009
We finally got our production environment stabilized on MySQL, OpenSolaris, ZFS, and SSD/JBOD. The performance is just awesome: with sysbench we can reach a stable 21K QPS! It took us quite some time to get there, mainly because we ran into a nasty SSD fault issue.
We were inspired by this SmugBlog post, “Success with OpenSolaris + ZFS + MySQL in production!”, and ordered a JBOD with 2 Intel X25-E 32GB SSDs and 10 15K SAS drives for our MySQL server. The SSDs are configured as ZFS Cache (write cache). The SAS controller is an LSI 3801E.
After some load testing, we found many “scsi bus reset” errors in the log, and “iostat -E” showed “Hard Errors” for the SSD drives. Whenever a “scsi bus reset” error happened, everything came to a halt on the MySQL side, so our load testing results were very inconsistent. The resets happened every 30 minutes to 1 hour. And this was just 3 days before our launch date!
We upgraded OpenSolaris to 2009.06 and updated the LSI and SSD firmware, but that didn’t help. Finally our mighty consultant found that turning off OpenSolaris Fault Management for the disks (commands listed below) eased the errors. That got us through the first couple of weeks of the production launch. Eventually our vendor, Silicon Mechanics, helped out and we updated the LSI driver. Voilà, not a single SSD error since then, and performance is simply amazing!
- fmadm unload disk-transport # turn it off
- fmdump -eV # show the error log
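To check whether the SSDs are still accumulating hard errors, the “iostat -E” output mentioned above can be filtered down to the devices with a non-zero count. This is only a rough sketch: the function name hard_error_devices is made up for illustration, and it assumes each device summary line looks like “sd1 Soft Errors: 0 Hard Errors: 12 Transport Errors: 3”.

```shell
# Hypothetical helper (not from the original post): reads `iostat -E` style
# output on stdin and prints "<device> <count>" for every device whose
# "Hard Errors:" counter is greater than zero.
# Assumption: the device name is the first field of the summary line.
hard_error_devices() {
  awk '/Hard Errors:/ {
    for (i = 1; i < NF; i++) {
      # Find the "Hard Errors:" label and read the number that follows it.
      if ($i == "Hard" && $(i + 1) == "Errors:" && ($(i + 2) + 0) > 0) {
        print $1, $(i + 2)
      }
    }
  }'
}

# Usage on the server:
#   iostat -E | hard_error_devices
```

An empty result means no device is reporting hard errors. Running it before and after a load test makes it easy to see whether the counters moved.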
Sysbench Load Testing
- sysbench --test=oltp --oltp-table-size=100000 --mysql-db=test --mysql-user=xxxx --mysql-host=host --mysql-password=xxxx prepare
- sysbench --test=oltp --oltp-table-size=100000 --mysql-db=test --mysql-user=xxxx --mysql-host=host --mysql-password=xxxx --max-time=60 --oltp-read-only=on --max-requests=0 --num-threads=8 run
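When repeating the run above to compare results, it helps to pull the throughput figure out of each report rather than read it by eye. The sketch below is an assumption-laden helper, not part of our original setup: the name rw_requests_per_sec is invented here, and it assumes the OLTP report contains a line of the form “read/write requests: N (X per sec.)”. The post does not say which report line the 21K QPS figure came from.

```shell
# Hypothetical helper: reads a sysbench OLTP report on stdin and prints the
# per-second rate from the "read/write requests:" line.
# Assumption: the line looks like
#   read/write requests:                 1400000 (23331.00 per sec.)
rw_requests_per_sec() {
  sed -n 's|.*read/write requests:.*(\([0-9.]*\) per sec\.).*|\1|p'
}

# Usage (same options as the run command above, credentials are placeholders):
#   sysbench --test=oltp --oltp-table-size=100000 --mysql-db=test \
#     --mysql-user=xxxx --mysql-host=host --mysql-password=xxxx \
#     --max-time=60 --oltp-read-only=on --max-requests=0 --num-threads=8 run \
#     | rw_requests_per_sec
```

Collecting this number over several runs would have made the bus-reset stalls visible as sudden drops in throughput.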
Update LSI Driver
root@opensolaris:~# wget http://www.lsi.com/DistributionSystem/AssetDocument/itmpt_x86_5.07.04.zip
root@opensolaris:~# uncompress itmpt-x86-XXX.tar.Z
root@opensolaris:~# tar -xvf itmpt-x86-XXX.tar
root@opensolaris:~# cd install
root@opensolaris:~# pkgadd -d .
root@opensolaris:~# vim /etc/driver_aliases (or the editor of your choice)
Here you will need to comment out the mpt driver aliases and uncomment the itmpt driver at the end of the file.
Example:
#mpt "pci1000,30"
#mpt "pci1000,50"
#mpt "pci1000,54"
#mpt "pci1000,56"
#mpt "pci1000,58"
#mpt "pci1000,62"
#mpt "pciex1000,56"
#mpt "pciex1000,58"
#mpt "pciex1000,62"
itmpt "pci1000,30"
itmpt "pci1000,50"
itmpt "pci1000,54"
itmpt "pci1000,56"
itmpt "pci1000,58"
itmpt "pci1000,62"
itmpt "pci1000,621"
itmpt "pci1000,622"
itmpt "pci1000,624"
itmpt "pci1000,626"
itmpt "pci1000,628"
itmpt "pci1000,640"
itmpt "pci1000,642"
itmpt "pci1000,646"
Reboot the OS after this.
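The /etc/driver_aliases edit described above can also be done with a small filter instead of by hand. This is a minimal sketch under some assumptions: the function name swap_mpt_aliases is made up, it assumes the mpt lines start with “mpt ” and the commented itmpt lines start with “#itmpt ”, and it should be tried on a copy of the file first, since a broken driver_aliases can leave the disks without a driver.

```shell
# Hypothetical helper: reads a driver_aliases file on stdin and writes the
# edited version to stdout. It comments out lines starting with "mpt " and
# uncomments lines starting with "#itmpt ". All other lines pass through
# unchanged.
swap_mpt_aliases() {
  sed -e 's/^mpt /#mpt /' -e 's/^#itmpt /itmpt /'
}

# Usage: work on a copy and review the result before replacing the real file.
#   cp /etc/driver_aliases /tmp/driver_aliases.orig
#   swap_mpt_aliases < /tmp/driver_aliases.orig > /tmp/driver_aliases.new
#   diff /tmp/driver_aliases.orig /tmp/driver_aliases.new
```

Reviewing the diff before copying the new file into place keeps a typo from taking down the storage stack on the next boot.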