Cassandra fiasco – NoSql kills
Posted by EngineSmith on September 12, 2010
We implemented one of our product purely on top Cassandra back in March 2010 with version 0.5.1. It went live in May, and tanked in the first week due to crazy traffic. Three painful, sleepless weeks later, Cassandra finally dropped its pants and started corrupting our data. We shutdown the site, spent three full days, converted backend to MySQL clusters. Since then, our product has grown steadily until now (September 2010), with 8M+ users and 1M+ DAU (daily active user).
I don’t want to restate all the great benefits Cassandra provides, we were so excited by that. Unfortunately, if it is too good to be true, it usually is:
- We started with a 6-node Cassandra cluster, big mistake. Some people later suggested to start with 50+ nodes. You will see the reason why below. Yes, it may be web-scale, but not for a small startup I guess.
- We set quorum=2, replication=3, which means 5 nodes are minimum. Unfortunately, during the initial weeks, we lost one out of six node, so our ability to stand for node failures are pretty low.
- Cassandra has a “binary-log” kind of mechanism, it appends to that file(s) until it reaches a limit. Then it needs to compress compact that big file, during which, this node is kind of “unresponsive”. All nodes uses a protocol “Gossip” to communicate, if one node is unresponsive, all its neighbours will think it is dead. (Even without node failure, very often we see discrepancy of the topology from each node’s view). When this compression compaction happens at high traffic, we were seeing cascading failures across the ring all the time.
- Later people suggested to use different disk partitions for “binary-log” and data. Well, that will require at least 4 disks on a server (mirrored 2 for each, and probably you will want another pair for OS, I don’t think we are crazy enough to run on single disks), and I wouldn’t call it a “commodity hardware” any more.
- At the time (May 2010 version 0.5.1) there is only ONE very few I/O parameter you can tune for Cassandra, it makes you doubt: hmm, I wonder why MySQL spent the last 10+ years have those many I/O tuning mechanisms?
I felt lucky we dumped it early on, at the moment people were asking us: why Digg/Facebook/Twitter all jumped on Cassandra and you guys ditched it? Maybe you are not competent enough? Turns out later the reality is:
- Digg struggles
- Cassandra at Twitter [Update] Twitter’s Original Plan with Cassandra (obviously they changed their mind), thanks “Kees Dijk“
Recently we also hear often: why don’t you use MongoDB, it does everything for you, and xxx is using it. If you pay just a little bit attention, MongoDB sacrifice your data consistency for performance, big time, almost like cheating:
Enjoy this one, “/dev/null is web scale“.