A majority of the dynamically generated websites on the web today are backended* by MySQL. Even though NoSQL solutions like Project Voldemort are all the rage, for the majority of sites not handling Facebook-scale traffic, MySQL is still going to be the backend of choice. In other words, the reports of MySQL’s death are greatly exaggerated, and that means scaling MySQL is still a talent and skill required of web application architects.
Which MySQL Cluster?
Often when you see a web-based application whiteboarded, the entire DB backend is referred to as the “SQL cluster”. When you’re dealing with MySQL that could mean many things. There is a high-availability, high-redundancy version of MySQL called “MySQL Cluster”. The non-cluster versions of MySQL can replicate data to multiple SQL servers via a master/slave relationship, and multiple servers set up in this fashion are often called a MySQL cluster. Do not confuse the two! The official “MySQL Cluster” only supports one storage engine, NDBCLUSTER. So if you developed your application to use InnoDB, then you will have to perform some major rewriting or other surgery on your application and data, or you’re going to be out of luck. For many environments that makes the official “MySQL Cluster” a show stopper. If you’re not migrating data from an existing MySQL Cluster, or you haven’t been coding with the NDBCLUSTER engine and its limitations in mind from the get-go, then the official MySQL Cluster is a no-go. It is a specialized version of MySQL with its own quirks, and its use must fit the problem you are trying to solve. For this exercise in scaling we’ll use regular MySQL rather than the clustered version, because our databases and application code were all designed around InnoDB.
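If you are not sure how tied an existing schema is to InnoDB, MySQL can tell you directly. A minimal check, assuming a schema named myapp (a placeholder for your own database name):

    -- List every table in the application schema along with its storage
    -- engine; anything not already NDBCLUSTER would need converting.
    SELECT table_schema, table_name, engine
    FROM information_schema.tables
    WHERE table_schema = 'myapp'
      AND engine <> 'NDBCLUSTER';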
Our MySQL Scaling Goals
Our scaling goals for this project are simple. The application has been well thought out: it expects a read-only database for read-only queries and a write database for everything else, and we want to take advantage of that. We also know from testing that read traffic runs seven to ten times the write traffic. The first goal is high availability. We don’t want to change any web server config files or have our application wait for MySQL to time out before switching to another SQL server; switching away from a slow or down server has to happen automatically. The second goal is higher performance. In our example we have four servers available for the backend, plus a monitoring server (build monitoring into your application architecture up front and save yourself the downtime later). The final goal is the ability to grow to meet demand.
A quick note about our goals: if you divorce your application from the database architecture, you won’t end up with an application that scales or performs very well. In this article we’re looking at a pure backend solution, but what that architecture looks like was dictated by the application itself! In the real world, high performance applications should be able to take advantage of a caching layer provided by something like memcache, and code needs to be designed from the get-go to work with multiple SQL clusters or to separate read queries from writes. In many cases memcache alone could replace or mitigate the need for more SQL servers. Relying on pure MySQL replication to scale only gets you so far, and there is a point of diminishing returns. Kellan Elliott-McCrea of Flickr brings those points home in his article “using and abusing mysql“.
Getting the right tools
The very first thing we need to decide up front is which compiled version of MySQL we want to use. Do we compile it ourselves? Do we use our vendor’s binaries or the pre-compiled binaries from MySQL, or do we look at one of the MySQL project forks? Here’s my advice: for most people in small, low-transaction environments, use what your vendor provides or the official MySQL binaries. Those releases are well supported and updates are rolled out on a regular basis. When you start needing other capabilities, or need to squeeze more performance out of your SQL server, then it’s time to look at the high performance forks. In our case we’ve been very happy with the Percona MySQL builds, especially the ability to use their XtraBackup program. It makes setting up MySQL slave servers much easier and faster, especially with larger data sets and InnoDB tables. (In actual testing, setting up a slave server from a raw mysqldump of a 47 GB database took almost an hour; using XtraBackup the same job took less than 15 minutes on a rather vanilla server.)
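To sketch the workflow (the directory names, host name, and sample binlog coordinates below are illustrative, not output from our systems), seeding a slave with XtraBackup’s innobackupex wrapper looks roughly like this:

    # On the master: take a hot backup of the running server.
    innobackupex /backups/

    # Replay the InnoDB log so the backup is consistent.
    innobackupex --apply-log /backups/2010-01-15_03-00-00/

    # Copy the prepared backup into the new slave's datadir.
    rsync -a /backups/2010-01-15_03-00-00/ db3:/var/lib/mysql/

    # XtraBackup records the binlog coordinates the slave will need:
    cat /backups/2010-01-15_03-00-00/xtrabackup_binlog_info
    # mysql-bin.000042    107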
Get, use, and love MMM (Multi-Master Replication Manager for MySQL). It is a collection of scripts that performs automated failover for your MySQL cluster in much the same way UltraMonkey does for other services. The advantage of MMM is that it is designed specifically for MySQL. It lets you define servers by their role (writer or reader). With MMM only one node is writable at a time, which prevents data from getting out of sync in large waterfall environments. Reader roles can be balanced across several servers. More importantly, MMM will detect when a server’s replication is running behind and remove it from the pool of queried servers until its replication catches up. In the real world this is a life saver.
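To give a flavor of how roles are declared, here is a trimmed sketch of an mmm_common.conf for a two-master, two-slave layout; the host names and IP addresses are invented, and the boilerplate (agent users, interfaces, and so on) is omitted:

    <host db1>
        ip      192.168.0.11
        mode    master
        peer    db2
    </host>

    <host db2>
        ip      192.168.0.12
        mode    master
        peer    db1
    </host>

    <host db3>
        ip      192.168.0.13
        mode    slave
    </host>

    <host db4>
        ip      192.168.0.14
        mode    slave
    </host>

    # Exactly one host holds the writer IP at any time.
    <role writer>
        hosts   db1, db2
        ips     192.168.0.100
        mode    exclusive
    </role>

    # Reader IPs are balanced across every healthy, caught-up node.
    <role reader>
        hosts   db1, db2, db3, db4
        ips     192.168.0.101, 192.168.0.102, 192.168.0.103, 192.168.0.104
        mode    balanced
    </role>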
Get the maatkit tool set and install it on all your MySQL servers. This toolkit should be de rigueur for any MySQL installation that has replication. It is a collection of scripts that lets your DBA manage MySQL more easily, and it has hooks built in for memcache and Postgres as well. Like MMM, it is a project that grew out of Google Code.
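For example, two of the maatkit scripts we lean on for replication sanity; the host and database names are placeholders, and check the tool’s own documentation for the exact flags your version supports:

    # Checksum tables on the master so they can be compared against slaves.
    mk-table-checksum h=db1 --databases myapp

    # Measure true replication lag by timestamping a heartbeat row on the
    # master and reading it back on a slave.
    mk-heartbeat -D myapp --update -h db1
    mk-heartbeat -D myapp --monitor -h db3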
We set up the first two MySQL servers in master/master replication mode. Here’s the twist: we will probably want to add more master SQL servers to the cluster later on, so plan for it now. You can add several fully synced MySQL servers in a waterfall-style configuration. When you create your my.cnf file, set auto_increment_increment to two times the expected number of master servers. So if you expect to only ever have five masters in replication, ensure that auto_increment_increment=10. This allows you to add more servers to the cluster with a minimum of downtime. Never set auto_increment_offset to zero, and no two servers should ever have the same offset (both common mistakes).
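Concretely, with room reserved for five masters, the my.cnf files for the first two masters might carry something like the following (server IDs and log names are ours, not required values):

    # my.cnf on master 1
    [mysqld]
    server-id                = 1
    log-bin                  = mysql-bin
    auto_increment_increment = 10   # 2x the five masters we might grow to
    auto_increment_offset    = 1    # never zero, never duplicated

    # my.cnf on master 2
    [mysqld]
    server-id                = 2
    log-bin                  = mysql-bin
    auto_increment_increment = 10
    auto_increment_offset    = 2

A third master added later simply takes offset 3; none of the existing servers need their configuration touched.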
Our decision here was to have two servers in master/master replication, with each master having its own slave. With a read load seven times the write load, we need to spread those selects across the cluster, and this is where MMM really shines. The read load is spread out among all of the machines while the write load is confined to the master servers alone. The cluster can handle a huge read load and is orders of magnitude faster under load than a single server, satisfying the performance goal. If a server goes down or starts to fall behind in replication, it is removed from the cluster so it has a chance to catch up. This happens automatically and without intervention, satisfying the high availability part of our goals.
We satisfy our scalability goal by planning the architecture to grow up front. If we see a spike in read traffic, we can add more MySQL slaves on the fly. If we see the need to spread out write traffic, we can add more master servers. Proper monitoring and logging provide the statistics to make those calls.
Since we’ve planned for more masters up front, we don’t have to restart each server. The ability to add a master server on the fly, without taking down the entire cluster, is what makes MMM and the Percona XtraBackup tool so critical. When we run the XtraBackup tool it reports the logfile name and position as part of its output! That means we have all the information required to set up and start a slave in one action. We use the MMM scripts to take servers in and out of service and to monitor their status.
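Putting those coordinates to work on a new slave is a single CHANGE MASTER TO statement; the host, credentials, file name, and position below are illustrative stand-ins for the values XtraBackup and your own environment provide:

    -- On the new slave, point replication at the master using the
    -- coordinates from xtrabackup_binlog_info, then start replicating.
    CHANGE MASTER TO
        MASTER_HOST     = '192.168.0.11',
        MASTER_USER     = 'replication',
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000042',
        MASTER_LOG_POS  = 107;
    START SLAVE;

Once the new slave has caught up, mmm_control set_online db3 brings it into the reader pool, and mmm_control show confirms its role.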
The architecture offered here was for a specific problem where we had some good metrics. If the read-versus-write traffic were more even, we would have set up the servers in a waterfall configuration. All of the servers used directly attached storage with SAS drives in RAID 10. The databases were small enough that directly attached storage provided the best redundancy and performance for the cost. Once you start talking BIG databases, you need to look at SAN architectures and ensure those considerations are baked into any design.
*While backended isn’t really a word, it perfectly describes what we’re talking about. Please feel free to use backended in your next database or application discussion.