<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>GeeForce LLC &#187; Scalability</title>
	<atom:link href="http://www.geeforce.net/tag/scalability/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.geeforce.net</link>
	<description>We get technology out of the way of doing business</description>
	<lastBuildDate>Thu, 28 Oct 2010 14:15:17 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<atom:link rel='hub' href='http://www.geeforce.net/?pushpress=hub'/>
		<item>
		<title>Highly Available &amp; Scalable MySQL</title>
		<link>http://www.geeforce.net/2010/03/highly-available-scalable-mysql/</link>
		<comments>http://www.geeforce.net/2010/03/highly-available-scalable-mysql/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 19:48:32 +0000</pubDate>
		<dc:creator>aaron_gee</dc:creator>
				<category><![CDATA[Clients]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Networks]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[highly available]]></category>
		<category><![CDATA[Maatkit]]></category>
		<category><![CDATA[MMM]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[MySQL Replication]]></category>
		<category><![CDATA[Percona]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[SQL Clusters]]></category>

		<guid isPermaLink="false">http://www.geeforce.net/?p=292</guid>
		<description><![CDATA[A majority of dynamically created websites on the web today are backended* by MySQL.  Even though NoSQL solutions like project voldemort are all the rage, for a majority of people not doing Facebook type traffic, MySQL is still going to be the backend of choice.  In other words, the reports of MySQL&#8217;s death are greatly [...]]]></description>
			<content:encoded><![CDATA[<p>A majority of dynamically created websites on the web today are backended<sup>*</sup> by <a title="The MySQL web site" href="http://www.mysql.com" target="_blank">MySQL</a>.  Even though <a title="Wiki entry for NoSQL" href="http://en.wikipedia.org/wiki/NoSQL">NoSQL</a> solutions like <a title="The Project Voldemort Home Page" href="http://project-voldemort.com/" target="_blank">Project Voldemort</a> are all the rage, for a majority of people not doing Facebook-type traffic, MySQL is still going to be the backend of choice.  In other words, <a title="The High Scalability Blog's post on the end of MySQL &amp; Memcache" href="http://highscalability.com/blog/2010/2/26/mysql-and-memcached-end-of-an-era.html" target="_blank">the reports of MySQL&#8217;s</a> death are greatly exaggerated, and that means scaling MySQL is still going to be a skill required of web application architects.</p>
<h3>Which MySQL Cluster?</h3>
<p>Often when you see a web-based application whiteboarded, the entire DB backend is referred to as the &#8220;SQL cluster&#8221;.  When you&#8217;re dealing with MySQL that could mean many things.  There is a high-availability, high-redundancy version of MySQL called &#8220;MySQL Cluster&#8221;.  The non-cluster versions of MySQL can replicate data to multiple SQL servers via a master/slave relationship, and multiple servers set up in this fashion are often called a MySQL cluster.  <strong>Do not confuse the two!</strong> The official &#8220;MySQL Cluster&#8221; only supports one type of storage engine: <code>NDBCLUSTER</code>.  So if you developed your application to use <code>MyISAM</code> or <code>InnoDB</code>, you have to perform some major rewriting or other surgery on your application and data, or you&#8217;re going to be out of luck.  For many environments that makes the official &#8220;MySQL Cluster&#8221; a show stopper.  If you&#8217;re not using data from a current MySQL Cluster, or you haven&#8217;t been coding with the <code>NDBCLUSTER</code> engine and its limitations in mind from the get-go, then using the official MySQL Cluster is a no-go.  This product is a specialized version of MySQL with its own quirks, and its use must fit the problem you are trying to solve.  For this exercise in scaling we&#8217;ll use regular MySQL and not the clustered version, because the databases and application code were all designed around InnoDB.</p>
<h3>Our MySQL Scaling Goals</h3>
<p>Our scaling goals for this project are simple.  In our configuration the application has been well thought out: it expects a read-only database for read-only queries and a write database for everything else, and we want to take advantage of that.  From testing, we also know that reads are executed seven to ten times as often as writes.  The first goal is high availability.  We don&#8217;t want to change any web server config files or have our application wait for MySQL to time out before switching to another SQL server.  Switching away from a slow or down server has to happen automatically.  The second goal is higher performance.  In our example we have four servers available for our backend, plus a monitoring server (build monitoring into your application architecture up front and save yourself the downtime later).  The final goal is the ability to grow to meet demand.</p>
<p><strong>A quick note about our goals:</strong> If you divorce your application from the database architecture, you won&#8217;t be able to have an application that scales or performs very well.  In this article we&#8217;re looking at a pure backend solution, but what that architecture looks like was dictated by the application itself!  In the real world, high-performance applications should be able to take advantage of a caching layer provided by something like <a title="MemCache Home Page" href="http://memcached.org/" target="_blank">memcached</a>, and code needs to be designed from the get-go to look at multiple SQL clusters, to separate read queries from writes, and so on.  In many cases memcached alone could replace or mitigate the need for more SQL servers.  Relying on pure MySQL replication to scale only gets you so far, and there is a point of diminishing returns.  <a title="Kellan Elliot-McCrea's Blog" href="http://laughingmeme.org/" target="_blank">Kellan Elliot-McCrea</a> from Flickr brings those points home in his article &#8220;<a title="Using and abusing mysql at flickr" href="http://code.flickr.com/blog/tag/using-and-abusing-mysql/" target="_blank">using and abusing mysql</a>&#8221;.</p>
<h3>Getting the right tools</h3>
<p>The very first thing we need to decide up front is which build of MySQL we want to use.  Do we want to compile it ourselves?  Do we use our vendor&#8217;s binaries or the pre-compiled binaries from MySQL, or do we want to look at one of the <a title="Article at LWN on MySQL forks" href="http://lwn.net/Articles/329626/">MySQL project forks</a>?  Here&#8217;s my advice: most people in small, low-transaction environments should use what their vendor provides or the official MySQL binaries.  The releases are well supported and updates are rolled out on a regular basis.  When you start needing other capabilities or need to squeeze more performance out of your SQL server, then it&#8217;s time to look at the high-performance forks.  In our case we&#8217;ve been very happy with the <a title="Percona Labs Page" href="http://www.percona.com/percona-lab.html" target="_blank">Percona MySQL builds</a>, especially the ability to use their <a title="OpenSource version of InnoDB backup with support of Percona extensions" href="http://www.percona.com/percona-lab.html" target="_blank">XtraBackup</a> program.  This makes setting up MySQL slave servers easier and much faster, especially with larger data sets and <code>InnoDB</code> tables.  (In actual testing, doing a raw mysqldump and setting up a slave server with a 47 GB database took almost an hour; using XtraBackup the same job took less than 15 minutes on a rather vanilla server.)</p>
<div id="attachment_303" class="wp-caption alignright" style="width: 292px"><a href="http://www.geeforce.net/wp-content/uploads/2010/03/MySQL_in_waterfall_master_slave_relationship.png"><img class="size-medium wp-image-303" title="MySQL_in_waterfall_master_slave_relationship" src="http://www.geeforce.net/wp-content/uploads/2010/03/MySQL_in_waterfall_master_slave_relationship-282x300.png" alt="MySQL Servers in Waterfall Master/Slave setup" width="282" height="300" /></a><p class="wp-caption-text">MySQL Servers in Waterfall Master/Slave setup</p></div>
<p>Get, use, and love <a title="MMM home page" href="http://mysql-mmm.org/start">MMM</a> (Multi-Master Replication Manager for MySQL).  It is a collection of scripts that performs automated failover of your MySQL cluster in much the same way as <a title="Ultra Monkey" href="http://www.ultramonkey.org/" target="_blank">UltraMonkey</a> does for other services.  The advantage of MMM is that it is specifically designed for MySQL.  It allows you to define servers by their role (writer or reader).  With MMM only one node is writeable at a time, which prevents data from getting out of sync in large waterfall environments.  Reader roles can be balanced across several servers.  More importantly, MMM will detect if a server&#8217;s replication is running behind and stop queries from being sent to it until that server&#8217;s replication catches up.  In the real world this is a life saver.</p>
<p>Get the <a title="Maatkit home page" href="http://www.maatkit.org/" target="_blank">Maatkit tool</a> set and install it on all your MySQL servers.  This toolkit should be de rigueur for any MySQL installation that has replication.  It is a collection of scripts that allows your DBA to manage MySQL more easily.  It has hooks built in for memcached and PostgreSQL as well.  Like MMM, it is a project that grew out of <a title="Google Code" href="http://code.google.com/">Google Code</a>.</p>
<h3>The Architecture</h3>
<p>We set up the first two MySQL servers in master/master replication mode.  Here&#8217;s the twist: we will probably want to add more master SQL servers to the cluster later on, so plan for it now.  You can add several MySQL servers, fully synced, in a waterfall-style configuration.  When you create your <code>my.cnf</code> file, set <code>auto_increment_increment</code> to two times the expected number of master servers.  So if you expect to only ever have five masters in replication, ensure that <code>auto_increment_increment=10</code>.  This allows you to add more servers to the cluster with a minimum of downtime.  Never set <code>auto_increment_offset</code> to zero, and no two servers should ever have the same offset (both are common mistakes).</p>
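<p>To see why distinct offsets with a shared step keep the masters from colliding, here is a minimal Python sketch of the arithmetic.  It is an illustration only, not MySQL code: the server names and offsets are made up, and it assumes each server hands out ids of the form offset + n &#215; increment.</p>
<pre><code>def auto_increment_ids(offset, increment, count):
    """Return the first `count` ids a server with these settings hands out."""
    return [offset + n * increment for n in range(count)]


# Assumption: room for five masters, so the step is 2 * 5 = 10 as in the text.
INCREMENT = 10
# Hypothetical server names; each gets a unique, non-zero offset.
OFFSETS = {"master1": 1, "master2": 2}

ids = {name: auto_increment_ids(offset, INCREMENT, 4)
       for name, offset in OFFSETS.items()}

for name in sorted(ids):
    print(name, ids[name])
</code></pre>
<p>With these sample values the two servers generate 1, 11, 21, 31 and 2, 12, 22, 32, so a third master can later take offset 3 without renumbering anything.</p>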
<p>Our decision here was to have two servers in master/master replication, with each master server having its own slave.  With a read load at least seven times the write load, we need to spread those selects across the cluster.  This is where MMM really shines.  The read load is spread out among all of the machines, while the write load is quarantined to the master servers alone.  The cluster can handle a huge read load and is far faster under load than a single server, satisfying the performance goal.  If a server goes down or starts to fall behind in replication, it&#8217;s removed from the cluster so it has a chance to catch up.  This happens automatically and without intervention, satisfying the high availability part of our goals.</p>
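<p>On the application side, the read/write split described above can be as small as choosing a host per statement.  The following Python sketch is an assumption about how such routing might look, not the application&#8217;s actual code: the host names are hypothetical, and it presumes the application connects to one address for the writer role and one for the reader pool.</p>
<pre><code># Hypothetical host names: one address for the writer role, one for readers.
WRITER_HOST = "db-writer.example.internal"
READER_HOST = "db-reader.example.internal"


def pick_host(sql):
    """Send plain SELECT statements to the read pool, everything else to the writer."""
    words = sql.split()
    if not words:
        raise ValueError("empty SQL statement")
    return READER_HOST if words[0].upper() == "SELECT" else WRITER_HOST


print(pick_host("SELECT name FROM users WHERE id = 7"))
print(pick_host("UPDATE users SET name = 'x' WHERE id = 7"))
</code></pre>
<p>A real router needs more care than this.  For instance, a <code>SELECT ... FOR UPDATE</code> or any read that must see its own write belongs on the writer.</p>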
<div id="attachment_308" class="wp-caption alignleft" style="width: 305px"><a href="http://www.geeforce.net/wp-content/uploads/2010/03/Final_MySQL_Architecture.png"><img class="size-medium wp-image-308" title="Final_MySQL_Architecture" src="http://www.geeforce.net/wp-content/uploads/2010/03/Final_MySQL_Architecture-295x300.png" alt="Final MySQL Architecture" width="295" height="300" /></a><p class="wp-caption-text">Final MySQL Architecture</p></div>
<p>We satisfy our scalability goal by planning the architecture for growth up front.  If we see a spike in read traffic, we can add more MySQL slaves on the fly.  If we see the need to spread out write traffic, we can add more master servers.  Proper monitoring and logging provide the statistics that tell us when.</p>
<p>Since we&#8217;ve planned for more masters up front, we don&#8217;t have to restart each server.  The ability to add a master server on the fly without taking down the entire cluster is what makes MMM and the Percona XtraBackup tool so critical.  When we run the XtraBackup tool it provides us the logfile name and position as part of the output!  That means we have all the information required to set up and start a slave in one action.  We use the MMM scripts to take servers in and out of service and also to monitor their status.</p>
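<p>The logfile name and position are exactly what MySQL&#8217;s <code>CHANGE MASTER TO</code> statement needs.  As a rough sketch, a small Python helper could build that statement from the two values.  The host name, file name, and position below are hypothetical placeholders, and replication credentials are deliberately left out.</p>
<pre><code>def change_master_sql(master_host, log_file, log_pos):
    """Build the CHANGE MASTER TO statement for a freshly restored slave."""
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{master_host}', "
        f"MASTER_LOG_FILE='{log_file}', "
        f"MASTER_LOG_POS={int(log_pos)};"
    )


# Hypothetical values; use the file name and position reported by the backup.
print(change_master_sql("master1.example.internal", "mysql-bin.000042", 107))
</code></pre>
<p>After running the generated statement on the new server (with <code>MASTER_USER</code> and <code>MASTER_PASSWORD</code> added for your replication account), <code>START SLAVE;</code> begins replication from that point.</p>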
<h3>Caveats</h3>
<p>The architecture offered here was for a specific problem where we had some good metrics.  If the read and write traffic were more even, we would have set up the servers in a waterfall configuration.  All of the servers were using directly attached storage, with SAS drives in RAID 10.  The databases were small enough that directly attached storage provided the best redundancy and performance for the cost.  Once you start talking BIG databases, you need to look at SAN architectures and ensure those considerations are baked into any design.</p>
<p><em><span>*While backended isn&#8217;t really a word, it perfectly describes what we&#8217;re talking about.  Please feel free to use backended in your next database or application discussion.</span></em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.geeforce.net/2010/03/highly-available-scalable-mysql/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>A Scalable E-Mail Architecture</title>
		<link>http://www.geeforce.net/2010/03/e-mail-architecture-2/</link>
		<comments>http://www.geeforce.net/2010/03/e-mail-architecture-2/#comments</comments>
		<pubDate>Wed, 03 Mar 2010 00:31:55 +0000</pubDate>
		<dc:creator>aaron_gee</dc:creator>
				<category><![CDATA[Clients]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[Latest]]></category>
		<category><![CDATA[Networks]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Document Management]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Qmail]]></category>
		<category><![CDATA[SAN]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[single sign in]]></category>
		<category><![CDATA[tiered storage]]></category>

		<guid isPermaLink="false">http://www.geeforce.net/?p=223</guid>
		<description><![CDATA[According to a study done last year by Forrester Research nearly half of large enterprises are &#8220;evaluating alternative options for managing and providing email&#8221;.  Why?  It&#8217;s relatively easy to build a highly available, highly redundant email system that can support tens or hundreds of thousands of users easily with free software. The answer to the &#8220;why&#8221; is [...]]]></description>
			<content:encoded><![CDATA[<p>According to <a title="Forrester Study Touted by Google" href="http://www.google.com/a/help/intl/en/admins/pdf/forrester_cloud_email_infrastructure_and_operations_analysis.pdf" target="_blank">a study</a> done last year by Forrester Research, nearly half of large enterprises are &#8220;evaluating alternative options for managing and providing email&#8221;.  Why?  It&#8217;s relatively easy to build a highly available, highly redundant email system that can support tens or hundreds of thousands of users with free software. <a href="http://www.geeforce.net/wp-content/uploads/2010/03/Single_Server_Solution1.png"><img class="alignright size-medium wp-image-250" title="Single_Server_Solution" src="http://www.geeforce.net/wp-content/uploads/2010/03/Single_Server_Solution1-300x219.png" alt="" width="300" height="219" /></a>The answer to the &#8220;why&#8221; is a bit complex and different for every company, but the leading cause of email headaches is poor architecture.  Most corporate email systems evolved from a single box.  In a lot of SMEs there is only &#8220;the mail server&#8221;.  That idea, coupled with proprietary software, has led a lot of companies down an unsustainable email path.</p>
<p>A lot of email problems simply go away if the system architecture has been well designed.  The architecture that we lay out here took into consideration ease of email management, high availability, storage growth, data retention, and retrieval.  It is based on open source software, but the ideas and architecture can be applied to proprietary solutions with some modifications.</p>
<p>The analysis of this email problem started by breaking out each action of a typical email transaction (delivery, management, and retrieval) into very specific tasks and then, based on our requirements, deciding where those tasks belong.  We try to push task intelligence to the parts of this clustered design where it makes the most sense and provides the most benefit.  The key here was to never create a single point of failure and to architect the design so that each task can be scaled separately from the other tasks.  That way adding another layer of spam protection doesn&#8217;t require a total redesign.</p>
<p>Our solution creates four zones:</p>
<ol>
<li>Inbound Zone (SMTP servers facing the Internet)</li>
<li>Storage Zone (Mail delivery and SAN)</li>
<li>Client Zone (Webmail &amp; IMAP servers for client access and outbound SMTP servers)</li>
<li>Business Intelligence Zone (Archival, Tiered Storage Decisions, Company Wide Searches)</li>
</ol>
<h3>Common Data Between Zones</h3>
<p>Some elements of your email infrastructure, such as valid usernames, must be understood across all zones, while other information, such as passwords or mailbox locations, only needs to be known by some of the zones.  The user information can be stored in a SQL or LDAP server, and the information is replicated to each zone.  The data stored in SQL or LDAP can also be used for other applications not related to mail, such as user authentication, instant messaging, and billing.  In some enterprises this requires the user SQL/LDAP layer to be pulled out into its own environment; in others it requires a hybrid LDAP/SQL solution.  In our sample architecture the system in question relied on <a title="MySQL the worlds most popular open source SQL database" href="http://mysql.com/" target="_blank">MySQL</a>, and replication was used on each machine to provide a local SQL store.</p>
<h3>Zone 1: Inbound</h3>
<p>Inbound mail servers are defined in a domain&#8217;s DNS, and it&#8217;s simple to designate multiple inbound servers.  In the classic single box solution, there is only one inbound server.  The single server has to handle all inbound connections, all filtering, the mail store, and client connections.  When the single server is flooded with traffic, that traffic eats up resources and ruins the end users&#8217; email experience.  In the properly architected solution, the load of incoming traffic is spread out among multiple servers that can be geographically diverse.</p>
<p>The inbound servers are also the first line of defense against unwanted mail.  The ideal is to prevent all suspect mail from ever making it into the mail infrastructure.  Why waste the end users&#8217; CPU cycles or mail storage on spam or virus emails?  In this configuration the inbound servers protect the mail store from unnecessary email traffic.  After processing the accepted mail, the inbound servers hand it off to the mail store over a private network, delivering messages via QMQP or SMTP.  This adds another layer of protection: the mail delivery servers can throttle those connections to protect the mail store, which lets the Zone 1 servers act as a buffer during extreme traffic conditions.</p>
<h4><span style="text-decoration: underline;">Zone 1 features:</span></h4>
<ul>
<li>Inbound servers have their own mail queue so that they can store mail if Zone 2 goes offline for any reason</li>
<li>Inbound servers decide whether to accept connections based on real-time blackhole lists (RBLs)</li>
<li>Inbound servers decide whether to accept mail for users during the SMTP transaction (don&#8217;t accept mail that has to be bounced later)</li>
<li>Inbound servers handle spam and virus tagging before handing messages to Zone 2</li>
<li>Virus &amp; spam analysis can be offloaded to other servers if the load is too high on the inbound servers, so adding capacity is as simple as adding more machines (virtual or otherwise) to the zone</li>
</ul>
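<p>Deciding on a recipient during the SMTP transaction comes down to answering the <code>RCPT TO</code> command with a success or a permanent-failure reply.  The Python sketch below shows only that decision.  The user table and function name are hypothetical, and a real inbound server would look users up in the replicated SQL or LDAP store rather than in a hard-coded set.</p>
<pre><code># Hypothetical user table; the article keeps this in a replicated SQL or LDAP store.
VALID_USERS = {"alice@example.com", "bob@example.com"}


def rcpt_reply(address, valid_users=VALID_USERS):
    """Return the SMTP reply for a RCPT TO command."""
    if address.strip().lower() in valid_users:
        return "250 OK"
    # Rejecting here means no bounce message has to be generated later.
    return "550 No such user here"


print(rcpt_reply("alice@example.com"))
print(rcpt_reply("nobody@example.com"))
</code></pre>
<p>Because the sending server gets the 550 reply while it is still connected, the message never enters the queue and the mail store never sees it.</p>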
<h3>Zone 2: Storage</h3>
<p>The mail store consists of two parts: the delivery machines and the storage area network (SAN).  The delivery machines receive email from Zone 1 and store it on the SAN, following any user-specific delivery rules.  Unlike in other systems, mail sorting is done during delivery.  This reduces the number of times a message &#8220;moves&#8221; around on the file system and requires less handling.  Both front ends mounted the same SAN share using a distributed file system (<a title="Wiki entry for GFS (Global File System)" href="http://en.wikipedia.org/wiki/Global_File_System" target="_blank">gfs2</a>).</p>
<p>In our system the delivery machines were also the master SQL servers, in master/master replication with each other and master/slave replication to the other zones.  All user updates, adds, and deletes are managed via a web interface attached to the SQL servers in Zone 2.  All of the Zone 1 machines were pointed to a single IP, and the two delivery machines run in high availability mode with load balancing.</p>
<h4><span style="text-decoration: underline;">Zone 2 features:</span></h4>
<ul>
<li>Storage growth is handled by the SAN &amp; choice of file system.  Simply add more storage and then <a title="Redhat manual for managing GFS file system" href="http://www.redhat.com/docs/manuals/csgfs/admin-guide/s1-manage-growfs.html" target="_blank">grow the file system</a>.</li>
<li>Tiered Storage can be provided by multiple SANs.  A high performance SAN for recent email and a slower but larger SAN for archival purposes.</li>
<li>Delivery rules are stored and executed during the first delivery.</li>
<li>Delivery can be scaled by adding front ends to either a common distributed backend storage or multiple common backends.</li>
<li>The SAN is fully mirrored.  Should the primary SAN fail the backup SAN comes online automatically.  File system mirroring is handled at the SAN level.</li>
<li>Since each client&#8217;s mail store location is kept in a SQL server, migration from one SAN to another can be done &#8220;online&#8221; with no downtime.</li>
</ul>
<div id="attachment_258" class="wp-caption alignleft" style="width: 227px"><a href="http://www.geeforce.net/wp-content/uploads/2010/03/Distributed_Architecture1.png"><img class="size-medium wp-image-258" title="Distributed_Architecture" src="http://www.geeforce.net/wp-content/uploads/2010/03/Distributed_Architecture1-217x300.png" alt="Distributed_Architecture" width="217" height="300" /></a><p class="wp-caption-text">Distributed Architecture</p></div>
<h3>Zone 3: Clients</h3>
<p>Zone 3 is the end user zone.  This zone takes care of webmail, SMTP relaying (outbound), and IMAP clients (Outlook &amp; smart phones).  In our configuration there are two machines that mount the same SAN and run three services: IMAP, HTTPS, &amp; SMTP.  The two servers run in load balancing/high availability mode.  In this case the traffic combined with the webmail load was light enough to combine all of the client services onto the same pair of machines.  Each client service can easily be moved to its own server, providing scalability.  This zone deals entirely with internal client requests.  If a client receives, checks, or sends an email, regardless of device (laptop, phone, etc.), it goes through this zone.</p>
<h3>Zone 4: Business Intelligence</h3>
<p>This zone mounts the same SAN and handles things like auto archiving, indexing of emails for better IMAP performance, and other functions that touch your email but whose primary function ISN&#8217;T email.  Email management tools live in this zone (web based in this case).  The advantage of having a dedicated business intelligence zone is that it provides application-specific functionality and connectivity without adding to the performance requirements of any one area of typical email transactions.</p>
<p>Good uses of Zone 4 include document management software that indexes company-wide emails.  This type of indexing becomes invaluable when discovery orders are issued or an executive leaves under dubious circumstances.  Custom reporting on email usage and quotas, organized across corporate divisions, enables IT to make rational choices about where resources will be best spent.  This zone is also where programs designed to automate tiered storage and auto archiving decisions need to go.</p>
<p>Having one place to write and execute that intelligence gives an enterprise the flexibility it needs when addressing email-specific issues AND does so in a way that minimally impacts email.  A perfect example of what happens when you build that intelligence into the wrong place would be an auto archive program that a certain hypothetical email admin might install for their enterprise.  The auto archiving is too aggressive in its endeavor to archive everything older than (x) days (the default setting), leading to a huge slowdown in the enterprise&#8217;s email delivery.  The helpdesk phones won&#8217;t stop ringing, and one can expect the fainter of heart among the support staff to be reduced to quivering piles of jello in a cubicle.  In the enterprise, clients get cranky when the email doesn&#8217;t work.  When things finally get caught up, the legal staff shows up on the admin&#8217;s doorstep with pitchforks and torches.  Not good.</p>
<p>Some system architects or vendors want tiered storage or auto archiving to live on the primary mail store, or in storage.  The issue is that neither of those areas has the native intelligence to understand how users use, or are required to access, email better than the users themselves.  It gets hard to tell your SAN which users&#8217; email folders need to be faster (for example, the CEO who refuses to archive and calls when searches take more than 5 seconds), or to have your mail server define which email documents are connected to a legal case.  Business intelligence isn&#8217;t an oxymoron until your SAN decides which email is archived for you.</p>
<p>Design your business intelligence where it belongs, and where you can react quickly without impacting the primary function of your email system, which is to deliver mail.  When you tie it all together you have a low-maintenance, highly scalable email solution that a Fortune 100 company would be proud of.  All it took was a little bit of up-front thought to design the proper architecture.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.geeforce.net/2010/03/e-mail-architecture-2/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

