According to a study done last year by Forrester Research, nearly half of large enterprises are “evaluating alternative options for managing and providing email”. Why? It’s relatively easy to build a highly available, highly redundant email system that can support tens or hundreds of thousands of users with free software. The answer to the “why” is a bit complex and different for every company, but the leading cause of email headaches is poor architecture. Most corporate email systems evolved from a single box. In a lot of SMEs there is only “the mail server”. That idea, coupled with proprietary software, has led a lot of companies down an unsustainable email path.
A lot of email problems simply go away if the system architecture has been well designed. The architecture that we lay out here took into consideration ease of email management, high availability, storage growth, data retention, and retrieval. It is based on open source software, but the ideas and architecture can be applied to proprietary solutions with some modifications.
The analysis of this email problem started by breaking out each action of a typical email transaction (delivery, management, and retrieval) into very specific tasks and then, based on our requirements, deciding where those tasks belong. We try to push task intelligence to the parts of this clustered design where it makes the most sense and provides the most benefit. The key here was to never create a single point of failure and to architect the design so that each task can be scaled separately from the other tasks. That way, adding another layer of spam protection doesn’t require a total redesign.
Our solution creates 4 zones:
- Inbound Zone (SMTP servers facing the Internet)
- Storage Zone (Mail delivery and SAN)
- Client Zone (Webmail & IMAP servers for client access and outbound SMTP servers)
- Business Intelligence Zone (Archival, Tiered Storage Decisions, Company Wide Searches)
Common Data Between Zones
There are some elements of your email infrastructure that must be understood across all zones, such as valid usernames, while other information, such as passwords or mailbox locations, only needs to be known by some of the zones. The user information can be stored in a SQL or LDAP server, and the information is replicated to each zone. The data stored in SQL or LDAP can be used for other applications not related to mail, such as user authentication, instant messaging, and billing. In some enterprises this requires the user SQL/LDAP layer to be pulled out into its own environment; in others it requires a hybrid LDAP/SQL solution. In our sample architecture the system in question relied on MySQL, and replication was used on each machine to provide a local SQL store.
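As a rough illustration of that split, the sketch below trims a user record down to the fields each zone plausibly needs before it is replicated there. The zone names, field names, and the exact field-to-zone mapping are assumptions made for the example; the only part taken from the text is that usernames are needed everywhere while passwords and mailbox locations are not.

```python
# Hypothetical mapping of zone -> user fields that zone needs to see.
# Only "every zone needs valid usernames" comes from the article; the
# rest of the mapping is an assumption for illustration.
ZONE_FIELDS = {
    "inbound": {"address"},
    "storage": {"address", "mailbox_path"},
    "client": {"address", "password_hash", "mailbox_path"},
    "business_intelligence": {"address", "mailbox_path"},
}


def view_for_zone(user: dict, zone: str) -> dict:
    """Return only the fields of a user record that the given zone may see."""
    allowed = ZONE_FIELDS[zone]
    return {key: value for key, value in user.items() if key in allowed}


if __name__ == "__main__":
    alice = {
        "address": "alice@example.com",
        "password_hash": "not-a-real-hash",
        "mailbox_path": "/san/mail/a/alice",
    }
    for zone in ZONE_FIELDS:
        print(zone, sorted(view_for_zone(alice, zone)))
```

In a real deployment this filtering would more likely be done by the replication setup itself (for example by replicating only selected tables or columns) than by application code.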
Zone 1: Inbound
Inbound mail servers are defined in a domain’s DNS (as MX records), and it’s simple to designate multiple inbound servers. In the classic single box solution, there is only one inbound server. The single server has to handle all inbound connections, all filtering, the mail store, and client connections. When the single server is flooded with traffic, that traffic eats up resources and ruins the end user’s email experience. In the properly architected solution the load of incoming traffic is spread out among multiple servers that can be geographically diverse.
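How DNS spreads that inbound load can be sketched as follows. Sending servers try the MX hosts with the lowest preference number first and choose randomly among hosts that share a preference, so publishing several records at the same preference distributes connections across them. The host names and preference values here are hypothetical, and the records are passed in as plain tuples rather than looked up from DNS.

```python
import random


def delivery_order(mx_records, rng=None):
    """Order MX hosts the way a sending server would try them.

    mx_records is a list of (preference, hostname) tuples. Lower
    preference values are tried first; hosts with equal preference
    are shuffled so load is spread across them.
    """
    rng = rng or random.Random()
    groups = {}
    for preference, host in mx_records:
        groups.setdefault(preference, []).append(host)

    ordered = []
    for preference in sorted(groups):
        hosts = list(groups[preference])
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


if __name__ == "__main__":
    # Hypothetical zone data: two equal-weight inbound servers plus a backup.
    records = [
        (10, "mx1.example.com"),
        (10, "mx2.example.com"),
        (20, "mx-backup.example.com"),
    ]
    print(delivery_order(records))
```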
The inbound servers are also the first line of defense against unwanted mail. The ideal is to prevent all suspect mail from ever making it into the mail infrastructure. Why waste the end user’s CPU cycles or mail storage on spam or virus emails? In this configuration the inbound servers protect the mail store from unnecessary email traffic. After processing the accepted mail, the inbound servers hand it off to the mail store over a private network, delivering messages via QMQP or SMTP. This adds another layer of protection: those connections can be throttled by the mail delivery servers to protect the mail store, which lets the Zone 1 servers act as a buffer during extreme traffic conditions.
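The buffering behaviour can be pictured with a small sketch. It is not how any particular mail server implements its queue; the class name, the `deliver` callback, and the `max_per_cycle` limit are all assumptions chosen to show the idea that accepted mail waits on the inbound server until the delivery tier is willing to take it.

```python
from collections import deque


class InboundQueue:
    """Toy model of an inbound server's local mail queue."""

    def __init__(self):
        self._pending = deque()

    def accept(self, message):
        """Queue a message that passed the inbound checks."""
        self._pending.append(message)

    def hand_off(self, deliver, max_per_cycle):
        """Try to pass queued messages to the delivery tier.

        deliver(message) should return True on success and False if the
        delivery tier is unavailable or refusing connections. At most
        max_per_cycle messages are sent per call, which models throttling.
        Returns the number of messages handed off.
        """
        sent = 0
        while self._pending and sent < max_per_cycle:
            message = self._pending[0]
            if not deliver(message):
                break  # keep the message queued and retry on a later cycle
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)
```

If the delivery tier goes away entirely, `hand_off` sends nothing and the queue simply grows until the tier returns, which is the buffering role described above.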
Zone 1 features:
- Inbound servers have their own mail queue so that they can store mail if Zone 2 goes offline for any reason
- Inbound servers decide whether to accept connections based on real-time blacklists (RBLs)
- Inbound servers make decisions on accepting mail for users during the SMTP transaction (don’t accept mail that has to be bounced later)
- Inbound servers handle spam and virus tagging before handing messages to Zone 2
- Virus and spam analysis can be offloaded to other servers if the load on the inbound servers is too high. Adding capacity is then as simple as adding more machines (virtual or otherwise) to the zone.
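The connection-time and recipient-time checks in the list above can be sketched roughly as below. A DNS blacklist is queried by reversing the client's IPv4 octets and appending the list's zone; a name that resolves means the address is listed. The zone `rbl.example.org`, the `is_listed` callback, and the reply texts are placeholders, and the lookup is injected as a function so the sketch does not touch the network.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a blacklist."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def smtp_decision(client_ip, recipient, valid_recipients, is_listed,
                  zone="rbl.example.org"):
    """Decide, during the SMTP transaction, whether to accept a message.

    is_listed(name) is a hypothetical callback that returns True when the
    blacklist lookup for that name resolves. Returns (reply_code, text).
    """
    if is_listed(dnsbl_query_name(client_ip, zone)):
        return (554, "connection refused: client is blacklisted")
    if recipient.lower() not in valid_recipients:
        # Rejecting here avoids accepting mail that must be bounced later.
        return (550, "no such user")
    return (250, "ok")


if __name__ == "__main__":
    users = {"alice@example.com"}
    listed = {"10.2.0.192.rbl.example.org"}
    print(smtp_decision("192.0.2.10", "alice@example.com", users, listed.__contains__))
    print(smtp_decision("198.51.100.7", "nobody@example.com", users, listed.__contains__))
    print(smtp_decision("198.51.100.7", "alice@example.com", users, listed.__contains__))
```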
Zone 2: Storage
The mail store consists of 2 parts: the delivery machines and the storage area network (SAN). The delivery machines receive email from Zone 1 and store it on the SAN, following any user-specific delivery rules. Unlike other systems, the mail sorting is done during delivery. This reduces the number of times a message “moves” around on the file system and requires less handling. Both front ends mount the same SAN share using a distributed file system (GFS2).
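Sorting at delivery time can be sketched as a function that picks the destination folder before the message is written, so it lands in the right place once. The rule format (header, substring, folder) and the folder names are invented for the example and are not the rule syntax of any particular delivery agent.

```python
from email.message import EmailMessage


def choose_folder(message, rules, default="INBOX"):
    """Pick the destination folder for a message at delivery time.

    rules is a list of (header_name, substring, folder) tuples checked in
    order; the first rule whose substring appears in the named header
    (case-insensitively) wins. Messages matching no rule go to default.
    """
    for header, needle, folder in rules:
        value = message.get(header, "")
        if needle.lower() in value.lower():
            return folder
    return default


if __name__ == "__main__":
    msg = EmailMessage()
    msg["From"] = "billing@example.com"
    msg["Subject"] = "[Billing] Invoice for March"
    # Hypothetical per-user rules, as they might be loaded from the SQL store.
    rules = [("Subject", "[billing]", "Finance")]
    print(choose_folder(msg, rules))
```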
In our system the delivery machines were also the master SQL servers, with master/master replication between them and master/slave replication to the other zones. All user updates, adds, and deletes are managed via a web interface attached to the SQL servers in Zone 2. All of the Zone 1 machines were pointed to a single IP, and the two delivery machines ran in high availability mode with load balancing.
Zone 2 features:
- Storage growth is handled by the SAN and the choice of file system. Simply add more storage and then grow the file system.
- Tiered storage can be provided by multiple SANs: a high-performance SAN for recent email and a slower but larger SAN for archival purposes.
- Delivery rules are stored and executed during the first delivery.
- Delivery can be scaled by adding front ends to either a common distributed backend storage or multiple common backends.
- The SAN is fully mirrored. Should the primary SAN fail the backup SAN comes online automatically. File system mirroring is handled at the SAN level.
- Since each client’s mail store location is kept in a SQL server, migration from one SAN to another can be done “online” with no downtime.
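The last point in the list rests on one level of indirection: every component asks the database where a mailbox lives instead of computing the path itself. A minimal sketch of that lookup follows, using an in-memory SQLite database as a stand-in for the MySQL store described earlier. The table name, column names, and paths are hypothetical, and copying the mailbox data between SANs is outside the sketch.

```python
import sqlite3


def open_directory():
    """Create an in-memory stand-in for the user database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE mailbox (address TEXT PRIMARY KEY, path TEXT NOT NULL)"
    )
    return db


def mailbox_path(db, address):
    """Return the current storage path for an address, or None if unknown."""
    row = db.execute(
        "SELECT path FROM mailbox WHERE address = ?", (address,)
    ).fetchone()
    return row[0] if row else None


def repoint_mailbox(db, address, new_path):
    """Switch an address to a new storage path once its data has been copied."""
    with db:  # commits on success, rolls back on error
        cursor = db.execute(
            "UPDATE mailbox SET path = ? WHERE address = ?", (new_path, address)
        )
    return cursor.rowcount == 1


if __name__ == "__main__":
    db = open_directory()
    with db:
        db.execute(
            "INSERT INTO mailbox VALUES (?, ?)",
            ("alice@example.com", "/san-fast/mail/alice"),
        )
    repoint_mailbox(db, "alice@example.com", "/san-archive/mail/alice")
    print(mailbox_path(db, "alice@example.com"))
```

Because delivery and client servers look the path up on each access, the switch takes effect as soon as the row is updated and replicated.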
Zone 3: Clients
Zone 3 is the end user zone. This zone takes care of webmail, outbound SMTP relaying, and IMAP clients (Outlook and smartphones). In our configuration there are two machines that mount the same SAN and run 3 services: IMAP, HTTPS, and SMTP. The 2 servers run in load-balancing/high-availability mode. In this case the traffic combined with the webmail load was light enough to run all of the client services on the same machines. Each client service can easily be moved to its own server, providing scalability. This zone deals entirely with internal client requests. If a client receives, checks, or sends an email, regardless of device (laptop, phone, etc.), it goes through this zone.
Zone 4: Business Intelligence
This zone mounts the same SAN and handles things like auto archiving, indexing of emails for better IMAP performance, and other functions that touch your email but whose primary function ISN’T email. Email management tools live in this zone (web based in this case). The advantage of having a dedicated business intelligence zone is that it provides application-specific functionality and connectivity without adding to the performance requirements of any one specific area of typical email transactions.
Examples of good uses of Zone 4 include document management software that indexes company-wide emails. This type of indexing becomes invaluable when discovery orders are issued or an executive leaves under dubious circumstances. Custom reports on email usage and quotas, organized across corporate divisions, enable IT to make rational choices on where resources will be best spent. This zone is also where programs designed to automate tiered storage and auto archiving decisions need to go.
Having one place to go to write and execute that intelligence gives an enterprise the flexibility it needs when addressing email-specific issues, AND it does so in a way that minimally impacts email. A perfect example of what happens when you build that intelligence into the wrong place would be an auto archive program that a certain hypothetical email admin might install for their enterprise. The auto archiving is too aggressive in its endeavor to archive everything older than (x) days (the default setting), leading to a huge slowdown in the enterprise’s email delivery. The helpdesk phones won’t stop ringing, and one can expect the fainter of heart among the support staff to be reduced to quivering piles of jello in a cubicle. In the enterprise, clients get cranky when the email doesn’t work. When things finally get caught up, the legal staff shows up on the admin’s doorstep with pitchforks and torches. Not good.
Some system architects or vendors want tiered storage or auto archiving to live on the primary mail store, or in storage. The issue is that neither of those areas has the native intelligence to understand how users use, or are required to access, their email better than the users themselves do. It gets hard to tell your SAN which users’ email folders need to be faster (for example, those of the CEO who refuses to archive and calls when searches take more than 5 seconds), or to have your mail server define which email documents are connected to a legal case. Business intelligence isn’t an oxymoron until your SAN decides which email is archived for you.
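The kind of decision that belongs in Zone 4 rather than in the SAN might look like the sketch below: a policy that knows about legal holds and per-user exceptions as well as message age. The tier names, the 365-day default, and the `MessageInfo` fields are all assumptions for illustration, not a description of any existing archiving product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MessageInfo:
    owner: str
    received: date
    legal_hold: bool = False


def choose_tier(msg, today, archive_after_days=365, keep_fast_for=frozenset()):
    """Decide where a message should live.

    Returns "hold" for messages tied to a legal case (never moved
    automatically), "fast" for recent mail or for owners exempted from
    archiving, and "archive" for everything else past the age cutoff.
    """
    if msg.legal_hold:
        return "hold"
    if msg.owner in keep_fast_for:
        return "fast"
    if today - msg.received > timedelta(days=archive_after_days):
        return "archive"
    return "fast"


if __name__ == "__main__":
    today = date(2024, 1, 1)
    old = MessageInfo(owner="bob", received=date(2022, 1, 1))
    print(choose_tier(old, today))
    print(choose_tier(old, today, keep_fast_for={"bob"}))
```

A storage array sees only blocks and file ages, so it could implement the age check but not the other two branches, which is the argument being made above.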
Design your business intelligence where it belongs, and where you can react quickly without impacting the primary function of your email system, which is to deliver mail. When you tie it all together you have a low-maintenance, highly scalable email solution that a Fortune 100 company would be proud of. All it took was a little bit of up-front thought to design the proper architecture.