Mammatus Blog‎ > ‎

MongoDB Database: Architecture Replica Sets, Autosharding (part 1)

posted Sep 27, 2012, 1:45 AM by Rick Hightower   [ updated Sep 30, 2012, 3:15 PM ]

The model of MongoDB is such that you can start basic and use features as your growth/needs change without too much trouble or change in design. MongoDB uses replica sets to provide read scalability, and high availability. Autosharding is used to scale writes (and reads). Replica sets and autosharding go hand in hand if you need mass scale out. With MongoDB scaling out seems easier than traditional approaches as many things seem to come built-in and happen automatically. Less operation/administration and a lower TCO than other solutions seems likely. However you still need capacity planning (good guess), monitoring (test your guess), and the ability to determine your current needs (adjust your guess).

Replica Sets

The major advantages of replica sets are business continuity through high availability, data safety through data redundancy, and read scalability through load sharing (reads). Replica sets use a share nothing architecture. A fair bit of the brains of replica sets is in the client libraries. The client libraries are replica set aware. With replica sets, MongoDB language drivers know the current primary. Langauge driver is a library for a particular programing language, think JDBC driver or ODBC driver, but for MongoDB. All write operations go to the primary. If the primary is down, the drivers know how to get to the new primary (an elected new primary), this is auto failover for high availability. The data is replicated after writing. Drivers always write to the replica set's primary (called the master), the master then replicates to slaves. The primary is not fixed. The master/primary is nominated.

Typically you have at least three MongoDB instances in a replica set on different server machines (see figure 2). You can add more replicas of the primary if you like for read scalability, but you only need three for high availability failover. There is a way to sort of get down to two, but let's leave that out for this article. Except for this small tidbit, there are advantages of having three versus two in general. If you have two instances and one goes down, the remaining instance has 200% more load than before. If you have three instances and one goes down, the load for the remaining instances only go up by 50%. If you run your boxes at 50% capacity typically and you have an outage that means your boxes will run at 75% capacity until you get the remaining box repaired or replaced. If business continuity is your thing or important for your application, then having at least three instances in a replica set sounds like a good plan anyway (not all applications need it).


Figure 2: Replica Sets

MongoDB Replica Sets


In general, once replication is setup it just works. However, in your monitoring, which is easy to do in MongoDB, you want to see how fast data is getting replicated from the primary (master) to replicas (slaves). The slower the replication is, the dirtier your reads are. The replication is by default async (non-blocking). Slave data and primary data can be out of sync for however long it takes to do the replication. There are already whole books written just on making Mongo scalable, if you work at a Foursquare like company or a company where high availability is very important, and use Mongo, I suggest reading such a book.

By default replication is non-blocking/async. This might be acceptable for some data (category descriptions in an online store), but not other data (shopping cart's credit card transaction data). For important data, the client can block until data is replicated on all servers or written to the journal (journaling is optional). The client can force the master to sync to slaves before continuing. This sync blocking is slower. Async/non-blocking is faster and is often described as eventual consistency. Waiting for a master to sync is a form of data safety. There are several forms of data safety and options available to MongoDB from syncing to at least one other server to waiting for the data to be written to a journal (durability). Here is a list of some data safety options for MongoDB:

  1. Wait until write has happened on all replicas
  2. Wait until write is on two servers (primary and one other)
  3. Wait until write has occurred on majority of replicas
  4. Wait until write operation has been written to journal

(The above is not an exhaustive list of options.) The key word of each option above is wait. The more syncing and durability the more waiting, and harder it is to scale cost effectively.

If you would like to learn more about MongoDB consider the following resources: