Internet-based systems have grown in popularity over the last few years. Whether you are building an application for end users or an enterprise-level one, if the application grows (either in user or transactions count) and the relies upon persistence, then databases scalability will probably become your bottleneck.
The CAP Theorem
In 2000, computer scientist Eric Brewer developed the CAP theorem (also named Brewer’s conjecture) which states that in distributed/networked systems, architects have to choose two of the following three requirements (can’t promise your users all three):
- Consistency: Every read receives the most recent write or an error. Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID database transactions.
- Availability: Every request receives a (non-error) response – without the guarantee that it contains the most recent write.
- Partition Tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes (partitions). For a distributed system, partition tolerance means the system will continue to work unless there is a total network failure. A few nodes can fail and the system keeps going.
In particular, the CAP theorem implies that in the presence of a network partition, one has to choose between consistency and availability.
Systems using traditional relational technologies (CA Type) such as MySQL, Oracle Database, SQL Server, etc. normally aren’t partition tolerant, but they can guarantee consistency and availability. In other words, if one part of these traditional relational systems is offline, the whole system is offline. Otherwise, the system would not be consistent as some nodes will not have the latest information while others do.
Systems where partition tolerance and availability are of primary importance (PA Type) cannot guarantee consistency, because updates can be made on either side of the partition and if a network problem occurs, consistency will not longer exists as nodes will contain different versions of the database. The key-value stores Dynamo and CouchDB and the column-family store Cassandra are popular examples of partition tolerant/availability systems.
Systems where partition tolerance and consistency are of primary importance (CP Type) can’t guarantee availability because the systems return errors until the partitioned state is resolved. MongoDB and Hadoop are considered CP systems (consistent and partition tolerant). With data stored redundantly across many slave nodes, outages to large portions (partitions) of a cluster can be tolerated.
ACID Compliant Databases
One of the concepts tied to relational database systems (RDBMS) is something known as ACID compliance. ACID is an acronym describing not the database itself but the features of individual database transactions run through the system. In simple terms, all transactions of an ACID compliant database system comply with the following requirements.
Atomicity: Each database transaction must completely succeed or completely fail. Partial success is not allowed. If the transaction includes several “steps” (either read or write) and one or more fails, the transaction will fail and the database will return to the state it was before starting the transaction.
Consistency: During the database transaction, the RDBMS progresses from one valid state to another. The state is never invalid. This means that if the database is replicated across several nodes (copies of the database), all nodes will hold the same information.
Isolation: The client’s database transaction must occur in isolation from other clients attempting to transact with the RDBMS. This means that if a transaction is taking place, other users will have to wait until it ends.
Durability: The data operation that was part of the transaction must be reflected in nonvolatile storage (computer memory that can retrieve stored information even when not powered — like a hard disk) and persist after the transaction successfully completes. Transaction failures cannot leave the data in a partially committed state.
Certain use cases for RDBMSs, such as online transaction processing, strongly rely ACID-compliant transactions between the client and the RDBMS for the system to function properly. A great example of an ACID-compliant transaction is a transfer of funds from one bank account to another.
This breaks down into two database transactions, where the originating account shows a withdrawal, and the destination account shows a deposit. Obviously, these two transactions have to be tied together in order to be valid so that if either of them fail, the whole operation must fail to ensure both balances remain valid.
Also, all other transactions must wait until the current one is completed, so this places a high constraint on performance. In most cases, to provide better performance without sacrificing consistency, the solution is vertical scaling. That is, getting more powerful (and increasingly more expensive) systems in order to take less time to process each transaction so it becomes available to process other requests.
Horizontally Scaling Databases: Enter the BASE model
If we look at ACID-compliant systems, Consistency and Isolation are the two major roadblocks to performance. They are also a problem when designing a distributed system that uses a horizontal scalability, such as Cloud-based, approach to increase performance (more nodes nodes can process queries) and availability (more nodes, less chances of a total system failure).
Luckily for the world of distributed computing systems, systems and databases engineers are clever and they came up with a new type of database system that requires different characteristics to enable flexibility and scalability (albeit partially sacrificing consistency). These opposing characteristics are cleverly captured in the acronym BASE:
- Basically Available: The system is guaranteed to be available for querying by all users (without isolating queries). It achieves this by using a highly distributed approach to database management. Instead of maintaining a single large data store and focusing on the fault tolerance of that store, BASE databases spread data across many storage systems with a high degree of replication. In the unlikely event that a failure disrupts access to a segment of data, this does not necessarily result in a complete database outage.
- Soft State: The values stored in the system may change because of the eventual consistency model, as described in the next bullet. In the BASE model data consistency is the developer’s problem and should not be handled by the database.
- Eventually Consistent: As data is added to the system, the system’s state is gradually replicated across all nodes. For example, in Hadoop, when a file is written to the HDFS, the replicas of the data blocks are created in different data nodes after the original data blocks have been written. Again, during the short period of time before all updated blocks are replicated, the state of the file system isn’t consistent. It will not always happen but users should know that it might.
Whereas ACID is pessimistic and forces consistency at the end of every operation, BASE is optimistic and accepts that the database consistency will be in a state of flux. Although this sounds impossible to cope with, in reality it is quite manageable and leads to levels of scalability that cannot be obtained with ACID.
By sacrificing Permanent Consistency in favor of Eventual Consistency developers enable Horizontal Scalability. That is, because we do not need to maintain 100% consistency at all times, we can distribute the query load among many small nodes instead of using a single, very large (and expensive) server. This also improves availability since even if a few nodes fail, the rest of them can process the queries (maybe a bit slower but without making the system fail).
ACID systems tend to scale worse than BASE systems both from complexity and from performance points of view, but their trade-offs of consistency and availability are less polarized (i.e., a higher level of both concepts can be reached).
BASE systems usually avoid joins and most have no need of a schema (maybe not a key factor, since a majority of cases actually uses it). Typically, they get a burst in throughput by not performing too many check operations during the inserts or updates. On the other side, they usually set a strong trade-off between consistency and availability. Developers must achieve consistency by using other tools, such as client-side checks or similar tools.
Scaling systems to dramatic transaction rates requires a new way of thinking about managing resources. The traditional transactional models are problematic when loads need to be spread across a large number of components. Decoupling the operations and performing them in turn provides for improved availability and scale at the cost of consistency. BASE provides a model for thinking about this decoupling.
It’s harder to develop software in the fault-tolerant and data-inconsistent BASE world compared to the fastidious ACID world, but Brewer’s CAP theorem says you have no choice if you want to scale up. as Brewer points out, there is a continuum between ACID and BASE systems. You can decide how close you want to be to one end of the continuum or the other according to your priorities (and how much money you have to implement it).